
20030919: Two LDM's not talking to each other



Alan,

>Date: Fri, 19 Sep 2003 09:15:03 -0400
>From: "Alan Hall" <address@hidden>
>Organization: NOAA
>To: Steve Emmerson <address@hidden>
>Subject: Re: 20030911: Two LDM's not talking to each other

The above message contained the following:

> Still having a problem with this LDM and I'm not getting anywhere with
> my system folks.

Do they indicate that anything has changed regarding firewalls between
Reflect and Doppler?

> I have upgraded the LDM server to 6.0.14 (source
> installation) and are still having the same problem. One slight
> difference, The logfile included here is trying to connect to another
> local ldm (humboldt) that doesn't go thru any firewalls.  Other local
> LDM's are able to connect to the humboldt ldm without error.  Any info
> you can arm me with for my system folks would be a great help.

What is the output of the following command when executed on Doppler,
Reflect, and Humboldt?

    uname -a

Did Reflect and Humboldt ever work together using an earlier version of
LDM 6?

> Here is the latest debug log from this problem system:
> 
> Sep 19 12:53:13 doppler rpc.ldmd[103288]: Starting Up (version: 6.0.14; 
> built: Sep 18 2003 16:01:22)
> Sep 19 12:53:13 doppler rpc.ldmd[103288]: main(): Opening product-queue
> Sep 19 12:53:13 doppler rpc.ldmd[103288]: main(): Creating service portal
> Sep 19 12:53:13 doppler rpc.ldmd[103288]: create_ldm_tcp_svc(): Checking for 
> another LDM
> Sep 19 12:53:13 doppler rpc.ldmd[103288]: create_ldm_tcp_svc(): Getting TCP 
> socket
> Sep 19 12:53:13 doppler rpc.ldmd[103288]: create_ldm_tcp_svc(): Eliminating 
> EADDRINUSE problem.
> Sep 19 12:53:13 doppler rpc.ldmd[103288]: create_ldm_tcp_svc(): Getting root 
> privs
> Sep 19 12:53:13 doppler rpc.ldmd[103288]: create_ldm_tcp_svc(): Binding socket
> Sep 19 12:53:13 doppler rpc.ldmd[103288]: create_ldm_tcp_svc(): Calling 
> getsockname()
> Sep 19 12:53:13 doppler rpc.ldmd[103288]: port 388
> Sep 19 12:53:13 doppler rpc.ldmd[103288]: create_ldm_tcp_svc(): Calling 
> listen()
> Sep 19 12:53:13 doppler rpc.ldmd[103288]: create_ldm_tcp_svc(): Checking 
> portmapper
> Sep 19 12:53:13 doppler rpc.ldmd[103288]: create_ldm_tcp_svc(): Registering
> Sep 19 12:53:13 doppler rpc.ldmd[103288]: create_ldm_tcp_svc(): Releasing 
> root privs
> Sep 19 12:53:13 doppler rpc.ldmd[103288]: tcp sock: 0
> Sep 19 12:53:13 doppler rpc.ldmd[103288]: main(): Reading configuration-file
> Sep 19 12:53:13 doppler rpc.ldmd[103288]: main(): Serving socket
> Sep 19 12:53:13 doppler 192.67.134.137[38840]: Starting Up(6.0.14): 
> 192.67.134.137: TS_ZERO TS_ENDT {{FSL5,  ".*"}}

The last line above indicates that Doppler can't resolve the IP
address 192.67.134.137 into a fully-qualified hostname.

I can resolve that IP address here:

    nslookup 192.67.134.137
    Server:         128.117.140.62
    Address:        128.117.140.62#53

    Non-authoritative answer:
    137.134.67.192.in-addr.arpa     name = humboldt.ncdc.noaa.gov.

    Authoritative answers can be found from:
    134.67.192.in-addr.arpa nameserver = serns.noaa.gov.
    134.67.192.in-addr.arpa nameserver = l3.ncdc.noaa.gov.
    134.67.192.in-addr.arpa nameserver = l4.ncdc.noaa.gov.
    134.67.192.in-addr.arpa nameserver = l5.ncdc.noaa.gov.
    134.67.192.in-addr.arpa nameserver = l6.ncdc.noaa.gov.
    134.67.192.in-addr.arpa nameserver = merns.noaa.gov.
    134.67.192.in-addr.arpa nameserver = mwrns.noaa.gov.
    134.67.192.in-addr.arpa nameserver = nwrns.noaa.gov.
    l3.ncdc.noaa.gov        internet address = 192.67.134.94
    l4.ncdc.noaa.gov        internet address = 192.67.134.95
    l5.ncdc.noaa.gov        internet address = 205.167.25.35
    l6.ncdc.noaa.gov        internet address = 205.167.25.36

Why can't Doppler do the same?

Do your systems people indicate that anything has changed regarding
hostname resolution?
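
One way to check this directly on Doppler is to ask the system resolver
for the reverse mapping.  The following is only a minimal sketch,
assuming Python 3 is available on that host; `ptr_name` and
`reverse_lookup` are hypothetical helper names and are not part of the
LDM.  The first shows the name that a reverse (PTR) query asks for, and
the second reports whether the local resolver can map the address.

```python
import ipaddress
import socket


def ptr_name(ip):
    """Return the in-addr.arpa name that a reverse (PTR) query asks for."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the hostname for ip, or None if the resolver can't map it.

    `resolver` defaults to the system resolver; it is a parameter only so
    that the function can be exercised without network access.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except OSError:  # socket.herror and socket.gaierror are subclasses
        return None
    return hostname
```

If `reverse_lookup("192.67.134.137")` returns None on Doppler but a
hostname on the other local LDM hosts, that would point at Doppler's
resolver configuration rather than at the LDM.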

> Sep 19 12:53:13 doppler 192.67.134.137[38840]: Delay: 580812.2104 sec
> Sep 19 12:53:13 doppler 192.67.134.137[38840]: pq_sequence(): 
> time(insert)-time(create): 4621.4423 s
> Sep 19 12:53:13 doppler 192.67.134.137[38840]: cursor reset: stop searching
> Sep 19 12:53:13 doppler 192.67.134.137[38840]: Desired product class: 
> 20030919115313.485 TS_ENDT {{FSL5,  ".*"}}
> Sep 19 12:53:13 doppler 192.67.134.137[38840]: Connected to upstream LDM-6
> Sep 19 12:53:13 doppler 192.67.134.137[38840]: requester6.c:274: Calling 
> feedme_6(...)
> Sep 19 12:53:13 doppler 192.67.134.137[38840]: Upstream LDM is willing to feed
> Sep 19 12:53:13 doppler 192.67.134.137[38840]: requester6.c:524: Calling 
> run_service()
> Sep 19 12:53:13 doppler 192.67.134.137[38840]: requester6.c:187: Downstream 
> LDM initialized
> Sep 19 12:53:24 doppler 192.67.134.137[38840]: ERROR: requester6.c:206: 
> Connection to upstream LDM closed
> Sep 19 12:53:24 doppler 192.67.134.137[38840]: Sleeping 30 seconds before 
> retrying...

The above is consistent with your previous log messages and with my
hypothesis but, unfortunately, doesn't add anything new.  I suspect that
the LDM logfile on Humboldt contains something like the following:

    Sep 19 12:53:24 humboldt doppler(feed)[26833]: topo:  doppler.ncdc.noaa.gov WSI
    Sep 19 12:53:24 humboldt doppler(feed)[26833]: up6.c:168: HEREIS: RPC: Unable to send; errno = Broken pipe
    Sep 19 12:53:24 humboldt doppler(feed)[26833]: up6.c:369: Product send failure: Input/output error
    Sep 19 12:53:24 humboldt rpc.ldmd[19596]: child 26833 exited with status 6

Were you able to successfully execute some other client software utility
on Reflect that connects to server software on Doppler (e.g., ssh(1),
telnet(1), rlogin(1))?
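
If none of those utilities is convenient, a bare TCP connection attempt
gives a similar answer.  This is only a sketch, assuming Python 3 on the
client host; `can_connect` is a hypothetical helper, and the default
port of 388 is taken from the "port 388" line in your log.  It tests
only whether a connection can be opened, not whether data continues to
flow afterwards.

```python
import socket


def can_connect(host, port=388, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        # The connection is closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or unresolvable
        return False
```

For example, `can_connect("doppler")` run on Reflect would show whether
the LDM port on Doppler is reachable from there.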

Regards,
Steve Emmerson