
[IDD #BIG-478502]: LDM not starting at SIO

Hi Mary,

> Nice to see it has been more than 6 months since I last bothered you :)
> Makes me feel less guilty!


> I had an e-mail this afternoon that our obs file was all missings. So, after
> a little inspecting I see that indeed, we stopped getting obs between 6 and
> 7 this morning (Wednesday, August 25th).


> I'm not sure if this means I need to restart ldm ? or ? It seems it's been
> since last year that we rebooted the machine (linux workstation). Seems
> overdue for such a thing.

I would suspect something else...

> I'm a little concerned because we had a DNS change
> yesterday (one of our UCSD servers was shut down). Our sys admin updated the
> resolv.conf file and firewall. And of course we did get data after this
> change -- so can't say they are directly related (but close enough I thought
> worth a mention).

Ah Ha!  I am willing to bet that the problem is related to name resolution
on your system, aeolus.ucsd.edu.

(Few minutes later)
Yup, the problem appears to be name resolution on your machine.  Here are
representative messages from the ~ldm/logs/ldmd.log file:

Aug 26 13:40:38 aeolus idd.unidata.ucar.edu[3445] WARN: Couldn't resolve "idd.unidata.ucar.edu" to an Internet address in 40.078 seconds
Aug 26 13:40:38 aeolus idd.unidata.ucar.edu[3445] ERROR: Disconnecting due to LDM failure; Couldn't get IP address of host idd.unidata.ucar.edu; try again

Since your machine can't resolve idd.unidata.ucar.edu into an IP address, the
LDM does not know who to contact for the ~ldm/etc/ldmd.conf REQUEST(s).  The
strange thing about this is that I could manually resolve the IP address of
idd.unidata.ucar.edu:

[ldm@aeolus ~]$ nslookup idd.unidata.ucar.edu

Non-authoritative answer:
Name:   idd.unidata.ucar.edu
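
A similar check can be made from a script.  Unlike nslookup, which queries
the DNS server directly, the sketch below goes through the system resolver
(getaddrinfo), which is the path most programs use.  It is only an
illustration: it assumes Python 3 is available on aeolus, and the function
name time_resolution is made up for this example.

```python
import socket
import time


def time_resolution(hostname):
    """Resolve hostname through the system resolver.

    Returns (sorted list of addresses, elapsed seconds); the list is
    empty if resolution fails.
    """
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        infos = []
    elapsed = time.monotonic() - start
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address as a string.
    addresses = sorted({info[4][0] for info in infos})
    return addresses, elapsed
```

Calling time_resolution("idd.unidata.ucar.edu") from a newly started
process should behave like the nslookup above, since a new process reads
the current resolv.conf.  As far as I know, older C library resolvers read
that file only once per process, which would be consistent with a
long-running LDM still failing while a fresh lookup succeeds.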

It is most likely that the LDM needed to be stopped/restarted after the
DNS/resolv.conf change made yesterday.  I did the stop and start:

<as 'ldm'>
ldmadmin stop
-- needed to manually kill 'rtstats'

The LDM restarted on its own before I could run 'ldmadmin start'.  Have you
instrumented aeolus to check whether the LDM is running and restart it if it
is not?
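
For reference, a check of that kind can be quite small.  The sketch below
is not what is installed on aeolus (I have not looked); it assumes that
'ldmadmin isrunning' exits with status 0 when the LDM is up, which should
be confirmed against the ldmadmin usage message for your LDM version.  The
function name ensure_running and its parameters are hypothetical.

```python
import subprocess


def ensure_running(check_cmd=("ldmadmin", "isrunning"),
                   start_cmd=("ldmadmin", "start"),
                   run=subprocess.call):
    """Start the LDM if the check command reports it is not running.

    Returns "running" if nothing needed doing, "restarted" if the start
    command succeeded, and "failed" otherwise.
    """
    # Assumption: check_cmd exits 0 when the LDM is running.
    if run(list(check_cmd)) == 0:
        return "running"
    if run(list(start_cmd)) == 0:
        return "restarted"
    return "failed"
```

Something like this could be run every few minutes from the 'ldm' user's
crontab.  Note that it would not have caught this outage: the LDM was
running the whole time, it just could not resolve the upstream host name.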

After the LDM restart, the data appears to be coming in:

Aug 26 13:50:25 aeolus idd.unidata.ucar.edu[26074] NOTE: Upstream LDM-6 on idd.unidata.ucar.edu is willing to be a primary feeder
Aug 26 13:50:26 aeolus atmos.ucsd.edu(feed)[26075] NOTE: Starting Up(6.7.0/6): 20100826125025.056 TS_ENDT {{NXRDSRC|NEXRAD2|NEXRAD3|NPORT|NIMAGE|WMO,  ".*"}}, SIG=724b9d003258b305bb40fe532ba98ca9, Primary

> I don't know the magic of ldm and how data gets from that feed into the
> files that are created and distributed through the /weather mount on aeolus
> (our ldm machine). But, specifically, I'm looking at metar data not coming
> in which is stored at: /weather/observations/metar
> (especially files 10082500.metar -- looks good; got obs and
> 10082507.metar -- too small (and no files for
> many hours Aug 25th)

No data was flowing since the LDM could not do name resolution for the
upstream feed host.
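
If you want to spot this condition yourself in the future, the LDM log can
be scanned for the resolution warning.  A rough sketch follows; the pattern
is copied from the WARN message quoted earlier in this reply, other LDM
versions may word it differently, and the function name
count_resolution_failures is made up for illustration.

```python
import re
from collections import Counter

# Pattern taken from the WARN line in ldmd.log; treat it as an assumption
# for any LDM version other than the one that produced that line.
RESOLVE_FAILURE = re.compile(
    r'WARN: Couldn\'t resolve "([^"]+)" to an Internet address')


def count_resolution_failures(lines):
    """Count name-resolution warnings per upstream host."""
    counts = Counter()
    for line in lines:
        match = RESOLVE_FAILURE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

It accepts any iterable of lines, so an open file handle on
~ldm/logs/ldmd.log can be passed in directly.  A non-empty result for the
host named in your ldmd.conf REQUEST lines would point at name resolution
rather than at the feed itself.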

> Maybe this is TMI :)
> Please let me know what you think!
> As always, appreciate your help!

I believe that things are working now.  Please keep an eye on things and
let us know if you are still seeing problems.


Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
Unidata HomePage                       http://www.unidata.ucar.edu

Ticket Details
Ticket ID: BIG-478502
Department: Support IDD
Priority: Normal
Status: Closed