
20020423: SSEC Unidata machine down



>From: "Kevin Polston" <address@hidden>
>Organization: NOAA/NWS
>Keywords: 200204230009.g3N09Ia00698 IDD Unidata-Wisconsin

Kevin,

re: SSEC Unidata-Wisconsin machine down

>Does this mean all of the servers won't work?

No, not at all.

>The reason I ask is
>this....the change you made the other day for me to access data through
>atm.geo.nsf.gov has worked perfectly. Very timely, no problems in
>getting the data, etc. Papagayo still seems a little slow sometimes even
>without getting the satellite data.

It just so happened that someone from UNL was here in my office on Friday.
We talked about networking at UNL, and he commented that the network
there seems to be heavily congested in the afternoons, typically
Tuesday through Thursday.

>However....I am now having a problem where my ldm won't stay up and running.
>I am wondering if this is related to the unidata server being down.

This is not caused by the lack of Unidata-Wisconsin data.

>I am hoping this is the case
>because as I said....ldm will not stay running. I am running a notifyme
>to atm.geo.nsf.gov and it is scrolling products by however. So that has
>me a little concerned. I re-booted my machine thinking that might help
>clear out the problem but it did not and I'm not sure it would make a
>difference.  I've sent my last two log files for you to take a look at. 
>Looking for your enlightenment.

A quick look at the first LDM log file you sent tells me that you may
need to stop your LDM, then delete and remake your queue.  Why this
may be needed is not immediately obvious.

The steps are:

o stop your LDM:

  ldmadmin stop

o wait until all LDM processes exit

o delete and remake the queue:

  ldmadmin delqueue
  ldmadmin mkqueue

o restart your LDM:

  ldmadmin start
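
If you would rather script this, the steps above can be sketched as a
small shell function.  This is only a sketch under some assumptions: it
is run as the LDM user, pgrep is available, and the LDM server process
is named rpc.ldmd (check with ps on your system and adjust).  The names
remake_queue and LDM_PROC are made up for this example.

```shell
#!/bin/sh
# Sketch of the stop / delete queue / make queue / start sequence.
# Assumption: the LDM server process is named "rpc.ldmd"; override
# LDM_PROC if your installation uses a different name.
LDM_PROC="${LDM_PROC:-rpc.ldmd}"

remake_queue() {
    # Stop the LDM; it may already be down, so only warn on failure.
    ldmadmin stop || echo "ldmadmin stop reported an error (LDM may already be down)" >&2

    # Wait until no LDM server processes remain before touching the queue.
    while pgrep -x "$LDM_PROC" >/dev/null 2>&1; do
        sleep 1
    done

    # Delete and remake the queue; give up if either step fails.
    ldmadmin delqueue || return 1
    ldmadmin mkqueue || return 1

    # Restart the LDM.
    ldmadmin start
}
```

After sourcing the file, run remake_queue as the LDM user.  The wait
loop is there because the queue should not be deleted while any LDM
process still has it open.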

>From address@hidden Mon Apr 22 18:20:55 2002

Tom,

>I just tried running the ldm again and here is what happened. Ldm
>started but when it started to request something from papagayo then it
>got this strange response (if I remember correctly the same thing
>happened when it tried to request something from atm.geo.nsf.gov). Here
>is the log.

The file you attached was your ~ldm/etc/ldmd.conf file, not the log file.

>Right after that happened then the ldm shutdown.  If this is
>related to the other server going down I understand....but if it's not
>then what the heck happened?

I suspect that the queue got corrupted somehow.

>I haven't touched anything all day on this
>machine.  Unfortunately I think it is something that has happened to me
>since I just tried pinging both locations and I am getting a response.

Getting data from multiple locations is a well-used feature of the LDM.
In fact, it is a rare site that feeds from only one upstream host.

>I wish I knew how to tell you to get into my machine. Hope the log files
>help you out.

Try remaking the queue as I indicate above.

Also, can you explain why your machine's address is a ".com"
(mkc-65-30-96-123.kc.rr.com) and not a ".gov"?  I did a lookup on
rr.com and found that it is an enterprise that typically services the
residential market through cable modems:

http://www.rr.com/rdrun/

"Road Runner is... a high speed, online service that provides
lightening-fast access to the Internet as well as to unique broadband
content and services. Road Runner is delivered to your computer over
the same upgraded cable systems that currently bring cable television
into your home. Cutting-edge technology advancements, implemented over
comprehensive regional and national networks get you to your Internet
destinations in seconds. "

Tom