
20020312: thelma has gone south



>From: Unidata User Support <address@hidden>
>Organization: Unidata Program Center/UCAR
>Keywords:

Mike,

Thelma stopped feeding atm the FNEXRAD feed yesterday around 19 Z.
Running a notifyme on atm to thelma this morning produces:

% notifyme -vxl- -f FNEXRAD -o 3600 -p pnga2area -h thelma.ucar.edu
Mar 12 14:39:53 notifyme[14143]: Starting Up: thelma.ucar.edu: 20020312133953.349 TS_ENDT {{FNEXRAD,  "pnga2area"}}
        NOTIFYME(thelma.ucar.edu) returns OK
Mar 12 14:39:53 notifyme[14143]: NOTIFYME(thelma.ucar.edu): OK
Mar 12 14:39:53 notifyme[14143]: Connection reset by peer
Mar 12 14:39:53 notifyme[14143]: Disconnect

Attempts to rlogin to thelma from laraine (as root) get nowhere.  It
appears that thelma is once again out in never-never land (sigh).

Tom

>From: "Mike Schmidt" <address@hidden>
>Date: Tue, 12 Mar 2002 08:15:02 -0700

Tom,

I have an active login to thelma and it is responsive via ssh and rlogin
from root on laraine.  I'm not seeing an OS issue right now, do we need
to restart the LDM?

mike

>From address@hidden Tue Mar 12 08:24:31 2002

Mike,

re: thelma gone south

>I have an active login to thelma and it is responsive via ssh and rlogin
>from root on laraine.  I'm not seeing an OS issue right now, do we need
>to restart the LDM?

Right after reading this, I once again did the 'su -' on laraine and
rlogin to thelma.  This time it works!?

As ldm on thelma, a notifyme to myself doesn't work:

/local/ldm% notifyme -vxl- -f FNEXRAD -o 3600
Mar 12 15:17:41 notifyme[23780]: Starting Up: localhost: 20020312141741.571 TS_ENDT {{FNEXRAD,  ".*"}}
        NOTIFYME(localhost) returns OK
Mar 12 15:17:41 notifyme[23780]: NOTIFYME(localhost): OK
Mar 12 15:17:41 notifyme[23780]: Connection reset by peer
Mar 12 15:17:41 notifyme[23780]: Disconnect

I can't explain this, but will try shutting down the LDM and then restarting
it.

Right after running 'ldmadmin stop' I tried running 'top' and it just hangs.
Also, I can't kill the 'top' attempt with CTRL-C or put it in the background
with CTRL-Z.

This is exactly like the situation a couple of weeks ago with the exception
that I was able to rlogin as root from laraine.

Can you verify that you see the same thing when attempting to run top,
etc.?  If the answer is that you do see the same things, then a reboot
seems in order to me unless you know of something else to try.

Tom

>From address@hidden Tue Mar 12 08:41:44 2002

Mike,

So, I see that the LDM has been restarted on thelma, and I now have
control of my session again and can run top.  What happened?

Tom

>From address@hidden Tue Mar 12 08:44:56 2002

Tom,

Operating system wise, things were working fine for me.  The LDM was not
running when I looked, so I restarted it...

mike

>From address@hidden Tue Mar 12 08:47:51 2002

Tom,

I see a series of the following in the ldmd.log.1 on thelma:

Mar 12 00:00:05 thelma ldm(feed)[22099]: topo:  ldm.comet.ucar.edu NNEXRAD|GEM
Mar 12 00:00:05 thelma ldm(feed)[22099]: mmap: 0 0 1980907520: Not enough space
Mar 12 00:00:05 thelma ldm(feed)[22099]: pq_open failed: /usr/local/ldm/data/ldm.pq: Not enough space
Mar 12 00:00:05 thelma ldm(feed)[22099]: Exiting

mike
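
For anyone who wants to check a log for the same failure, here is a minimal
sketch in Python. It assumes the "mmap: ..." line format seen in the excerpt
above, and it assumes the third number on that line is the requested mapping
length in bytes (the log itself does not label the fields). The function name
and the sample list are made up for illustration; they are not part of the LDM.

```python
import re

# Pattern assumed from the log excerpt: "mmap: <n> <n> <n>: <error text>".
# Treating the third number as the mapping length is an assumption.
MMAP_FAIL = re.compile(r"mmap: (\d+) (\d+) (\d+): (.+)$")


def find_mmap_failures(lines):
    """Return (assumed_size_in_bytes, error_text) for each mmap failure line."""
    failures = []
    for line in lines:
        match = MMAP_FAIL.search(line)
        if match:
            failures.append((int(match.group(3)), match.group(4).strip()))
    return failures


# Sample lines copied from the ldmd.log.1 excerpt in this thread.
sample = [
    "Mar 12 00:00:05 thelma ldm(feed)[22099]: topo:  ldm.comet.ucar.edu NNEXRAD|GEM",
    "Mar 12 00:00:05 thelma ldm(feed)[22099]: mmap: 0 0 1980907520: Not enough space",
    "Mar 12 00:00:05 thelma ldm(feed)[22099]: Exiting",
]

for size, error in find_mmap_failures(sample):
    # Report the size in GiB as well, to make the scale easier to read.
    print(f"{size} bytes ({size / 2**30:.2f} GiB): {error}")
```

If the size assumption holds, the feeder was trying to map a little under
2 GiB for the product queue when the mmap call failed.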