20050307: LDM status on papagayo

>From: Unidata User Support <address@hidden>
>Organization: Unidata Program Center/UCAR
>Keywords: 200503070449.j274n2v2001291 IDD LDM pqcat

Hi Clint,

After getting the attached email from Pete Pokrandt tonight, I logged
onto papagayo and found that the LDM was not running.  Since the queue
was corrupted by an apparent reboot yesterday, I deleted and remade it,
and then restarted the LDM.  Please see below for details.

I noticed that the LDM did not come up after the reboot yesterday
because the queue check action:

pqcat -s -l /dev/null

was and still is hung, and is chewing up CPU cycles:

load averages:  4.20,  4.27,  3.04                                 02:10:05
131 processes: 127 sleeping, 1 zombie, 3 on cpu
CPU states:     % idle,     % user,     % kernel,     % iowait,     % swap
Memory: 4096M real, 2651M free, 483M swap in use, 4743M swap free

   936 root       1  10    0    0K    0K cpu/0   33.4H 21.63% pqcat

I am unable to kill it since it is being run by 'root'.  Please
kill this as soon as you get a chance.
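
In case it helps, here is a rough sketch of one way to pick out the hung
process as 'root'.  The helper name root_pqcat_pids is made up for this
note and is not part of the LDM; it just filters "PID USER COMMAND"
lines such as those printed by 'ps -eo pid,user,comm'.

```shell
# Hypothetical helper (not part of the LDM): read "PID USER COMMAND"
# lines on stdin and print the PID of every pqcat owned by root.
root_pqcat_pids() {
    awk '$2 == "root" && $3 ~ /(^|\/)pqcat$/ { print $1 }'
}

# Intended use, as root:
#   ps -eo pid,user,comm | root_pqcat_pids
#   kill <pid>       # plain TERM first
#   kill -9 <pid>    # only if the process ignores TERM
```

The PID in the top listing above is 936, but please check that it is
unchanged before killing anything.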

I noticed that the LDM queue is too small (400M) to hold even a small
fraction of an hour's worth of data given the volume that papagayo is
ingesting.  Since you have enough space in /data, I took the liberty of
increasing the queue size to 2GB when I restarted the LDM:

<as 'ldm'>
cd ~ldm/etc
-- edit ldmadmin-pl.conf and change $pq_size from "400M" to "2G"
cd ~ldm
ldmadmin delqueue
ldmadmin mkqueue -f
ldmadmin start
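
For a rough feel for why the queue size matters, the sketch below
estimates how many minutes of data a queue can hold at a given ingest
rate.  The function name and the 4000 MB/hour rate are assumptions made
up for illustration; I have not measured papagayo's actual volume here.

```shell
# Minutes of data a queue can hold: size (MB) * 60 / ingest rate (MB/hour).
# Integer arithmetic, so the result is rounded down.
queue_minutes() {
    size_mb=$1
    rate_mb_per_hour=$2
    echo $(( size_mb * 60 / rate_mb_per_hour ))
}

# Hypothetical ingest rate of 4000 MB/hour (NOT measured on papagayo):
queue_minutes 400 4000     # old 400M queue
queue_minutes 2048 4000    # new 2G queue
```

At that assumed rate the old queue would hold about 6 minutes of data
and the new one about 30.  Substituting the real hourly volume gives
the actual residence time.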

Here is the message Pete sent earlier tonight:

  From address@hidden  Sun Mar  6 21:49:02 2005
  We feed NIMAGE data from papagayo.unl.edu (I don't have the
  contact for them available..) We haven't seen any NIMAGE data
  since about 22 UTC Friday March 5.  I just noticed now, and 
  flipped over to feed from idd.unidata.ucar.edu (to f5.aos.wisc.edu)
  until we figure out what's up.