
19990701: The CONDUIT data feed



Celia,

The "Not enough space" message seems to indicate that the LDM
tried to increase the queue size and failed. In general, you
should create the queue as large as you expect to need it.
On IRIX, the LDM will try to grow the queue when needed, but
this can conflict with pqexpire, which can corrupt the queue
and kill the LDM.
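
As a rough aid to reading the quoted pqact.log lines, the sketch below
looks up the symbolic name for errno 12 and converts the last number on
the mmap line to MiB. Treating that number as the requested mapping
length in bytes is an assumption; the log itself does not label the
fields.

```python
import errno

# Values copied from the quoted pqact.log lines.
ERRNO_FROM_LOG = 12
# ASSUMPTION: the last field of the "mmap:" line is a length in bytes.
MMAP_LAST_FIELD = 1744732160


def bytes_to_mib(n_bytes):
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return n_bytes / (1024 * 1024)


# errno.errorcode maps numeric errno values to their symbolic names
# on the platform where this script runs.
errno_name = errno.errorcode.get(ERRNO_FROM_LOG, "unknown")
mapping_mib = bytes_to_mib(MMAP_LAST_FIELD)

print("errno", ERRNO_FROM_LOG, "is", errno_name)
print("assumed mapping length: %.1f MiB" % mapping_mib)
```

The message text for a given errno differs between systems, so the
symbolic name is the more portable thing to compare.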

If pqexpire dies, the queue is no longer scoured and keeps
growing out of control. On my IRIX64 machine I create the
queue with ldmadmin, specifying a size of 800 MB. My statistics
show that the high-water mark of this queue is currently about
625 MB. The queue has been serving the NMC2 feed for many months
without being rebuilt.
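
To make the sizing figures above concrete, here is a minimal sketch of
the headroom arithmetic. The function name queue_headroom is
hypothetical and is not part of the LDM; only the 800 MB and 625 MB
figures come from this message.

```python
def queue_headroom(queue_mb, high_water_mb):
    """Return (free_mb, percent_used) for a queue of size queue_mb
    whose peak usage so far was high_water_mb."""
    if queue_mb <= 0:
        raise ValueError("queue size must be positive")
    free_mb = queue_mb - high_water_mb
    percent_used = 100.0 * high_water_mb / queue_mb
    return free_mb, percent_used


# Figures from the message: an 800 MB queue with a 625 MB high-water mark.
free_mb, pct_used = queue_headroom(800, 625)
print("free: %d MB, peak usage: %.1f%%" % (free_mb, pct_used))
```

With these numbers the queue peaks at roughly 78% of its size, which
leaves about 175 MB before it would need to grow.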

If you have shorter-than-normal MRF files, check the LDM
logs for RECLASS messages. These would be a sign of latencies
greater than 1 hour, and therefore of lost data. The other common
occurrence is that the Cray at NCEP crashes and files are late
or scrubbed.
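
One way to do the log check is a plain substring search, sketched
below. The helper name find_reclass_lines and the second sample line
are hypothetical. The exact wording of a real RECLASS message is not
shown in this thread, so the sketch matches only on the word itself.

```python
def find_reclass_lines(lines):
    """Return the lines that mention RECLASS, with trailing newlines removed."""
    return [line.rstrip("\n") for line in lines if "RECLASS" in line]


# Sample input: the first line is from the quoted pqact.log; the second
# is a made-up placeholder, not real LDM output.
sample = [
    "Jun 29 07:57:18 pqact[2235]: Exiting\n",
    "hypothetical log line mentioning RECLASS\n",
]

hits = find_reclass_lines(sample)
for hit in hits:
    print(hit)
```

The function accepts any iterable of lines, so an open log file object
can be passed in directly instead of the sample list.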

Steve Chiswell
Unidata User Support




>From: address@hidden (Celia Chen)
>Organization: .
>Keywords: 199907011817.MAA21465

>I started receiving the MRF grid #3 data on iita2.rap.ucar.edu
>on 6/24/99 and feeding WITI on 6/25/99.  It looks like
>the data was coming in normally for a few days.  We have just
>noticed that some data files that came in on 6/28 and 6/29 are
>much smaller than normal. Then iita2 stopped saving data to
>disk on 6/29, while WITI was able to continue archiving data
>until today. (See below)
>
>------------------
>/iita/data/ldm/mrf
>
>-rw-rw-r--    1 ldm      ldm      23534364 Jun 27 02:33 99062700132_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      23374262 Jun 27 02:36 99062700144_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      23473400 Jun 27 02:39 99062700156_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      23492248 Jun 27 02:42 99062700168_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      23232058 Jun 27 01:38 9906270024_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      23332132 Jun 27 01:43 9906270036_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      23315028 Jun 27 01:47 9906270048_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      23323330 Jun 27 01:57 9906270060_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      23264214 Jun 27 02:02 9906270072_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      23431280 Jun 27 02:16 9906270084_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      23394558 Jun 27 02:22 9906270096_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      19546092 Jun 28 01:50 9906280000_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      9342436 Jun 28 02:02 99062800108_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      9352474 Jun 28 02:07 99062800120_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      23330756 Jun 28 01:54 9906280012_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      9368596 Jun 28 02:10 99062800132_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      9283910 Jun 28 02:13 99062800144_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      9398850 Jun 28 02:14 99062800156_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      11024120 Jun 28 02:58 99062800168_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      23236234 Jun 28 01:58 9906280024_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      21756684 Jun 28 02:02 9906280036_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      23275368 Jun 28 02:07 9906280048_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      9348664 Jun 28 01:40 9906280060_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      9340868 Jun 28 01:45 9906280072_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      9365856 Jun 28 01:56 9906280084_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      9324932 Jun 28 01:58 9906280096_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      8163116 Jun 29 01:26 9906290000_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      9327876 Jun 29 01:29 9906290012_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      15553538 Jun 29 01:53 9906290024_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      17078040 Jun 29 01:55 9906290036_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      9364992 Jun 29 01:42 9906290048_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      9364588 Jun 29 01:53 9906290060_PGrbF.mrf
>-rw-rw-r--    1 ldm      ldm      5715248 Jun 29 01:55 9906290072_PGrbF.mrf
>------------------
>
>There is this "Not enough space" message in pqact.log:
>
>------------------
>Jun 24 22:35:54 pqact[2235]: Starting Up
>Jun 29 07:57:18 pqact[2235]: mmap: 18040000 0 1744732160: Not enough space
>Jun 29 07:57:18 pqact[2235]: Remap failed. Abandon all hope.
>Jun 29 07:57:18 pqact[2235]: pq_sequence failed: Not enough space (errno = 12)
>Jun 29 07:57:18 pqact[2235]: Exiting
>------------------
>
>It looks like there is enough disk space to store the MRF data on
>iita2 at this point:
>
>-----------
>iita2|22|% df /iita
>Filesystem             Type  blocks     use     avail  %use Mounted on
>/dev/dsk/xlv/xlviita     xfs 14163224 11989424  2173800  85  /iita
>
>-----------
>What could be the cause of the problems we see here? Please advise.
>
>Thanks.
>
>Celia
>