20041022: ldmd.conf error messages



>From: Robert Dewey <address@hidden>
>Organization: ?
>Keywords: 200410220550.i9M5oYvV023578 GEMPAK

Robert,

>I keep getting the following error messages in my ldmd.log file:
>
>Oct 22 01:46:02 LDM pqact[3874]: pbuf_flush 14: time elapsed   2.195818
>Oct 22 02:20:19 LDM pqact[3874]: child 7190 exited with status 1
>Oct 22 02:20:24 LDM pqact[3874]: pbuf_flush 13: time elapsed   2.651371
>Oct 22 04:21:01 LDM pqact[3874]: pbuf_flush (7) write: Broken pipe
>Oct 22 04:21:01 LDM pqact[3874]: pipe_dbufput:
>decoders/dcgrib2-v1-ddata/gempak/logs/dcgrib.log-eGEMTBL=/home/gempak/GEMPAK5.7.2p2/gempak/tables
>write error

This indicates that the LDM process 'pqact' couldn't write the full
product it was processing to the decoder run by the decoders/dcgrib2
action in your ~ldm/etc/pqact.conf file.  There are several reasons why
the write might fail:

- the /home/gempak/GEMPAK5.7.2p2/gempak/tables directory is not readable
  by the user running your LDM

- the directory into which dcgrib2 is trying to write its output is not
  writable by the user running your LDM

- decoders/dcgrib2 is not executable by the user running the LDM (for
  example, because it can't be found)

I have seen several instances where the permissions on the GEMPAK home
directory did not allow the user running the LDM to read the GEMPAK
tables needed by the GEMPAK decoders, so I would look there first.
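
The three checks above can be sketched in Python, to be run as the user
that runs the LDM.  This is only a rough sketch under stated assumptions:
the function name is hypothetical, the output directory is a placeholder
(use whatever your pqact.conf entry actually writes to), and the decoder
is assumed to live under the LDM home directory.  The tables path is the
one from your log message.  Note that os.access only reports what the
permission bits allow, so a decoder can still fail for other reasons.

```python
import os


def check_decoder_access(tables_dir, output_dir, decoder):
    """Return the three permission checks for the current user."""
    return {
        # GEMPAK tables directory must be readable and searchable
        "tables_readable": os.access(tables_dir, os.R_OK | os.X_OK),
        # the decoder's output directory must be writable and searchable
        "output_writable": os.access(output_dir, os.W_OK | os.X_OK),
        # the decoder must exist as a file and be executable
        "decoder_executable": os.path.isfile(decoder)
        and os.access(decoder, os.X_OK),
    }


if __name__ == "__main__":
    ldm_home = os.path.expanduser("~")  # assumes this runs as the LDM user
    results = check_decoder_access(
        "/home/gempak/GEMPAK5.7.2p2/gempak/tables",  # from the log message
        os.path.join(ldm_home, "data", "gempak"),    # hypothetical output dir
        os.path.join(ldm_home, "decoders", "dcgrib2"),  # assumed location
    )
    for name, ok in results.items():
        print(f"{name}: {'ok' if ok else 'FAILED'}")
```

Any line reported as FAILED points at the corresponding item in the list
above.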

 ...
>write error
>Oct 22 05:15:44 LDM pqact[3874]: pipe_prodput: trying again
>Oct 22 05:15:44 LDM pqact[3874]: child 9493 terminated by signal 11

'pqact' will retry a write to a decoder exactly once.  If the second
write fails, it will exit.
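
To illustrate what "retry exactly once" means, here is a rough Python
sketch of such a policy.  It is not the LDM's actual code (pqact is
written in C); the function names are hypothetical, and the sketch
assumes that a retry means starting a fresh decoder process and sending
the whole product again.

```python
import subprocess


def feed_decoder(argv, product):
    """Start the decoder, write the product to its stdin, report success."""
    child = subprocess.Popen(argv, stdin=subprocess.PIPE, bufsize=0)
    wrote_all = True
    try:
        view = memoryview(product)
        while view:
            # unbuffered write; may accept fewer bytes than offered
            n = child.stdin.write(view)
            view = view[n:]
    except BrokenPipeError:
        # the decoder went away before reading everything ("Broken pipe")
        wrote_all = False
    finally:
        child.stdin.close()
    status = child.wait()
    return wrote_all and status == 0


def write_with_one_retry(argv, product):
    """Try once, then retry exactly once; give up after the second failure."""
    for _attempt in (1, 2):
        if feed_decoder(argv, product):
            return True
    return False
```

With a decoder that keeps failing (for example, one that cannot read its
tables), both attempts fail and the product is not decoded.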

>I then looked up the dcgrib.log file, to see what the output in there 
>was for the times listed above:
>
>0515Z Entries:
>[9493] 041022/0515 [DCGRIB -53] no file template [7.0 125 253]
>[9493] 041022/0515 [DCGRIB -53] no file template [7.0 125 253]
>[9493] 041022/0515 [DCGRIB -53] no file template [7.0 125 253]


This looks like the user running the LDM can't read/find the
template file specified in the pqact.conf entry.

>I seem to get a lot of "grid too large/bulletin too long", "no file 
>template", and "no command line arguments found" errors in dcgrib.log.
>
>Everything seems to be running smoothly as far as data goes, but the 
>error messages worry me that the LDM could crash at any moment.

The messages you are seeing are from GEMPAK decoders, not the LDM.
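
To see which decoder messages dominate, the dcgrib.log entries can be
tallied by message.  The following is a minimal Python sketch that
assumes every entry has the same layout as the lines you quoted
(process ID, date/time, decoder tag with a numeric code, then the
message); the function name and the regular expression are my own and
are not part of GEMPAK.

```python
import re
from collections import Counter

# Matches lines like:
# [9493] 041022/0515 [DCGRIB -53] no file template [7.0 125 253]
LINE_RE = re.compile(
    r"^\[(\d+)\]\s+(\d{6}/\d{4})\s+\[(\w+)\s+(-?\d+)\]\s+(.*)$"
)


def tally_messages(lines):
    """Count log entries by (decoder tag, code, message text)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not look like decoder entries
        _pid, _when, tag, code, text = m.groups()
        # drop a trailing bracketed detail such as "[7.0 125 253]"
        text = re.sub(r"\s*\[[^\]]*\]\s*$", "", text)
        counts[(tag, int(code), text)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "[9493] 041022/0515 [DCGRIB -53] no file template [7.0 125 253]",
        "[9493] 041022/0515 [DCGRIB -53] no file template [7.0 125 253]",
    ]
    for (tag, code, text), n in tally_messages(sample).most_common():
        print(n, tag, code, text)
```

Passing the lines of the real log file instead of the sample list gives a
count per message type, which makes it easier to see whether one problem
(such as the missing file template) accounts for most of the entries.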

>Our 
>machine was down for about 15 days due to a CAT 5 wire problem. Before 
>the CAT 5 problem, the machine ran for 35 days straight without any 
>snags, and the ldmd.conf file never got bigger than 8KB, meaning there 
>were very few error messages (other than the occasional user starting and 
>stopping the LDM on the upstream machine). Nothing has changed except 
>for the CAT 5 wire, and it's brand new (along with the errors)... If you 
>could shed some light on this, I would be grateful...

Did you upgrade GEMPAK after the outage?

Cheers,

Tom Yoksas
--
NOTE: All email exchanges with Unidata User Support are recorded in the
Unidata inquiry tracking system and then made publicly available
through the web.  If you do not want to have your interactions made
available in this way, you must let us know in each email you send to us.