
20050613: 'exited with status 127' in log file



Mark,

If you are not rotating your log files, they will keep growing in
size.

I provide an example script to rotate your logs,
$NAWIPS/bin/scripts/dcrotatelog.csh,
which can be copied to the LDM user's ~ldm/util directory
and run from the LDM user's crontab as:
# rotate GEMPAK logs
0 17 * * * util/dcrotatelog.csh >/dev/null 2>&1

The script may have to be modified if your LDMHOME is not ~ldm,
or your Gemenviron file is not linked from ~gempak/Gemenviron.

The script uses the LDM distribution's ~ldm/bin/newlog command
to rotate the log files, keeping 2 days by default.
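
To make the idea of rotation concrete, here is a minimal sketch of the
usual "shift and truncate" scheme. It is not the contents of
dcrotatelog.csh or newlog; the function name rotate_log and its
arguments are made up for illustration, and the default of 2 kept
copies simply mirrors the 2 days mentioned above.

```shell
#!/bin/sh
# Illustrative sketch only: NOT the actual dcrotatelog.csh or newlog code.
# rotate_log FILE [KEEP]
#   Shifts FILE -> FILE.1 -> FILE.2 ... and keeps at most KEEP old copies
#   (KEEP defaults to 2), then starts a fresh, empty FILE.
rotate_log() {
    log=$1
    keep=${2:-2}

    # Nothing to do if the log does not exist yet.
    [ -f "$log" ] || return 0

    # Drop the oldest copy, then shift the remaining copies up by one.
    rm -f "$log.$keep"
    i=$keep
    while [ "$i" -gt 1 ]; do
        prev=$((i - 1))
        [ -f "$log.$prev" ] && mv "$log.$prev" "$log.$i"
        i=$prev
    done

    # The current log becomes copy number 1; create a new empty log.
    mv "$log" "$log.1"
    : > "$log"
}
```

Note that a process which already has the log open keeps writing to
the renamed copy until it reopens its log file.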

As for which program is generating the exit status 127: if it is a
GEMPAK decoder, you can grep the $GEMDATA/log/*.log files for the
process number of a decoder that matches the child process ID reported
in the exit message. Otherwise, you can issue a "kill -USR2" to the
pqact process to switch it to verbose logging in your ldmd.log file,
and see which action is being processed when the child exits.
(Issue a second kill -USR2 to cycle to debug logging, and a third to go
back to normal logging.)
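
The search described above could be sketched roughly as follows. The
function names list_failed_children and find_pid_in_logs are
hypothetical helpers, not part of the LDM or GEMPAK. The sketch assumes
the ldmd.log lines end with "child N exited with status S", as in your
example, and that the decoder logs record the process ID in square
brackets; check a few lines of your own logs before relying on that.

```shell
#!/bin/sh
# Illustrative sketch; both helper names are hypothetical.

# list_failed_children LDMD_LOG [STATUS]
#   Print the unique child PIDs that pqact reported as exiting with
#   STATUS (default 127).
list_failed_children() {
    status=${2:-127}
    awk -v s="$status" \
        '/child [0-9]+ exited with status/ && $NF == s { print $(NF - 4) }' \
        "$1" | sort -un
}

# find_pid_in_logs PID LOGDIR
#   List the *.log files in LOGDIR that contain the literal text "[PID]".
#   ASSUMPTION: the decoder logs write the PID in square brackets.
find_pid_in_logs() {
    grep -F -l "[$1]" "$2"/*.log 2>/dev/null
}

# Example use (paths are placeholders for your own):
#   for pid in $(list_failed_children /path/to/ldmd.log); do
#       find_pid_in_logs "$pid" /path/to/decoder/logs
#   done
```

If no decoder log mentions the PID, the child was probably not a GEMPAK
decoder, and the verbose logging route is the next step. In your
process listing, the pqact running etc/pqact.gempak is PID 4348, so
that would be "kill -USR2 4348".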

The GEMPAK decoders will stay running as long as the pipe they are
reading from is not closed by the LDM and they have received data within
the past 10 minutes. For the surface decoders such as dcmetr, dcmsfc, and
dctaf, it is very likely that some bulletin will come across at least
every 10 minutes (the decoders have a -t flag to modify the timeout
period). Since the decoders open and close their files based on the data
they are decoding, it is a good thing to keep the decoder running.
Those decoders that have been running since June 3 are
definitely not the ones that exited with status 127.
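
For reference, 127 is the exit status a POSIX shell reports when the
command it was asked to run cannot be found. Whether that is what is
happening in your pqact actions is only a guess on my part, but a
program name or path in the pqact entry that does not resolve for the
LDM user would be worth checking. The snippet below just demonstrates
the shell convention; exit_status_of is a hypothetical helper and
no_such_decoder_hypothetical is a deliberately nonexistent command.

```shell
#!/bin/sh
# Illustrative sketch of the shell's "command not found" status.

# exit_status_of COMMAND_STRING
#   Run COMMAND_STRING through sh, discard its output, print its exit status.
exit_status_of() {
    sh -c "$1" >/dev/null 2>&1
    echo $?
}

exit_status_of 'no_such_decoder_hypothetical -v 2'   # prints 127
exit_status_of 'true'                                # prints 0
```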

Steve Chiswell
Unidata User Support




>From: Mark Seefeldt <address@hidden>
>Organization: UCAR/Unidata
>Keywords: 200506131811.j5DIBdZu020195

>
>Hi!  My ldmd.log files have been littered lately with entries such as 
>the following:
>   Jun 13 17:07:47 foehn pqact[4348]: child 7475 exited with status 127
>I am getting about 20 such errors an hour.
>
>I have made an unsuccessful initial attempt to identify the process 
>which is creating the entries, but have not been able to do so.
>
>I am including below the list of current processes running on foehn by ldm:
>[ldm@foehn logs]$ ps -fu ldm
>UID  PID  PPID  C STIME TTY TIME CMD
>ldm  4345    1  0 Jun03 ?   00:00:00 rpc.ldmd -q /usr/local/ldm/data/
>ldm  4347 4345  0 Jun03 ?   01:10:07 pqact
>ldm  4348 4345  2 Jun03 ?   06:37:25 pqact etc/pqact.gempak
>ldm  4349 4345  0 Jun03 ?   00:52:24 rtstats -h rtstats.unidata.ucar.
>ldm  4350 4345  0 Jun03 ?   00:42:43 rpc.ldmd -q /usr/local/ldm/data/
>ldm  4351 4345  0 Jun03 ?   00:17:20 rpc.ldmd -q /usr/local/ldm/data/
>ldm  4352 4345  0 Jun03 ?   00:06:48 rpc.ldmd -q /usr/local/ldm/data/
>ldm  4353 4345  0 Jun03 ?   00:05:52 rpc.ldmd -q /usr/local/ldm/data/
>ldm  4354 4345  0 Jun03 ?   00:40:30 rpc.ldmd -q /usr/local/ldm/data/
>ldm  4355 4345  0 Jun03 ?   00:00:34 rpc.ldmd -q /usr/local/ldm/data/
>ldm  4358 4348  0 Jun03 ?   00:12:43 decoders/dcmetr -v 2 -a 500 -m 7
>ldm  4367  4348  0 Jun03 ?  00:25:29 decoders/dcmsfc -b 9 -a 10000 -d
>ldm  4374  4348  0 Jun03 ?  00:12:42 decoders/dctaf -d data/gempak/lo
>ldm  9601 4348  0 Jun06 ?   00:00:40 decoders/dcmsfc -d data/gempak/l
>ldm 20572  4348  0 Jun09 ?  00:00:45 decoders/dcuair -b 24 -m 16 -d d
>ldm 12828  4348  0 Jun11 ?  00:01:50 decoders/dcacft -e GEMTBL=/home/
>ldm 14467  4348  0 Jun12 ?  00:01:25 decoders/dcgrib2 -d data/gempak/
>ldm 26484  4348  0 05:42 ?  00:00:14 decoders/dclsfc -v 2 -s lsystns.
>ldm  3820  4348  0 09:49 ?  00:00:00 decoders/dcisig -e GEMTBL=/home/
>ldm  4978  4976  0 10:10 pts/1    00:00:00 -bash
>ldm  6081  6080  0 10:33 pts/3    00:00:00 -bash
>ldm  6899  4348  0 10:52 ?  00:00:01 decoders/dcrdf -v 4 -d data/gemp
>ldm  7067  4348  0 10:57 ?  00:00:03 decoders/dcgrib2 -d data/gempak/
>ldm  7647  7646  0 11:11 pts/4    00:00:00 -bash
>ldm  7755  4978  0 11:13 pts/1    00:00:00 ps -fu ldm
>
>I can understand the pqact processes running from the last time that I 
>started the ldm, but it seems odd to me that some of the decoders are 
>still running since a few days to a week after the ldm was started.
>
>In a related, or possibly unrelated, problem I have noticed that I am 
>creating very large log files in the $GEMDATA/logs directory.  For 
>example the dcmetr .log file is up over 240 MB and the dcrdf.log file is 
>over 13 MB.  Am I correct in concluding this is not normal?
>
>I would appreciate any suggestions as to what I should do from this 
>point.  Let me know if there is any additional information which you 
>could use.
>
>Thanks
>
>Mark Seefeldt
>
--
NOTE: All email exchanges with Unidata User Support are recorded in the
Unidata inquiry tracking system and then made publicly available
through the web.  If you do not want to have your interactions made
available in this way, you must let us know in each email you send to us.