
[LDM #SWT-100004]: pqact decoding jobs won't quit



Hi Brian,

re: 
> I continue to setup the new Solaris10 machine for ldm ingest and decoding
> (thus all my emails lately). The ingest is working, and even the decoding
> it appears, but I am getting a ton of jobs that just don't want to die
> (listed in ps -ef: please see below). Some of these are my own personal
> perl scripts for sao decoding, but there are some gempak thrown in as
> well. I did not have this problem on my other SUN running Solaris8, and
> the only other difference is that I am running with the latest LDM and
> decoder packages. If I run on the command line, job finish ok, so
> this is tough to debug. If I stop the ldm, all the processes go away,
> so they must be tied to the pqact somehow. For example, a sample pqact is
> also below. Please let me know if you know what can cause this type of
> behavior. The machine has 2Gb of RAM and 3Gb of swap. My only other choice
> is to stop/start the ldm every day to clean.
> 
> WMO     (^S[IM]V[IGNS])|(^SNV[INS])|(^S[IMN](W[KZ]|[^VW]))
> PIPE    /data1/ldm/NAWIPS-5.9.3/os/sol/bin/dclsfc
> -d /data1/ldm/logs/dclsfc.log
> -e GEMTBL=/data1/ldm/NAWIPS-5.9.3/gempak/tables
> data/gempak/surface/YYYYMMDD_syn.gem
> 
> WMO     ^SAUS.* K([^W].|W[^B]). ([0-3][0-9])([0-2][0-9])([0-5][0-9])
> PIPE    saous.ldm.dec
> data/surface/decoded/saous/(\2:yyyy)(\2:mm)\2\3.saous
> 

The GEMPAK dclsfc decoder is supposed to continue running so that new
data PIPEd into it gets decoded with the same invocation.

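As for your own Perl-based decoder: if its logic is to exit after all
of the STDIN data has been read, you could add the '-close' option to
your PIPE action. With '-close', pqact closes the pipe after writing each
product, so the decoder sees end-of-file and can exit: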

WMO     ^SAUS.* K([^W].|W[^B]). ([0-3][0-9])([0-2][0-9])([0-5][0-9])
    PIPE    -close  saous.ldm.dec
    data/surface/decoded/saous/(\2:yyyy)(\2:mm)\2\3.saous

Only add '-close' if your decoder is supposed to act on one product and
then exit. Without it, pqact keeps the pipe open, so a decoder that waits
for end-of-file on STDIN keeps running.
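
To make the "one product, then exit" pattern concrete, here is a minimal
sketch of such a decoder. It is written in Python rather than Perl purely
for illustration, and it is not your saous.ldm.dec: the function names and
the decode() step (which only strips carriage returns) are hypothetical
placeholders. The part that matters is that read() does not return until
the writer closes its end of the pipe, which is what '-close' makes pqact
do after each product.

```python
import sys


def decode(raw: bytes) -> bytes:
    # Placeholder for real decoding logic (an assumption for this sketch):
    # strip the carriage returns that precede each line feed.
    return raw.replace(b"\r", b"")


def main(argv, stdin=None):
    # argv[1] is the output file named after the decoder in the PIPE action.
    if len(argv) != 2:
        sys.stderr.write("usage: decoder OUTPUT_FILE\n")
        return 2
    stream = stdin if stdin is not None else sys.stdin.buffer
    # read() returns only at end-of-file, i.e. once the writer has closed
    # the pipe. If the pipe stays open, the process blocks here.
    raw = stream.read()
    with open(argv[1], "ab") as out:
        out.write(decode(raw))
    return 0
```

To run this as a script, call sys.exit(main(sys.argv)) at the bottom of the
file. A decoder built this way will linger in 'ps' output if the pipe is
never closed, which matches the behavior you describe.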

> ps -ef OUTPUT::
> 
> ldm  7916  6493   0 03:15:19 ?           0:16
> /data1/ldm/NAWIPS-5.9.3/os/sol/bin/dcuair -b 24 -m 16 -d
> /data1/ldm/logs/dcuair
> ldm  7222  6493   0 03:14:01 ?           1:14
> /data1/ldm/NAWIPS-5.9.3/os/sol/bin/dclsfc -d /data1/ldm/logs/dclsfc.log -e
> GEMT
> ldm  8100  6493   0 03:16:12 ?           0:55
> /data1/ldm/NAWIPS-5.9.3/os/sol/bin/dcacft -e
> GEMTBL=/data1/ldm/NAWIPS-5.9.3/gem
> ldm  6522  6493   0 00:06:56 ?           1:26
> /data1/ldm/NAWIPS-5.9.3/os/sol/bin/dcmetr -b 9 -m 72 -s sfmetar_sa.tbl -d
> /data
> ldm 14264  6493   0 10:27:35 ?           0:00 /usr/local/bin/perl
> /usr/local/ldm/lo
> cal/bin/saous.ldm.dec data/surface/decoded
> ldm  1980  1979   0 20:38:19 ?           0:01 /usr/dt/bin/dtscreen
> -mode blank
> ldm 14001  6493   0 10:26:12 ?           0:00 /usr/local/bin/perl
> /usr/local/ldm/lo
> cal/bin/saocn.ldm.dec data/surface/decoded
> ldm 14003  6493   0 10:26:12 ?           0:00 /usr/local/bin/perl
> /usr/local/ldm/lo
> cal/bin/saous.ldm.dec data/surface/decoded

All of the GEMPAK processes listed as running by your 'ps' command are
designed to keep running, so that a single invocation of each decoder
handles all of the data of the type it has been configured to decode.
For example, here is the output of 'ps' on a machine we are using to
decode all of the IDD data with GEMPAK:

ps -aux | grep dc | grep -v grep
ldm       7113  1.4  0.1 30272 4452  ??  S    12:42PM   0:30.64 decoders/dcrdf
ldm        294  0.0  0.1 27192 3808  ??  I    Sun08PM   1:13.93 decoders/dcnldn
ldm        321  0.3  0.1 27528 3564  ??  S    Sun08PM  21:21.76 decoders/dcmetr
ldm        332  0.4  0.2 49616 8356  ??  S    Sun08PM   4:14.56 decoders/dcgrib
ldm        338  0.0  0.2 49616 9196  ??  I    Sun08PM   2:34.20 decoders/dcgrib
ldm        426  0.0  0.1 30224 3320  ??  I    Sun08PM  16:32.07 decoders/dctaf
ldm      83948  0.0  0.2 28340 6672  ??  S    Tue10AM  27:11.88 decoders/dcmsfc
ldm      84835  0.0  0.1 27740 3332  ??  S    Tue10AM   1:38.92 decoders/dcuair
ldm      84836  0.1  0.2 31560 8612  ??  S    Tue10AM   7:52.38 decoders/dcacft
ldm      84845  0.0  0.2 28340 6072  ??  S    Tue10AM   1:00.91 decoders/dcmsfc
ldm       6157  0.0  0.1 27676 4476  ??  I    Wed04AM   8:07.74 decoders/dclsfc
ldm      36455  0.9  0.5 49536 17164  ??  S     9:46AM   0:14.86 decoders/dcgrib
ldm      32501  0.0  0.1 36660 3820  ??  I     1:38PM   0:29.74 decoders/dcgrib
ldm      34681  0.0  0.1 42960 3068  ??  I     1:49PM   0:00.84 decoders/dcgrib
ldm      34755  0.0  0.1 43136 4128  ??  I     1:50PM   0:03.17 decoders/dcgrib
ldm      34760  0.0  0.1 43136 3444  ??  I     1:50PM   0:02.63 decoders/dcgrib
ldm      34779  0.0  0.1 36736 3484  ??  I     1:51PM   0:10.83 decoders/dcgrib
ldm      34841  0.3  0.1 36736 3460  ??  I     1:51PM   0:09.62 decoders/dcgrib
ldm      35690  0.0  0.1 32632 3136  ??  I     1:56PM   0:00.22 decoders/dcffa

You can see from this listing that most of the GEMPAK decoders have been running
for days on our system.
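
If you want to check which processes are tied to a particular parent, the
sketch below filters 'ps -ef' output by parent PID. It is only an
illustration: children_of() is a hypothetical helper, the sample lines are
taken from your listing with the wrapped lines rejoined, and it assumes the
Solaris column order (UID PID PPID C STIME TTY TIME CMD) with a STIME that
contains no spaces. The PID 6493 is the parent shared by most entries in
your listing, which is presumably pqact.

```python
SAMPLE = [
    "ldm  7916  6493   0 03:15:19 ?           0:16 "
    "/data1/ldm/NAWIPS-5.9.3/os/sol/bin/dcuair -b 24 -m 16 -d",
    "ldm  1980  1979   0 20:38:19 ?           0:01 "
    "/usr/dt/bin/dtscreen -mode blank",
]


def children_of(ps_lines, parent_pid):
    """Return (pid, stime, command) for each process whose PPID matches."""
    found = []
    for line in ps_lines:
        # Assumed layout: UID PID PPID C STIME TTY TIME CMD
        fields = line.split(None, 7)
        if len(fields) < 8 or not fields[1].isdigit() or not fields[2].isdigit():
            continue  # header, wrapped continuation line, or other layout
        if int(fields[2]) == parent_pid:
            found.append((int(fields[1]), fields[4], fields[7].strip()))
    return found
```

Calling children_of(SAMPLE, 6493) keeps the dcuair entry and drops the
dtscreen one, whose parent is 1979.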

Cheers,

Tom
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: SWT-100004
Department: Support LDM
Priority: Normal
Status: Closed