
20040826: More Gempak Decoder Problems



Tom,

You can kill the processes if you see them chewing up
CPU. I suspect that the problem you are seeing with dcwcn
is improper use of the new VTEC format by the sending WFO,
so I will be investigating that.

We did find that the new tdl library on Linux can otherwise chew up
the CPU for unknown reasons (especially with binaries not compiled
with the same version of gcc as the kernel).
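In the meantime, the advice above could be automated. Here is a hedged sketch (not part of the original reply) of a cron-able script that kills any dcwcn decoder owned by the ldm user whose accumulated CPU time exceeds a threshold; the 60-minute cutoff and the decision to target only dcwcn are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: reap runaway dcwcn decoders by accumulated CPU time.
# The 60-minute threshold is an assumed value, not a recommendation.
THRESHOLD_MIN=60

ps -u ldm -o pid= -o time= -o comm= | while read pid cputime comm; do
    [ "$comm" = "dcwcn" ] || continue
    # ps reports TIME as [[dd-]hh:]mm:ss; convert it to whole minutes
    mins=$(echo "$cputime" | awk -F'[-:]' \
        '{ if (NF == 4)      print $1*1440 + $2*60 + $3;
           else if (NF == 3) print $1*60 + $2;
           else              print $1 }')
    if [ "$mins" -ge "$THRESHOLD_MIN" ]; then
        echo "killing dcwcn pid $pid (CPU time $cputime)"
        kill "$pid"
    fi
done
```

Run periodically from the ldm user's crontab; it only echoes and signals, so it is safe to dry-run by commenting out the `kill` line.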

Steve Chiswell
Unidata User Support

>From: Tom McDermott <address@hidden>
>Organization: UCAR/Unidata
>Keywords: 200408261610.i7QGAZXn004457

>Hi,
>
>The problem that I'm having with gempak 5.7.2p2 decoders failing to 
>terminate and chewing up the CPU is not confined to dcgrib2 operating on 
>the ocean grids, as I believed and stated in my message of yesterday 
>(8/25).  This morning I came into work to see that 2 rogue dcwcn processes 
>had hijacked the CPU:
>
>last pid:  6639;  load averages:  4.38, 4.44, 4.35     08:50:11
>110 processes: 104 sleeping, 4 running, 1 zombie, 1 on cpu
>CPU states:  0.0% idle, 81.7% user, 18.3% kernel, 0.0% iowait, 0.0% swap
>Memory: 512M real, 13M free, 285M swap in use, 1103M swap free
>
>   PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
>  6915 ldm        1  42    0   26M 1072K run    193:53 19.06% dcwcn
>  3093 ldm        1  42    0   26M 1072K run    247:57 18.94% dcwcn
>17086 ldm        1  51    0  390M  339M run    224:42 18.28% pqact
>  6512 ldm        1  42    0   24M 6344K run      0:15 15.62% dcmsfc
>  6339 ldm        1  41    0   24M 3776K sleep    0:20  7.92% dclsfc
>  6639 ldm        1  58    0 1568K 1352K cpu      0:00  2.05% top
>  6313 ldm        1  58    0   24M 3376K sleep    0:05  1.42% dcuair
>  1915 ldm        1  48    0   24M 3032K sleep    1:59  1.30% dcmetr
>17088 ldm        1  58    0  390M  298M sleep   36:38  0.81% pqbinstats
>  6582 ldm        1  34    0   27M 3600K sleep    0:00  0.75% dctaf
>   286 root       5  58    0 4792K 2096K sleep   44:56  0.60% automountd
>17096 ldm        1  59    0  393M  244M sleep    5:03  0.46% rpc.ldmd
>17095 ldm        1  59    0  390M  218M sleep    7:28  0.34% rpc.ldmd
>17093 ldm        1  59    0  392M  304M sleep   11:33  0.26% rpc.ldmd
>17100 ldm        1  59    0  390M   54M sleep    3:46  0.24% rpc.ldmd
>17099 ldm        1  58    0  390M  267M sleep    4:16  0.15% rpc.ldmd
>18082 ldm        1  58    0  390M  289M sleep    1:47  0.14% rpc.ldmd
>17097 ldm        1  59    0  390M   47M sleep    4:23  0.14% rpc.ldmd
>
>[2361] vortex% ps -ef | grep dcwcn
>      ldm  6915 17086 20 20:05:08 ?       193:57 decoders/dcwcn -d 
>data/gempak/logs/dcwcn.log -e GEMTBL=/weather/GEMPAK5.7.2p2/g
>      ldm  3093 17086 19 18:31:07 ?       248:02 decoders/dcwcn -d 
>data/gempak/logs/dcwcn.log -e GEMTBL=/weather/GEMPAK5.7.2p2/g
>      ldm  6658 10139  0 08:50:33 pts/1    0:00 grep dcwcn
>
>So these 2 processes had been running for more than 12 hours.  Is there 
>any solution to this problem other than restarting the ldm periodically?
>
>Tom
>-----------------------------------------------------------------------------
>Tom McDermott                          Email: address@hidden
>Systems Administrator                  Phone: (585) 395-5718
>Earth Sciences Dept.                   Fax: (585) 395-2416
>SUNY College at Brockport
>
--
NOTE: All email exchanges with Unidata User Support are recorded in the
Unidata inquiry tracking system and then made publicly available
through the web.  If you do not want to have your interactions made
available in this way, you must let us know in each email you send to us.