
20050105: dcmetr core dumps



Ben,

It sounds like your file from yesterday got corrupted, so that each time the
decoder tried to write to it, you saw the crash.

Common problems to look for when a file gets corrupted:
1) The data partition ran out of space. Check the system log messages.
2) pqact is running behind, possibly due to iowait on your disk, such that it
fires up a second copy of the decoder to write to the same data file.
You can issue "kill -USR2 pid" twice to the pqact process running the
decoders to turn up its logging, then look at the ldm log messages and see
what "delay" exists. If you see large delay values, then you would want to
mitigate the IO bottleneck. (Issue a third "kill -USR2 pid" to cycle pqact
back to quiet mode; otherwise you could fill up your log files quickly.)
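
The checks above could be scripted roughly as follows. This is only a sketch
in Python, not part of LDM or GEMPAK: the function names are made up for
illustration, and the log pattern assumes decoder log lines of the form shown
in the quoted dcmetr.log sample below ("[pid] YYMMDD/HHMM [TAG n] ..."). Many
distinct PIDs in the same minute is only a hint; it can mean overlapping
decoders, or a decoder that keeps crashing and being restarted.

```python
import os
import re
import shutil
import signal
from collections import defaultdict

# Matches decoder log lines such as:
#   [22844] 050104/0017 [DC 3] Version 5.7.2p2
LOG_LINE = re.compile(r"^\[(\d+)\]\s+(\d{6}/\d{4})\s+\[")


def free_fraction(path):
    """Return the fraction of free space on the partition holding path."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def decoder_pids_per_minute(lines):
    """Map each log timestamp (YYMMDD/HHMM) to the set of PIDs seen in it."""
    pids = defaultdict(set)
    for line in lines:
        # Tolerate email-style "> " quoting in front of the log line.
        match = LOG_LINE.match(line.lstrip("> "))
        if match:
            pid, stamp = match.groups()
            pids[stamp].add(int(pid))
    return dict(pids)


def busy_minutes(lines, threshold=2):
    """Return the timestamps in which at least `threshold` PIDs logged."""
    return {
        stamp: sorted(pids)
        for stamp, pids in decoder_pids_per_minute(lines).items()
        if len(pids) >= threshold
    }


def cycle_pqact_logging(pqact_pid):
    """Send one SIGUSR2, the same as one "kill -USR2 pid" from the shell."""
    os.kill(pqact_pid, signal.SIGUSR2)
```

For example, free_fraction("data/gempak/surface") reports how full the data
partition is, and busy_minutes(open("data/gempak/logs/dcmetr.log")) lists the
minutes in which more than one decoder process wrote to the log. Both paths
are taken from the pqact entry quoted below and are relative to the LDM home
directory.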

Steve Chiswell
Unidata User Support




>From: Ben Cotton <address@hidden>
>Organization: UCAR/Unidata
>Keywords: 200501052138.j05Lciv2010460

>Howdy,
>
>I'm brand new to Gempak, but one of our professors has requested to use
>it.  The sysadmin-types installed the software, and I set up the pqact
>file, but I've run into a problem.
>
>dcmetr has been dumping core every time it is executed, which adds up
>pretty quickly.  Not knowing anything about Gempak, I'm having a hard time
>figuring out what the issue is.  Here is the pqact entry that invokes
>dcmetr:
>
>WMO    ^S[AP]
>       PIPE    decoders/dcmetr -v 2 -a 500 -m 72 -s sfmetar_sa.tbl
>       -d data/gempak/logs/dcmetr.log
>       -e GEMTBL=/opt/gempak/gempak/tables
>       data/gempak/surface/YYYYMMDD_sao.gem
>
>Here's a sample of the dcmetr.log from when the core dumps were occurring
>
>[22844] 050104/0017 [DC 3] Version 5.7.2p2
>[22844] 050104/0017 [DCMETR 7] 3.3
>[22844] 050104/0017 [DC 2] read 739/102399 bytes strt 0 newstrt 739
>[22852] 050104/0017 [DC 3] Version 5.7.2p2
>[22852] 050104/0017 [DCMETR 7] 3.3
>[22852] 050104/0017 [DC 2] read 105/102399 bytes strt 0 newstrt 105
>[22854] 050104/0017 [DC 3] Version 5.7.2p2
>[22854] 050104/0017 [DCMETR 7] 3.3
>[22854] 050104/0017 [DC 2] read 139/102399 bytes strt 0 newstrt 139
>[22891] 050104/0017 [DC 3] Version 5.7.2p2
>[22891] 050104/0017 [DCMETR 7] 3.3
>[22891] 050104/0017 [DC 2] read 106/102399 bytes strt 0 newstrt 106
>[22892] 050104/0017 [DC 3] Version 5.7.2p2
>[22892] 050104/0017 [DCMETR 7] 3.3
>[22892] 050104/0017 [DC 2] read 128/102399 bytes strt 0 newstrt 128
>
>The file from yesterday caused GARP to dump, according to Dr. Houston.
>However, I limited the coredump size and uncommented the dcmetr entry and
>today's file worked okay.  Any thoughts?
>
>
>Thanks,
>Ben
>
>====================
>Benjamin J. Cotton, KC9FYX
>LDM/Forecast Game Administrator
>Dept of Earth and Atmos. Sci.
>Purdue University
>
>address@hidden
>(765)743-6083   (502)551-5403
>web.ics.purdue.edu/~bcotton
>
>
--
NOTE: All email exchanges with Unidata User Support are recorded in the
Unidata inquiry tracking system and then made publicly available
through the web.  If you do not want to have your interactions made
available in this way, you must let us know in each email you send to us.