Hi Daryl,

> Annoying me again. Previously, I bugged you about slow pipes not
> reporting what process it was:
>
> http://www.unidata.ucar.edu/support/help/MailArchives/ldm/msg04879.html
>
> Thanks for implementing this, hopefully others found it useful.
>
> Now, I am trying to figure out which of my buggy decoders is exiting
> badly. As my logs are filling with this:
>
> Feb 04 16:21:36 mesonet pqact NOTE: child 2155 exited with status 1
> Feb 04 16:26:16 mesonet pqact NOTE: child 8102 exited with status 1
> Feb 04 16:35:39 mesonet pqact NOTE: child 18758 exited with status 1
> Feb 04 16:36:58 mesonet pqact NOTE: child 20265 exited with status 1
>
> So I do the -USR2 to pqact, but the logs I get are not intuitive as to
> which product going to which processor is actually erroring out. The
> child PIDs are not included in the logs, unless I am missing something?
> For example:
>
> Feb 04 14:57:41 mesonet pqact INFO: 115 20080204145112.042 IDS|DDPLUS 119265941 SPCN46 CWAO 041446
> Feb 04 14:57:41 mesonet pqact INFO: pipe: dcmetr -b 9 -m 72 -s /mesonet/TABLES/awos.stns -d logs/dcmetr_awos.log -a 0 /mesonet/data/gempak/awos/YYMMDD_awos.gem
> Feb 04 14:57:41 mesonet pqact INFO: pipe: dcmetr -b 9 -m 72 -s /mesonet/TABLES/mesonet4.stns -d logs/dcmetr_meso1.log -a 0 /mesonet/data/gempak/meso/YYMMDD_meso.gem
> Feb 04 14:57:41 mesonet pqact INFO: pipe: dcmetr -b 9 -m 72 -s /mesonet/TABLES/asos.stns -d logs/dcmetr_asos.log -a 0 /mesonet/data/gempak/asos/YYMMDD_asos.gem
> Feb 04 14:57:41 mesonet pqact NOTE: child 27014 exited with status 1
>
> Looking at the source (at least trying to), I see a case where child
> exiting with some status may not print out the process name. I tried to
> diagnose how this happens, but only confused myself.
>
> Any comments on this?

Because "pqact" printed no command line for that child, the process was started either by an EXEC entry in the "pqact" configuration file or by a PIPE entry whose pipe "pqact" had since closed. When "pqact" needs a file descriptor for a new process, it closes the pipe that has gone the longest without being written to. Closing a pipe removes the associated entry from an internal list, so the command line is lost.

Can you have your decoders write a "Starting up" message to the LDM log file? This would allow you to match up the PIDs.

> thanks!
> daryl

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: WDX-973084
Department: Support LDM
Priority: Normal
Status: On Hold

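
One way to try the "Starting up" suggestion without modifying the decoders themselves is a small wrapper that logs its own PID and command line and then replaces itself with the decoder. The following is only a sketch under stated assumptions: the wrapper and its names (startup_message, log_startup) are hypothetical, and it assumes the LDM log is fed from syslog facility local0 on the host, which should be checked against the local syslog configuration. Because exec keeps the same process ID and standard input, the logged PID should be the one "pqact" later reports in its "child ... exited" note.

```python
#!/usr/bin/env python3
"""Hypothetical wrapper: log a startup line with our PID, then exec the decoder."""
import os
import sys


def startup_message(pid, argv):
    """Build the line to log: the PID plus the decoder's command line."""
    return "Starting up: pid=%d cmd=%s" % (pid, " ".join(argv))


def log_startup(message):
    """Send the message to syslog.

    Assumption: the LDM log on this host collects facility local0.
    The import is local so the module still loads where syslog is absent.
    """
    import syslog
    syslog.openlog("decoder_wrapper", syslog.LOG_PID, syslog.LOG_LOCAL0)
    syslog.syslog(syslog.LOG_NOTICE, message)
    syslog.closelog()


def main(argv):
    if len(argv) < 2:
        sys.stderr.write("usage: %s decoder [args...]\n" % argv[0])
        return 2
    decoder_argv = argv[1:]
    log_startup(startup_message(os.getpid(), decoder_argv))
    # exec replaces this process image; the PID and the pipe on stdin are kept,
    # so the decoder runs under the PID that was just logged.
    os.execvp(decoder_argv[0], decoder_argv)


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

To use it, the PIPE or EXEC action would name the wrapper first, followed by the original decoder command and its arguments unchanged. A matching "Starting up: pid=27014 ..." line in the log would then identify which decoder invocation exited with status 1.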
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata inquiry tracking system and then made publicly available through the web. If you do not want to have your interactions made available in this way, you must let us know in each email you send to us.