[LDM #OHH-743555]: LDM crash due to child seg fault



Hi Ryan,

> We ran into a problem this morning.  One of the child processes that our
> LDM executes was seg faulting.  Somehow that was leading to LDM shutting
> down.  I'm attaching a log file that captures one of the events.  Search
> for "comm_client" in the log file.

The relevant section of the log file is this:

May 17 12:56:58 awcdas1 rpc.ldmd[12327] NOTE: child 12331 terminated by signal 11: /awc/local/bin/hawrap eth0:0 comm_client -B -s AWC -v 1 -p 53004 -D /data/transmit/socket -S /data/sent/socket -F /data/sent/socket/fail -P /awc/ops/ldmdas/logs -n 205.156.51.36 -n 205.156.51.37
May 17 12:56:58 awcdas1 rpc.ldmd[12327] NOTE: Killing (SIGTERM) process group

The top-level LDM server is designed to terminate the entire LDM system if any 
EXEC-ed process terminates abnormally, as child 12331 did above when it was 
killed by signal 11 (SIGSEGV). This is a feature and not a bug.

In your case, the process "/awc/local/bin/hawrap eth0:0 comm_client -B -s AWC 
-v 1 -p 53004 -D /data/transmit/socket -S /data/sent/socket -F 
/data/sent/socket/fail -P /awc/ops/ldmdas/logs -n 205.156.51.36 -n 
205.156.51.37" terminated abnormally due to a segmentation violation. I 
recommend either 1) fixing the program so that it doesn't seg-fault 
(valgrind(1) is great for this, as is a debugger) or 2) wrapping the process in 
a script that either restarts the program or terminates normally.
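
As a rough illustration of option 2, here is a minimal sketch of such a wrapper 
in Python. It is not part of the LDM; the function name run_with_restarts, the 
restart limit, and the delay are all hypothetical choices, and it assumes that 
an exit status of 0 from the wrapper is what the LDM server treats as normal 
termination. You would put the wrapper in front of the existing command line in 
the EXEC entry.

```python
#!/usr/bin/env python3
"""Hypothetical wrapper: restart a child that dies abnormally, then exit 0."""
import subprocess
import sys
import time


def run_with_restarts(argv, max_restarts=5, delay=1.0):
    """Run argv, restarting it after each abnormal termination.

    Gives up after max_restarts restarts. Returns the number of restarts
    that were performed (0 if the first run exited with status 0).
    """
    restarts = 0
    while True:
        status = subprocess.call(argv)
        if status == 0:
            return restarts
        # On POSIX a negative status means "killed by signal -status",
        # e.g. -11 for a segmentation violation (SIGSEGV).
        reason = f"signal {-status}" if status < 0 else f"exit status {status}"
        print(f"{argv[0]}: terminated by {reason}", file=sys.stderr)
        if restarts >= max_restarts:
            return restarts
        restarts += 1
        time.sleep(delay)  # avoid a tight crash/restart loop


# Only act as a command-line wrapper when a command was actually given.
if __name__ == "__main__" and len(sys.argv) > 1:
    run_with_restarts(sys.argv[1:])
    # Exit normally regardless of how the wrapped program ended
    # (assumption: status 0 is what the parent considers normal).
    sys.exit(0)
```

The restart limit is there so that a program that crashes immediately on every 
start does not loop forever; whether to give up quietly, as this sketch does, or 
to alert someone is a local policy decision.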

No one from AWC coming here for the LDM workshop?

> Best regards,
> 
> Ryan Solomon
> Aviation Weather Center

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: OHH-743555
Department: Support LDM
Priority: Normal
Status: Closed


NOTE: All email exchanges with Unidata User Support are recorded in the Unidata inquiry tracking system and then made publicly available through the web. If you do not want to have your interactions made available in this way, you must let us know in each email you send to us.