[TIGGE #EVX-684652]: ldm crash


> This is what is printed in the log:
> Oct 18 20:24:03 tigge-ldm rpc.ldmd[28636] NOTE: Starting Up (version:
>; built: May 11 2006 19:06:12)

So you're running version of the LDM, eh?  Version 6.4.6 (which isn't 
out yet) has a modification that might solve the problem of the previous crash. 
 I'll see about releasing it to you.

> One hour ago we had another core dump. I attach call stack:
> ldm@tigge-ldm:~> gdb -c core.20061018  bin/rpc.ldmd
> GNU gdb 6.3
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "x86_64-suse-linux"...Using host libthread_db
> library "/lib64/tls/libthread_db.so.1".
> Core was generated by `rpc.ldmd -P 388 -v -q /usr/local/ldm/data/ldm.pq
> /usr/local/ldm/etc/ldmd.conf'.
> Program terminated with signal 6, Aborted.
> Reading symbols from /lib64/libm.so.6...done.
> Loaded symbols for /lib64/libm.so.6
> Reading symbols from /lib64/libc.so.6...done.
> Loaded symbols for /lib64/libc.so.6
> Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
> Loaded symbols for /lib64/ld-linux-x86-64.so.2
> Reading symbols from /lib64/libnss_files.so.2...done.
> Loaded symbols for /lib64/libnss_files.so.2
> Reading symbols from /lib64/libnss_dns.so.2...done.
> Loaded symbols for /lib64/libnss_dns.so.2
> Reading symbols from /lib64/libresolv.so.2...done.
> Loaded symbols for /lib64/libresolv.so.2
> #0  0x00002aaaaad484f9 in kill () from /lib64/libc.so.6
> (gdb) where
> #0  0x00002aaaaad484f9 in kill () from /lib64/libc.so.6
> #1  0x00002aaaaad4959d in abort () from /lib64/libc.so.6
> #2  0x00002aaaaad7c7be in __libc_message () from /lib64/libc.so.6
> #3  0x00002aaaaad8176c in malloc_printerr () from /lib64/libc.so.6
> #4  0x00002aaaaad8225a in free () from /lib64/libc.so.6
> #5  0x00002aaaaad9b085 in tzset_internal () from /lib64/libc.so.6
> #6  0x00002aaaaad9bab1 in __tz_convert () from /lib64/libc.so.6
> #7  0x0000000000415477 in vulog (pri=134, fmt=0x42d4fc "SIGCHLD",
> args=0x7fffffe90db0) at ulog.c:461
> #8  0x000000000041604b in uinfo (fmt=0x42d4fc "SIGCHLD") at ulog.c:1015
> #9  0x000000000040b62f in signal_handler (sig=17) at ldmd.c:283
> #10 <signal handler called>
> #11 0x00002aaaaad88722 in strlen () from /lib64/libc.so.6
> #12 0x00002aaaaad88126 in strdup () from /lib64/libc.so.6
> #13 0x00002aaaaad9ae3a in tzset_internal () from /lib64/libc.so.6
> #14 0x00002aaaaad9ba0f in tzset () from /lib64/libc.so.6
> #15 0x00002aaaaada0424 in strftime_l () from /lib64/libc.so.6
> #16 0x00000000004154fe in vulog (pri=134, fmt=0x42d438 "child %d exited
> with status %d", args=0x7fffffe91d80)
> at ulog.c:468
> #17 0x000000000041604b in uinfo (fmt=0x42d438 "child %d exited with
> status %d") at ulog.c:1015
> #18 0x000000000040b421 in reap (pid=-1, options=1) at ldmd.c:151
> #19 0x000000000040bf58 in sock_svc (sock=0) at ldmd.c:721
> #20 0x000000000040c5a2 in main (ac=7, av=0x7fffffe92098) at ldmd.c:981

It appears from the stack trace that the LDM server was logging the fact that a
child process had terminated when it received another (or possibly the same)
SIGCHLD signal and re-entered the logging module to write another message.
The interrupted call was inside tzset_internal() (frame #13); the signal
handler's own logging call reached tzset_internal() again (frame #5), and
free() aborted there.  This is a different crash from the previous one.  It
could be that the logging module can't handle that situation.  I'll check.

Steve Emmerson

Ticket Details
Ticket ID: EVX-684652
Department: Support IDD TIGGE
Priority: Normal
Status: On Hold