
[Support #CYR-432236]: NWS/NCEP/AWC LDM 6.7.0 woes



Mick,

> I'm running into some LDM issues and Marc Singer recommended I emailed
> you with my findings.
> 
> First, here are some system facts:
> 
> CPU:  2 x dual-core AMD Opteron 2216 @ 2.4 GHz
> MEM:  6 GiB
> OS:   RHEL5.3 x86_64
> Kernel:       2.6.18-128.1.1.el5
> GCC:  gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-44)
> LDM:  6.7.0

Nice system.  The LDM should run on it.

> I'm trying to build LDM 6.7.0, and while the build process itself had an
> issue ($LDMHOME/$VDIR/src/server - complaint about undefined
> 'ldm_version'), I was able to get past that:
> 
> ==============8<=========================
> --- Makefile.old        2009-03-25 14:31:40.000000000 +0000
> +++ Makefile    2009-03-25 12:27:35.000000000 +0000
> @@ -4,8 +4,9 @@
> #
> include ../macros.make
> 
> -INCLUDES = -I../config -I../misc -I../ulog -I../protocol -I../pq
> +INCLUDES = -I ../ -I../config -I../misc -I../ulog -I../protocol -I../pq
> TAG_SRCS       = \
> +       ../*.c ../*.h \
> ../misc/*.c ../misc/*.h \
> ../ulog/*.c ../ulog/*.h \
> ../protocol/*.c ../protocol/*.h \
> ==============8<=========================
> 
> However, when trying to create a queue (through ldmadmin or from
> commandline directly) I get a SIGSEGV. Here's an strace:
> 
> ==============8<=========================
> address@hidden pqcreate]$ strace ./pqcreate -v -f -s 100 -S 10 -q
> $HOME/data/ldm.pq
> execve("./pqcreate", ["./pqcreate", "-v", "-f", "-s", "100", "-S", "10",
> "-q", "/usr/local/ldm/data/ldm.pq"], [/* 22 vars */]) = 0
> brk(0)                                  = 0xb86d000
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
> = 0x2b321f397000
> uname({sys="Linux", node="server_101.eee", ...}) = 0
> access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or
> directory)
> open("/etc/ld.so.cache", O_RDONLY)      = 3
> fstat(3, {st_mode=S_IFREG|0644, st_size=35594, ...}) = 0
> mmap(NULL, 35594, PROT_READ, MAP_PRIVATE, 3, 0) = 0x2b321f398000
> close(3)                                = 0
> open("/lib64/libm.so.6", O_RDONLY)      = 3
> read(3,
> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`>\300\0239\0\0\0"...,
> 832) = 832
> fstat(3, {st_mode=S_IFREG|0755, st_size=615136, ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
> = 0x2b321f3a1000
> mmap(0x3913c00000, 2629848, PROT_READ|PROT_EXEC,
> MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3913c00000
> mprotect(0x3913c82000, 2093056, PROT_NONE) = 0
> mmap(0x3913e81000, 8192, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x81000) = 0x3913e81000
> close(3)                                = 0
> open("/lib64/libc.so.6", O_RDONLY)      = 3
> read(3,
> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\332\201\0229\0\0\0"...,
> 832) = 832
> fstat(3, {st_mode=S_IFREG|0755, st_size=1713088, ...}) = 0
> mmap(0x3912800000, 3494168, PROT_READ|PROT_EXEC,
> MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3912800000
> mprotect(0x391294c000, 2097152, PROT_NONE) = 0
> mmap(0x3912b4c000, 20480, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x14c000) = 0x3912b4c000
> mmap(0x3912b51000, 16664, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x3912b51000
> close(3)                                = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
> = 0x2b321f3a2000
> arch_prctl(ARCH_SET_FS, 0x2b321f3a27c0) = 0
> mprotect(0x3912b4c000, 16384, PROT_READ) = 0
> mprotect(0x3913e81000, 4096, PROT_READ) = 0
> mprotect(0x391261b000, 4096, PROT_READ) = 0
> munmap(0x2b321f398000, 35594)           = 0
> fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
> = 0x2b321f398000
> write(1, "pqfname=/usr/local/ldm/data/ldm."...,
> 47pqfname=/usr/local/ldm/data/ldm.pq, pflags=129
> ) = 47
> write(2, "Creating /usr/local/ldm/data/ldm"..., 61Creating
> /usr/local/ldm/data/ldm.pq, 100 bytes, 10 products.
> ) = 61
> brk(0)                                  = 0xb86d000
> brk(0xb88e000)                          = 0xb88e000
> open("/usr/local/ldm/data/ldm.pq", O_RDWR|O_CREAT|O_EXCL|O_TRUNC, 0666) = 3
> fcntl(3, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=0, len=4096}) = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> lseek(3, 12284, SEEK_SET)               = 12284
> write(3, "\0\0\0\0", 4)                 = 4
> mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x2b321f399000
> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> +++ killed by SIGSEGV +++
> ==============8<=========================
> 
> The result is the same regardless of if the filesize is 100 B or 500
> MiB. It looks like that mmap makes the kernel unhappy, but I'm not
> skilled enough to see why.

I am skilled enough, and a SIGSEGV shouldn't have been issued ---
at least not according to the strace(1) output (thanks for that).

Firstly, is /usr/local/ldm/data local to the system?  Is it on a
RAID?
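
One way to answer both questions from the shell is sketched below.  It
assumes GNU coreutils stat(1) (as on RHEL); the QUEUE_DIR variable and the
fs_type helper are names made up for this example, and /proc/mdstat only
reveals Linux software RAID, not a hardware controller.

```shell
#!/bin/sh
# Sketch: report what kind of filesystem holds the queue directory.
# Assumes GNU stat(1); QUEUE_DIR and fs_type are illustrative names.

# Print the filesystem type (e.g. a local type, or "nfs") of a path.
fs_type() {
    stat -f -c %T "$1"
}

QUEUE_DIR=${QUEUE_DIR:-/usr/local/ldm/data}

if [ -d "$QUEUE_DIR" ]; then
    echo "$QUEUE_DIR is on: $(fs_type "$QUEUE_DIR")"
    # First column is the backing device, or server:/export for NFS.
    df -P "$QUEUE_DIR"
fi

# Linux software RAID arrays, if any, are listed here.
if [ -r /proc/mdstat ]; then
    cat /proc/mdstat
fi
```

A network filesystem type in the first line of output would mean the
directory is not local to the system.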

It's possible that the SIGSEGV didn't occur within the mmap(2) call
but just after its return.  Would you please execute the same
pqcreate(1) command in a debugger and see where the SIGSEGV occurs?
It would be best if debugging were enabled by 1) executing "make
distclean" in the top-level source-directory; 2) setting the
environment variable CFLAGS to "-g"; 3) executing the "configure"
script; and 4) executing "make".

When that's done, cd(1) into the "pqcreate" directory and execute
the commands

    rm /usr/local/ldm/data/ldm.pq
    gdb pqcreate

Inside gdb(1), execute the command

    run -s 400M -q /usr/local/ldm/data/ldm.pq

Send me a stack trace when it receives a SIGSEGV.
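
If a non-interactive run is more convenient, the rebuild and debugger
steps above could be scripted roughly as follows.  This is only a sketch:
SRC_DIR (the top-level source directory), QUEUE, DRY_RUN, and the two
function names are assumptions made for this example, and it relies on
the gdb(1) options -batch, -ex, and --args being available in the
installed version.  By default it only prints the commands; set
DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute.
# SRC_DIR, QUEUE, and DRY_RUN are assumptions; adjust to the local setup.
SRC_DIR=${SRC_DIR:-.}                       # top-level LDM source directory
QUEUE=${QUEUE:-/usr/local/ldm/data/ldm.pq}  # queue path from the report
DRY_RUN=${DRY_RUN:-1}

# Echo a command line, and run it unless this is a dry run.
step() {
    echo "+ $*"
    if [ "$DRY_RUN" = 0 ]; then "$@"; fi
}

# Rebuild with debugging symbols, then run pqcreate under gdb and
# print a backtrace once the program stops.
debug_backtrace() {
    step cd "$SRC_DIR" || return
    step make distclean                  # may fail on a pristine tree
    step env CFLAGS=-g ./configure || return
    step make || return
    step cd pqcreate || return
    step rm -f "$QUEUE"
    step gdb -batch -ex run -ex bt --args ./pqcreate -s 400M -q "$QUEUE"
}

debug_backtrace
```

The backtrace printed by the last command is the stack trace requested
above.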

> The corresponding lines (187,188) in
> pqcreate.c are:
> 
> ==============8<=========================
> errnum = pq_create(pqfname, 0666, pflags,
> 0, initialsz, nproducts, &pq);
> ==============8<=========================
> 
> I did not save the configure/make logfiles thinking it was probably an
> issue past that, but if you want me to send those to you, I can.
> 
> Thank you in advance for any insight you can offer.


Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: CYR-432236
Department: Support LDM
Priority: Normal
Status: Closed

