
[LDM #VXR-367622]: ldm segfault

Hi Karen!

> I have an unusual situation here.  Something I've not seen before anyway...
> I have 6 ldm servers right now.  Three sets of redundant pairs with
> different data on them.  Two are older hardware, while 4 are relatively
> new hardware.  All are running 6.8.1 on RedHat Linux 5 and all are
> running the queue out of RAM disk.  They've been running since October
> with no problems.
> This past weekend I had a very unusual occurrence.  Two of the servers
> (with the newer hardware) that have duplicate feeds, both had rpc.ldmd
> segfault within 1 minute of each other.

Were the two servers also feeding from each other?

> All ldm processes exited and
> the queues were not zeroed out.  I ran ldmadmin clean, remade my queues
> and restarted and everything was good until Monday.  Monday morning one
> of the servers segfaulted, and the other followed suit, but not until
> several hours later.   Surprisingly I didn't get any core dumps,
> although I'm not entirely sure why at this time.
> These systems don't run a pqact, and with the exception of a monitoring
> program (hobbit--which informed me that rpc.ldmd had stopped!)
> everything running on them is stock.  Basically ldm is their entire job.
> Not sure if it's important, but as for data feeds, they get a lot of
> different feed types, including NEXRAD2, NNEXRAD, FSL2, FSL4, FSL5, WMO,
> DDS, HDS, IDS and some EXP (mesonet and refractivity).

I think we get all of those feeds here (except, probably, for some of the EXP)
and we saw no problems. I haven't heard of any other LDMs segfaulting recently.

> I'm just wondering if you guys have any ideas or insight?

Besides the LDM setups, what else do the two systems have in common? OS? 
Version? Hardware?
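
One way to answer that, and to see whether core files are even permitted, is
to run a small script on each machine as the LDM user and compare the output.
What follows is only a sketch under stated assumptions: it is not part of the
LDM, the function names (collect_host_facts, diff_facts) are made up for
illustration, and it reports only what Python's standard platform and resource
modules expose. A soft core-size limit of 0 would explain missing core files,
though other system settings can suppress them as well.

```python
import platform
import resource


def collect_host_facts():
    """Return a dict of basic facts useful for comparing two hosts.

    Hypothetical helper for illustration; not an LDM utility.
    """
    # Core-file size limits for the current process (inherited by children).
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    return {
        "hostname": platform.node(),
        "os": platform.system(),
        "kernel": platform.release(),
        "machine": platform.machine(),
        "core_soft_limit": soft,
        "core_hard_limit": hard,
        # A soft limit of 0 means this limit prevents core files from being written.
        "core_dumps_enabled": soft != 0,
    }


def diff_facts(a, b):
    """Return {key: (value_a, value_b)} for every key whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    for key, value in collect_host_facts().items():
        print(f"{key}: {value}")
```

Saving the output from each server and passing the two dicts to diff_facts
shows what differs between them; whatever does not appear in the result is
what the machines have in common.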

> -------------------------------------------
> There are 2 kinds of people in the world:
> 1) Those who can extrapolate from incomplete data.
> -------------------------------------------
> address@hidden
> Phone:  405-325-6982
> Cell: 405-834-8559
> INDUS Corporation
> National Severe Storms Laboratory

Steve Emmerson

Ticket Details
Ticket ID: VXR-367622
Department: Support LDM
Priority: Normal
Status: Closed

NOTE: All email exchanges with Unidata User Support are recorded in the Unidata inquiry tracking system and then made publicly available through the web. If you do not want to have your interactions made available in this way, you must let us know in each email you send to us.