[LDM #VOM-137597]: Use of pqexpire causing ldm to 'lock up'?



> The 'process' is that the user runs a pqinsert at one end of the
> 'pipeline' to insert a product into the LDM, which feeds down into
> several other LDMs that process (generate images, store, foreward,
> decode, etc) the product.  The problem is that within the course of
> 10-15 minutes the user could be required to insert a product with the
> same content twice.  They don't have access to the other systems
> downstream (firewalls, network separation, etc) so the LDM is the means
> used to move the data around and have it processed 'real-time'.

Could you add a counter or timestamp to the beginning or end of the
data-product so that it is unique?  Others have done this for identical
products, and this solution works well with the LDM.
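
As a rough illustration of that suggestion, here is a minimal Python sketch
that appends a timestamp and a counter to a product before it is handed to
pqinsert.  It assumes that duplicate detection keys on an MD5 signature of
the product's bytes.  The trailer format ("#uniq ...") and the names
make_unique() and strip_unique() are hypothetical, chosen only for this
example; a downstream consumer would need to strip or ignore the trailer.

```python
import hashlib
import itertools
import time
from typing import Optional

# Per-process counter; combined with the timestamp so that two inserts in
# the same clock tick still differ. (Hypothetical scheme for illustration.)
_counter = itertools.count()

_MARKER = b"\n#uniq "


def make_unique(payload: bytes, now: Optional[float] = None) -> bytes:
    """Return payload with a trailer that makes its bytes unique."""
    stamp = time.time() if now is None else now
    trailer = _MARKER + b"%.6f %d\n" % (stamp, next(_counter))
    return payload + trailer


def strip_unique(data: bytes) -> bytes:
    """Remove the trailer added by make_unique(), if present."""
    head, sep, _ = data.rpartition(_MARKER)
    return head if sep else data


if __name__ == "__main__":
    first = make_unique(b"same content")
    second = make_unique(b"same content")
    # The two products now hash differently even though the original
    # content is identical.
    print(hashlib.md5(first).hexdigest() != hashlib.md5(second).hexdigest())
    print(strip_unique(first) == b"same content")
```

The result of make_unique() would be written to a file and inserted in the
usual way; the counter only guards against collisions within one process, so
several concurrent inserters would still rely on the timestamp differing.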

> Yes, running pqexpire with the -i 0 switch.  The products that cause an
> expire are inserted within a minute (or less) of each other.  The expire
> is on a specific product (based on the LDM id), no wildcards.  There is
> one account inserting, but could have more than one process running at
> once performing the insert upstream.  The server that is locking up is
> at least 1 LDM server away from the point where the product is inserted
> into the queue. The LDM is 6.2.1 (no, haven't had a chance to get to
> 6.4.5 yet) with an 800Meg queue.  All systems are either RedHat
> Enterprise 3 or RedHat Enterprise 4 running on either dual Xeon
> 2.4Ghz/2Gig or quad Xeon 2.8Ghz/4Gig servers. The configure.log is no
> longer available as the source area was cleaned up after the install.
> 
> So far, removing the -w switch seems to have resolved the issue.  I
> realize this is very timing specific.  I was mostly trying to find out
> if there could be a deadlock issue when running pqexpire (or more than
> one) with the -w while other processes are working with the queue.

pqexpire(1) will print an ERROR level message if a file-deadlock would occur in 
attempting to write-lock a region of the product-queue.
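
For readers unfamiliar with how such a deadlock is reported, the following
Python sketch shows the general POSIX mechanism: a blocking request for a
write-lock on a file region fails with EDEADLK when the kernel detects that
waiting would deadlock.  This is not the pqexpire(1) code; the function
names and the log message are assumptions made for the example.

```python
import errno
import fcntl
import logging
import os

log = logging.getLogger("lock_sketch")


def write_lock_region(fd: int, offset: int, length: int) -> bool:
    """Block until an exclusive lock on [offset, offset + length) is held.

    Returns True on success and False if the kernel reports that waiting
    for the lock would deadlock (EDEADLK).  Other errors propagate.
    """
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX, length, offset, os.SEEK_SET)
        return True
    except OSError as exc:
        if exc.errno == errno.EDEADLK:
            log.error("write-lock of %d bytes at offset %d would deadlock",
                      length, offset)
            return False
        raise


def unlock_region(fd: int, offset: int, length: int) -> None:
    """Release a lock taken by write_lock_region()."""
    fcntl.lockf(fd, fcntl.LOCK_UN, length, offset, os.SEEK_SET)
```

Kernel deadlock detection of this kind only covers cycles among processes
waiting on file locks, so a hang with no such ERROR message would point at
something other than a lock cycle.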

I'm examining the pqexpire(1) code.  I'll let you know if I discover anything.

Let me know whether you make the inserted products unique, or whether running
pqexpire(1) without the "-w" option doesn't work for you.

> Thanks for your time.
> 
> --
> Steven Danz
> Senior Software Development Engineer
> Aviation Weather Center (NOAA/NWS/NCEP)
> 7220 NW 101st Terrace, Room 101
> Kansas City, MO 64153-2371
> 
> Email: address@hidden
> Phone: 816.584.7251
> Fax:   816.880.0650
> URL:   http://aviationweather.gov/
> 
> The opinions expressed in this message do not necessarily reflect those
> of the National Weather Service, or the Aviation Weather Center.
> 
> 

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: VOM-137597
Department: Support LDM
Priority: Normal
Status: On Hold