
[LDM #GRA-463091]: pq_del_oldest error



Patrick,

> NCEP runs several instances of ldm-6.7.0 on AIX 5.3 on several nodes of its 
> supercomputer.  One configuration consists of two LDMs running
> on the same node under different user accounts; each LDM having different 
> software inserting different products into its queue.   Twice now
> under this configuration, once yesterday and once on April 7, we have seen 
> these errors:
> 
> pqinsert ERROR: pq_del_oldest: signature
> 00000000000000070000000000001000: Not Found
> Error: unable to send file to local queue
> 
> ERROR: pq_del_oldest: signature 00000000000000070000000000001000: Not Found

Do these two error messages refer to two separate instances?  If so, the 
chance of two data-products having the same signature (MD5 checksum) by 
coincidence is one in 2^128 (i.e., not likely).

Those signatures also look odd: nearly every hexadecimal digit is zero, which 
would be very unusual for an MD5 checksum.
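
As a rough illustration of why the reported signature stands out, here is a 
minimal Python sketch (not LDM code) that computes an MD5 digest for a 
made-up payload and compares its digits with the signature from the error 
message.  The payload and the helper names are hypothetical and exist only 
for this example.

```python
import hashlib


def md5_signature(data: bytes) -> str:
    """Return the MD5 digest of `data` as lowercase hex digits (128 bits)."""
    return hashlib.md5(data).hexdigest()


def zero_digit_count(signature: str) -> int:
    """Count how many hex digits of `signature` are '0'."""
    return signature.count("0")


# Signature copied from the error message in this ticket.
reported = "00000000000000070000000000001000"

# Hypothetical payload, used only to show what a typical digest looks like.
sample = md5_signature(b"hypothetical product payload")

print("reported:", reported, "zero digits:", zero_digit_count(reported))
print("sample:  ", sample, "zero digits:", zero_digit_count(sample))

# Number of possible 128-bit signatures, i.e. the 2^128 mentioned above.
print("possible signatures:", 2 ** 128)
```

A digest of real data normally has its hex digits spread fairly evenly over 
0-9 and a-f, so a value with only two non-zero digits is far more likely to 
come from something other than an MD5 computation over product data.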

Are you using the "-i" option of pqinsert(1)?

> Both errors occurred with the same LDM on the same node.  As mentioned, we 
> run LDM in several other places and have for quite a while
> (years) and have never before seen this error.  Also, as mentioned we run 
> another LDM concurrently on the node and it does not produce this
> error.
> 
> From reading the few very old support emails regarding this error, it seems 
> that it suggests a disk or memory error.  Before we run this
> down with IBM, is there anything new you can tell us about why this 
> (seemingly rare) error may have occurred twice, or what we might need to
> look at besides hardware?

I can't think of anything other than hardware.  Our systems administrator might 
have some ideas.

> Let me know if you need more details.
> 
> Thanks,
> Patrick
> NCEP Central Operations

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: GRA-463091
Department: Support LDM
Priority: Normal
Status: Closed