
20040920: Possible pqact issue in LDM?



Steven,

>Date: Mon, 20 Sep 2004 10:12:07 -0500
>From: "Steven Danz" <address@hidden>
>Organization: Aviation Weather Center
>To: Steve Emmerson <address@hidden>
>Subject: Re: 20040918: Possible pqact issue in LDM?
>Keywords: 200409091803.i89I3pnJ023109

The above message contained the following:

> Sure... the story goes something like this. 
> 
> AWC has a Northrop Grumman NOAAPort receiver system, which is pretty
> much just a stripped down AWIPS CP.  On this system, we have some
> software from FSL that can talk to the AWIPS CP software and for each
> product received on the NOAAPort, insert it into the LDM queue.  So,
> we also have LDM running on this system, configured as a pure data
> source (no 'request' lines in ldmd.conf) to feed the NOAAPort data to
> other systems in the center.  Now, to make a record of the time that
> each product reaches the center on NOAAPort, the LDM on the receiver
> has a small pqact.conf that, for each AWC product, EXEC's a script to
> put a one-line product in the queue that contains the current wall
> clock time, the server name, product name, etc. to give us a record of
> the time that the product arrived from NOAAPort.
> 
> Now, downstream from the NOAAPort receiver, there is an LDM client
> with a pqact configured that stores all these 'receive notifications'
> into a file by product, by day.  We also keep a similar log of every
> transmit of every product from the center.  Then, we have a script
> that takes the send log entries and matches them up with the receive
> log entries to determine delay and to monitor if the NWSTG drops a
> product.  Whenever there is a missing receive entry that is 'too old',
> an alarm goes up on our monitoring software (Nagios is the package we
> are using). So, when there is an alarm on Nagios (and I catch it in
> time before things are flushed from the queue) I quickly log into the
> NOAAPort receiver to check:
> 1) is the product in the queue
> 2) is the receive notice in the queue
> 3) is there a log entry from the receive notice script that it
>    attempted to put a notice in the queue
> 4) and, when I was running pqact -v, was there an entry showing that
>    pqact saw the product go by
> 
> So far, each time a problem has been reported, 1) has been fine (the
> product was in the queue), but 2) was not, and there was no entry
> in 3) indicating that the script had attempted to run.  When I was
> running 'pqact -v' over the weekend I noticed that there were 'chunks'
> of headers missing when comparing the list of headers to what 'pqcat'
> displayed in the queue.  For example, looking over about 40 minutes
> of the queue, there were about 255 products in 13 'chunks' that pqcat
> listed in the queue, that the 'pqact -v' didn't report seeing.
> 
> Probably too much detail :-)

Not at all.
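
The comparison you describe, between the headers that 'pqact -v' reported
and what pqcat(1) listed, can be mechanized once both lists are reduced to
one product-identifier per line.  The following is only a rough sketch: the
function name and the two input files are hypothetical, and extracting the
identifiers from the log output is left out because it depends on your log
format.

```shell
#!/bin/sh
# Print product-identifiers that appear in the first file but not the second.
# Both arguments are hypothetical files holding one identifier per line.
missed_products() {
    in_queue=$1    # identifiers listed by pqcat(1)
    seen=$2        # identifiers logged by "pqact -v"
    tmp_a=$(mktemp) || return 1
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 1; }
    # comm(1) needs both inputs sorted under the same collation.
    LC_ALL=C sort -u "$in_queue" >"$tmp_a"
    LC_ALL=C sort -u "$seen" >"$tmp_b"
    # -23 suppresses lines unique to file 2 and lines common to both,
    # leaving only the identifiers that pqact never reported.
    LC_ALL=C comm -23 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}
```

It would be invoked as, for example,

    missed_products pqcat-ids.txt pqact-ids.txt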

Are you checking the product-queue too soon after being notified?  Is
the missed data-product later acted-upon by pqact(1), indicating that it
was merely delayed?

Do you have a saved product-queue that pqcat(1) indicates contains
data-products that pqact(1) missed?

If so, does manually executing pqact(1) on this product-queue find the
"missed" data-products?  For example:

    echo '<<feedtype>>  (<<pattern>>)   EXEC    -wait   echo \1' >conf
    pqact -vl- -o <<time>> -q <<pq>> conf

where
    <<feedtype>>        Is the feedtype of a data-product that pqact(1)
                        missed.
    <<pattern>>         Is the pattern of a data-product that pqact(1)
                        missed.
    <<time>>            Is the age of the oldest data-product in the
                        product-queue in seconds (use pqmon(1) to
                        determine this).
    <<pq>>              Is the pathname of the saved product-queue.
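
Because pqact(1) configuration entries use tab characters between fields,
and tabs are easily turned into spaces when a command is copied out of an
email, it may be safer to generate the test entry with printf(1).  This is a
sketch under that assumption; the function name is hypothetical, and the
feedtype and pattern shown are placeholders rather than values from your
system.

```shell
#!/bin/sh
# Write a one-entry pqact configuration file with tab-separated fields.
# Arguments: feedtype, pattern (without the outer parentheses), output file.
write_test_conf() {
    # In the printf format, \t is a tab and \\1 yields a literal \1,
    # which refers to the first parenthesized match of the pattern.
    printf '%s\t(%s)\tEXEC\t-wait\techo \\1\n' "$1" "$2" >"$3"
}

# Hypothetical values; substitute the feedtype and pattern of a missed product.
write_test_conf 'WMO' '^FTUS80 KWBC' ./conf
```

The resulting file can then be passed to pqact(1) as the last argument, in
place of the "conf" file written by echo.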

Are there non-printing characters in the product-identifier of the
"missed" data products that cause them to not be matched?  You can check
the product-identifiers with

    pqcat -vl- -f <<feedtype>> -p <<pattern>> -q <<pq>> -i 0 | od -c
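
If you already have the product-identifiers saved one per line in a file,
a filter can pick out only the suspicious ones instead of reading through a
full od(1) dump.  A minimal sketch, assuming such a file exists (the function
name and file are hypothetical):

```shell
#!/bin/sh
# Print, in sed's unambiguous "l" format, every line of the given file that
# contains a byte outside printable ASCII.  The file is a hypothetical list
# of product-identifiers, one per line.
nonprinting_ids() {
    # Under LC_ALL=C, [:print:] covers only bytes 0x20 through 0x7E, so
    # control characters, tabs, and 8-bit bytes all count as non-printing.
    LC_ALL=C sed -n '/[^[:print:]]/l' "$1"
}
```

Each reported line shows non-printing bytes as backslash escapes and ends
with a "$" marking the true end of the line.  Some sed implementations fold
long lines in this format, so a very long identifier may be split across
output lines.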

Regards,
Steve Emmerson


NOTE: All email exchanges with Unidata User Support are recorded in the Unidata inquiry tracking system and then made publicly available through the web. If you do not want to have your interactions made available in this way, you must let us know in each email you send to us.