
Re: 20020213: [Fwd: 20020207: missing products]



Hi Harry,

The "situation" occurred at 21:22:25 today.  It's currently captured in
ldmd.log.2.  Here are some things I found in the log.

Here is a list of the last operations executed by the processes that
were in debug mode before the big disconnect:

connection               last op
----------               -------
sunny                   comingsoon
nmc2_9                  comingsoon
neptune                 comingsoon
nmc2_7                  blkdata (prev prod was completed)
nmc2_8                  blkdata (prev prod was completed)
camano                  End of Queue, followed by SIGALRM **

So, none of the processes were in the middle of a transfer.  I guess I
can't rule out a locking problem based on this event.  (I'm not sure if
'comingsoon' is issued before or after it gets a lock.)

** Two processes received a SIGALRM: camano and dragon.  I don't know why
these signals occurred.  Since rpc.ldmd does not have a handler for that
signal, it was just noted in the log.

In all of ldmd.log.1, an hour's worth of logging, there were no hereis
or comingsoon messages for camano.  Is camano getting the NEXRD2 data? 
I logged on to camano to see what was up, and found what appeared to be
the same log as on air.  And, 'ldmadmin ps' on camano told me I should
only do that on air.  What have you set up there?

Delay messages logged:

connection              number of Delay entries
----------              -----------------------
sunny                       8
nmc2_9                     13
neptune                     1
nmc2_7                      6
nmc2_8                      7
camano(feed)            48691
camano                      5
dragon                  50082
ldm1                    48680
mcclure                     3
nmc2_5                    430
nmc2_6                    428

However, these messages may simply mean that a connection was not getting
any data and was just idling, so the counts may not be representative of
anything.  On Larry's machine, I only put the feed from motherlode in
debug mode, and perhaps that connection was never idle.
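
For anyone wanting to repeat a tally like the one above, here is a minimal
sketch in Python.  The line layout it expects ("<month> <day> <time> <host>
<connection>[<pid>]: <message>") is an assumption made for illustration and
may not match the actual ldmd.log format; the names LINE_RE and
count_delay_entries are likewise hypothetical.

```python
import re
from collections import Counter

# ASSUMED log layout (not confirmed against a real ldmd.log):
#   "<month> <day> <hh:mm:ss> <host> <connection>[<pid>]: <message>"
LINE_RE = re.compile(
    r"^\S+\s+\d+\s+[\d:]+\s+\S+\s+(?P<conn>[^\[\s]+)\[\d+\]:\s+(?P<msg>.*)$"
)


def count_delay_entries(lines):
    """Return a Counter mapping connection name -> number of 'Delay' messages."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        # Lines that do not fit the assumed layout are skipped silently.
        if match and "Delay" in match.group("msg"):
            counts[match.group("conn")] += 1
    return counts


def format_table(counts):
    """Render the counts as a two-column plain-text table, largest first."""
    rows = ["connection              number of Delay entries",
            "----------              -----------------------"]
    for conn, n in counts.most_common():
        rows.append(f"{conn:<24}{n:>6}")
    return "\n".join(rows)
```

It could be run as, for example,
print(format_table(count_delay_entries(open("ldmd.log.2")))), adjusting
LINE_RE first if the real log lines are laid out differently.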

My other thought is that we could identify the locking regions in the
code, and modify your software to log a message each time one of those
regions is entered and exited.  That might shed more light on whether
the locking is a problem.
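
To make the idea of logging around the locking regions concrete, here is a
rough sketch.  It is written in Python purely for illustration and is not
the actual locking code; the helper traced_lock, the logger name
"locktrace", and the region labels are all hypothetical.  The point is only
the pattern: log before waiting for the lock, log how long the wait took,
and log again on release.

```python
import logging
import threading
import time
from contextlib import contextmanager

# Hypothetical logger name; in practice this would feed the existing log.
log = logging.getLogger("locktrace")


@contextmanager
def traced_lock(lock, region):
    """Acquire `lock`, logging the wait, the acquisition, and the release.

    `region` is a free-form label identifying the locking region in the log.
    Yields the number of seconds spent waiting for the lock.
    """
    log.debug("%s: waiting for lock", region)
    start = time.monotonic()
    lock.acquire()
    waited = time.monotonic() - start
    log.debug("%s: lock acquired after %.6f s", region, waited)
    try:
        yield waited
    finally:
        # Release even if the protected code raises, then record it.
        lock.release()
        log.debug("%s: lock released", region)
```

A region would then be wrapped as "with traced_lock(some_lock, "region_A"):".
A long gap between the "waiting" and "acquired" messages just before a
disconnect would point toward lock contention; no such gap would argue
against it.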

I am leaving for the day.  If you don't want to be logging stuff all
weekend, you could turn it off until Monday.

Enjoy your weekend!

Anne
-- 
***************************************************
Anne Wilson                     UCAR Unidata Program            
address@hidden                 P.O. Box 3000
                                  Boulder, CO  80307
----------------------------------------------------
Unidata WWW server       http://www.unidata.ucar.edu/
****************************************************