
Re: 20020213: [Fwd: 20020207: missing products]



Harry Edmon wrote:
> 
> The network interface messages are caused by my turning on promiscuous mode and
> running tcpdump.  The FIFO underflow occurs when the machine first boots up,
> but never after the first couple of hours.
> 
> --
> Dr. Harry Edmon                 E-MAIL: address@hidden
> 206-543-0547                            address@hidden
> Dept of Atmospheric Sciences    FAX:    206-543-0308
> University of Washington, Box 351640, Seattle, WA 98195-1640

Hi Harry,

A few thoughts:

The ldm processes on air and sunny are currently in verbose mode.  If
you can spare the disk space, you could put them in debug mode.  (You
could keep fewer logs to compensate.  Scanning the logs, I see about 12
of these events occurring within the past 24 hours, so I would think you
could capture a few of them within a relatively short period of time.)

Debug mode tracks every ldm packet.  This will show us if the relay is
interrupted on product boundaries or within a product.  If the
interruption always occurs on product boundaries, then I would still
consider the possibility of an ldm locking error.  If the interruptions
occur within product transmissions, then I would think it's not a
locking error, as I don't think the ldm would relinquish a lock before
it's done reading or writing.  That is an assumption on my part, though,
and I would try to confirm it if further information made that necessary.

My other question is: why is camano.atmos.washington.edu writing to
air's logs?  I'm not a system administrator, but I don't see how that
can happen.

I was also wondering about a possible deadlock situation between air and
camano, but I guess that wouldn't happen, since air is requesting EXP
from camano and camano is requesting NEXRD2 from air, which are
different feeds.  Is there any way one of air's downstream sites could
be waiting on products that air is itself waiting on from that same site?
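
To make that question concrete, here is a rough sketch of how one might
check a set of requests for a circular wait.  It is only an
illustration: the function name and the (downstream, upstream, feedtype)
tuples are made up for the example, and it assumes feedtypes can be
compared as exact names, which ignores overlapping feedtypes and
request patterns in a real ldmd.conf.

```python
from collections import defaultdict


def feedtypes_with_cycles(requests):
    """Return the feedtypes whose request graph contains a cycle.

    requests: iterable of (downstream, upstream, feedtype) tuples, meaning
    "downstream requests feedtype from upstream".  Hypothetical format.
    """
    # One directed graph per feedtype: downstream host -> set of upstreams.
    graph = defaultdict(lambda: defaultdict(set))
    for downstream, upstream, feedtype in requests:
        graph[feedtype][downstream].add(upstream)

    def has_cycle(edges):
        visiting, done = set(), set()

        def visit(host):
            if host in done:
                return False
            if host in visiting:
                return True  # came back to a host on the current path
            visiting.add(host)
            found = any(visit(nxt) for nxt in edges.get(host, ()))
            visiting.discard(host)
            done.add(host)
            return found

        return any(visit(host) for host in list(edges))

    return {ft for ft, edges in graph.items() if has_cycle(edges)}


# The two requests mentioned above are on different feeds, so no cycle.
current = [("air", "camano", "EXP"), ("camano", "air", "NEXRD2")]
print(feedtypes_with_cycles(current))
```

With only the two requests above, nothing is flagged.  A cycle would
show up only if, say, camano also requested EXP back from air.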

On air, I wrote a Python script to scan the logs for gaps in the
entries' timestamps.  (It's in ~ldm/anne.)  There's a file there called
timestampGaps that records the lines around each 30-second gap.  Nothing
jumps out at me, except that some gaps result in lots of disconnects
while others don't.
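
For anyone who wants to try the same kind of scan elsewhere, the idea
is roughly the following.  This is a minimal sketch, not the script in
~ldm/anne: it assumes each log line begins with a syslog-style
timestamp such as "Feb 13 17:02:11", that the year has to be supplied
because the log does not carry it, and that a "gap" means at least 30
seconds between consecutive timestamped lines.  The function names and
the sample lines are invented for the example.

```python
from datetime import datetime


def parse_timestamp(line, year=2002):
    """Parse a leading 'Mon DD HH:MM:SS' stamp; return None if absent."""
    try:
        # The year is prepended because syslog-style stamps omit it (assumed).
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def find_gaps(lines, threshold_seconds=30, context=2, year=2002):
    """Return (gap_in_seconds, surrounding_lines) for each large gap."""
    gaps = []
    prev_time = None
    prev_index = None
    for i, line in enumerate(lines):
        stamp = parse_timestamp(line, year)
        if stamp is None:
            continue  # skip lines without a recognizable timestamp
        if prev_time is not None:
            delta = (stamp - prev_time).total_seconds()
            if delta >= threshold_seconds:
                start = max(0, prev_index - context)
                gaps.append((delta, lines[start:i + context + 1]))
        prev_time, prev_index = stamp, i
    return gaps


# Made-up sample lines, only to show the call.
sample = [
    "Feb 13 17:02:11 air rpc.ldmd[100]: first entry",
    "Feb 13 17:02:20 air rpc.ldmd[100]: second entry",
    "Feb 13 17:03:05 air rpc.ldmd[100]: third entry",
    "Feb 13 17:03:06 air rpc.ldmd[100]: fourth entry",
]
for seconds, nearby in find_gaps(sample):
    print(f"gap of {seconds:.0f} s")
    for entry in nearby:
        print("   ", entry)
```

The sketch does not handle logs that cross a year boundary, since a
single year is applied to every line.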

Anne 
-- 
***************************************************
Anne Wilson                     UCAR Unidata Program            
address@hidden                 P.O. Box 3000
                                  Boulder, CO  80307
----------------------------------------------------
Unidata WWW server       http://www.unidata.ucar.edu/
****************************************************