
[LDM #ZRE-293921]: Dropped products



Harry,

> >Is Freshair the only system sending those data-products to the spare host?
> 
> Yes.
> 
> >Can you send me the initial log entries for the upstream LDM processes on 
> >Freshair that are responsible for sending the data-products to the spare 
> >host.  Is there only one such process?
> >
> >
> Only one process.  NOTE: freshair2 is currently freshair.

Then there can't be any problems due to multiple feeds to the spare host.  
That's good.

>  freshair1
> (aka otherhost to freshair2) is the spare.  Also - all times in the log
> files are local (PDT).

PDT?!  &$#@ Harry!     :-)

> Apr 19 15:28:25 freshair2 otherhost(feed)[24482] NOTE: Starting
> Up(6.4.5/6): 20060419222748.375 TS_ENDT {{ANY,  ".*"}}, Primary
> Apr 19 15:28:25 freshair2 otherhost(feed)[24482] NOTE: topo:  otherhost
> {{ANY, (.*)}}

That looks good.  The spare is requesting data-products created at or after 
20060419T222748.375, which is about 37 seconds before the request was logged 
(15:28:25 PDT is 22:28:25 UTC).

> >Can you send me an INFO log-entry on Freshair that shows reception of a 
> >data-product that wasn't relayed to the spare host but should have been?
> >
> Apr 20 19:24:38 freshair2 unidata2.ssec.wisc.edu[24442] INFO:      120
> 20060421021819.211 IDS|DDPLUS 41583079  SRBZ40 KWAL 210216
> Apr 20 19:24:38 freshair2 mkwc2.IfA.Hawaii.Edu(feed)[3932] INFO:
> sending:      120 20060421021819.211 IDS|DDPLUS 41583079  SRBZ40 KWAL
> 210216
> Apr 20 19:24:39 freshair2 blow.atmos.washington.edu(feed)[1236] INFO:
> sending:      120 20060421021819.211 IDS|DDPLUS 41583079  SRBZ40 KWAL
> 210216

The above data-product was received about six minutes after it was created.  
What were the log-entries immediately before and after it from the upstream 
LDM process sending to the spare host (process 24482)?
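
The six-minute figure comes from converting the local log time to UTC and 
subtracting the creation-time in the product's signature.  A minimal sketch 
of that arithmetic, assuming the log times are PDT (UTC-7) and the year is 
2006; the helper names are made up for illustration and aren't part of the 
LDM:

```python
from datetime import datetime, timedelta, timezone

# Assumption: the syslog timestamps are local PDT, i.e. UTC-7.
PDT = timezone(timedelta(hours=-7), "PDT")

def log_time(stamp, year=2006):
    """Parse a syslog-style local timestamp such as 'Apr 20 19:24:38'."""
    local = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return local.replace(tzinfo=PDT)

def creation_time(stamp):
    """Parse a creation-time such as '20060421021819.211' (UTC)."""
    parsed = datetime.strptime(stamp, "%Y%m%d%H%M%S.%f")
    return parsed.replace(tzinfo=timezone.utc)

def latency(log_stamp, created_stamp):
    """Time between product creation and the log entry that mentions it."""
    return log_time(log_stamp) - creation_time(created_stamp)

# Values copied from the quoted INFO entry for SRBZ40 KWAL 210216.
print(latency("Apr 20 19:24:38", "20060421021819.211"))  # 0:06:18.789000
```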

> >How often did the connection between Freshair and the spare host break 
> >during that time period?
> >
> I think 3 times for timeouts (note: for freshair1, freshair2 appears as
> otherhost):
> 
> Apr 20 20:01:03 freshair1 otherhost[3333] INFO: Connection from upstream
> LDM silent for 60 seconds
> Apr 20 20:01:03 freshair1 otherhost[3333] INFO: Resolving otherhost to
> an IP address took 0.00016 seconds
> Apr 20 20:01:22 freshair1 otherhost[3333] INFO: Upstream LDM is alive.
> Waiting..
> Apr 21 02:00:05 freshair1 otherhost[3333] INFO: Connection from upstream
> LDM silent for 60 seconds
> Apr 21 02:00:05 freshair1 otherhost[3333] INFO: Resolving otherhost to
> an IP address took 0.000122 seconds
> Apr 21 02:00:06 freshair1 otherhost[3333] INFO: Upstream LDM is alive.
> Waiting..
> Apr 21 02:57:33 freshair1 otherhost[3333] INFO: Connection from upstream
> LDM silent for 60 seconds
> Apr 21 02:57:33 freshair1 otherhost[3333] INFO: Resolving otherhost to
> an IP address took 0.000136 seconds
> Apr 21 02:57:36 freshair1 otherhost[3333] INFO: Upstream LDM is alive.
> Waiting..

Actually, the above entries don't indicate disconnections; they indicate that 
nothing was received for 60 seconds (three times).  This means that the 
data-products weren't lost due to reconnections.
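
If you want to check how quickly the upstream LDM answered after each silent 
interval, something like the following would do.  It's only a sketch: the 
function name and the regular expression are my own, the year is assumed to 
be 2006, and the message text is matched exactly as it appears in the entries 
you quoted (with the wrapped lines joined).

```python
import re
from datetime import datetime

# Matches "<Mon DD HH:MM:SS> <host> <process> INFO: <message>".
ENTRY = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ \S+ INFO: (.*)$")

def recovery_gaps(lines, year=2006):
    """Seconds from each 'silent' notice to the next 'alive' notice."""
    gaps, silent_at = [], None
    for line in lines:
        match = ENTRY.match(line)
        if not match:
            continue
        when = datetime.strptime(f"{year} {match.group(1)}",
                                 "%Y %b %d %H:%M:%S")
        message = match.group(2)
        if "silent for" in message:
            silent_at = when
        elif "Upstream LDM is alive" in message and silent_at is not None:
            gaps.append((when - silent_at).total_seconds())
            silent_at = None
    return gaps

# Entries copied from the quoted freshair1 log, wrapped lines joined.
PREFIX = "freshair1 otherhost[3333] INFO: "
SILENT = "Connection from upstream LDM silent for 60 seconds"
ALIVE = "Upstream LDM is alive. Waiting.."
log = [
    f"Apr 20 20:01:03 {PREFIX}{SILENT}",
    f"Apr 20 20:01:22 {PREFIX}{ALIVE}",
    f"Apr 21 02:00:05 {PREFIX}{SILENT}",
    f"Apr 21 02:00:06 {PREFIX}{ALIVE}",
    f"Apr 21 02:57:33 {PREFIX}{SILENT}",
    f"Apr 21 02:57:36 {PREFIX}{ALIVE}",
]
print(recovery_gaps(log))  # [19.0, 1.0, 3.0]
```

In all three cases the upstream answered within seconds, which fits with the 
connection having stayed up.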

> >Are the creation-times of the missing data-products before the start-times 
> >in the request-for-data from the spare host?
> 
> Not sure how to answer this one.

Get me those log-entries I mentioned and I'll answer it.

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: ZRE-293921
Department: Support LDM
Priority: Normal
Status: On Hold