[LDM #ADK-338194]: Data product inserted Upstream doesn't arrive Downstream



Leonard,

> > Were the files inserted relatively simultaneously?
> 
> No, the ingester times the inserts, with 3 files going in every 3 minutes.
> 
> > If so, what was the total number of bytes? What's the size of the 
> > product-queue?
> 
> Both up and down have 2GB queue files.
> 
> > How many slots does it have?
> 
> Up:
> Oct 31 22:43:23 pqmon NOTE: nprods nfree  nempty      nbytes  maxprods  maxfree  minempty    maxext    age
> Oct 31 22:43:23 pqmon NOTE:    856     1  487424  1908828872       856        3    487424  91174200  16673
> 
> Down:
> Oct 31 22:43:43 pqmon NOTE: nprods nfree  nempty      nbytes  maxprods  maxfree  minempty    maxext    age
> Oct 31 22:43:43 pqmon NOTE:    868     6  487407  1934618240       868        7    487407  65331808  16530
> 
> > Is there anything untoward in the LDM log file of either the upstream or 
> > downstream site?
> 
> Up:
> They are inserted in order of the usual filename sorting.  Here's what
> the log says around when an insert was done:
> 
> Oct 31 17:00:31 DEBUG: Mapping 2055557120
> Oct 31 17:00:33 DEBUG: Deleting oldest to make space 126874592 bytes
> pqinsert INFO: 8b08e08b6656607252fc4a466a4c8739 126874492
> 20121031170031.913     EXP 000  comp/20121030.134809.FTS.gz
> Oct 31 17:03:33 DEBUG: Mapping 2055557120
> pqinsert INFO: 615eeab67055f7f1f60dd6b198727ccf 126932335
> 20121031170333.963     EXP 000  comp/20121030.134856.FTS.gz
> Oct 31 17:03:35 DEBUG: Mapping 2055557120
> Oct 31 17:03:36 DEBUG: Deleting oldest to make space 126605168 bytes
> pqinsert INFO: f6a004fd47fad266f9c82e7570edd373 126605070
> 20121031170335.357     EXP 000  comp/20121030.134942.FTS.gz
> Oct 31 17:03:36 DEBUG: Mapping 2055557120
> Oct 31 17:03:38 DEBUG: Deleting oldest to make space 126268224 bytes
> pqinsert INFO: d9064bd81f5596b9281534d18b32e09d 126268126
> 20121031170336.633     EXP 000  comp/20121030.135029.FTS.gz
> 
> The individual files can be up to 250MB each, and it should take about 1
> minute to transfer, so I made the queues 2GB.

Offhand, I'd say that the product-queues need to be larger. Such large files 
don't give the product-queue space-recovery algorithm much to work with. When 
the product-queue needs more room for a new product, it deletes products, 
beginning with the oldest, until sufficient space is available. Your upstream 
log shows "Deleting oldest to make space" on three of the four inserts above, 
so the queue is already running at capacity.

> Down:
> Looking at around the same time in the log:
> 
> Oct 30 16:59:43 mlsodata pqact[23067] NOTE: Filed in
> "/home/ldm/data/comp/20121004.073858.FTS.gz": 126955547
> 20121030225712.286     EXP 000  comp/20121004.073858.FTS.gz
> Oct 30 16:59:58 mlsodata pqact[23067] ERROR: child 23475 exited with
> status 2 (EXEC /bin/gunzip comp/20121004.073858.FTS.gz)
> Oct 30 17:00:39 mlsodata pqact[23067] NOTE: Filed in
> "/home/ldm/data/comp/20121004.073944.FTS.gz": 126952281
> 20121030225713.969     EXP 000  comp/20121004.073944.FTS.gz
> Oct 30 17:00:54 mlsodata pqact[23067] ERROR: child 23484 exited with
> status 2 (EXEC /bin/gunzip comp/20121004.073944.FTS.gz)
> Oct 30 17:01:27 mlsodata pqact[23067] NOTE: Filed in
> "/home/ldm/data/comp/20121004.074037.FTS.gz": 126466253
> 20121030230015.905     EXP 000  comp/20121004.074037.FTS.gz
> Oct 30 17:01:42 mlsodata pqact[23067] ERROR: child 23507 exited with
> status 2 (EXEC /bin/gunzip comp/20121004.074037.FTS.gz)
> 
> These messages are because the destination files already existed
> gunzipped, so gunzip wouldn't overwrite them.  They were there from a
> transfer that took place before I recreated the queue files.
> 
> And at around the time that the files should have arrived, in sequence
> of the filename sorting order:
> 
> Oct 31 11:14:15 mlsodata pqact[23067] NOTE: Filed in
> "/home/ldm/data/comp/20121030.134809.FTS.gz": 126874492
> 20121031170031.913     EXP 000  comp/20121030.134809.FTS.gz
> Oct 31 11:16:21 mlsodata pqact[23067] NOTE: Filed in
> "/home/ldm/data/comp/20121030.135029.FTS.gz": 126268126
> 20121031170336.633     EXP 000  comp/20121030.135029.FTS.gz
> 
> I would expect 20121030.134856.FTS to arrive between those two (not by
> design, but by operating in a simple situation I have).
> 
> BTW, I'm searching in all the log files I have in the logging directory,
> so they aren't missing because the log was rolled to another file.
> 
> --
> ==Leonard E. Sitongia
> High Altitude Observatory
> National Center for Atmospheric Research
> P.O. Box 3000 Boulder CO 80307  USA
> address@hidden  voice: (303)497-2454  fax: (303)497-1589

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: ADK-338194
Department: Support LDM
Priority: Normal
Status: Closed