
[LDM #BMK-187963]: LDM Redundancy and Duplicated Files



Hi,

re:
> Thank you so much for your inputs.  I apologize for the delayed response.

No worries.

re:
> It is because we've been performing different test configurations and
> trying to understand where the duplicated files are coming from.  It seems
> that the LDMs are rejecting the duplicated files as expected (i.e. same md5
> checksums).  The reason for us seeing the duplicated files is because we can
> get up to 4 files of the same md5 checksums (2 pairs of the same names).
> This data is coming from an external site so we don't have any control over
> them.

If the MD5 signature for a product is the same as an MD5 signature for a product
already/still in the local LDM queue, the newly received one will be rejected
regardless of the number of duplicates received.  If the original product ages
out of the queue before a "duplicate" is received, the LDM will insert the
newly received product into the local LDM queue because it will have no idea
that it is a duplicate.
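
The behavior described above can be illustrated with a small toy model.
This is only a sketch, not the LDM's actual implementation: the ToyQueue
class and its residency_seconds parameter are hypothetical names, and the
model ages products out by time alone and ignores queue size.

```python
import hashlib
from collections import OrderedDict


class ToyQueue:
    """Toy stand-in for a product queue (hypothetical; not LDM code)."""

    def __init__(self, residency_seconds):
        self.residency_seconds = residency_seconds
        self._arrival = OrderedDict()  # MD5 hex digest -> arrival time

    def _expire(self, now):
        # Drop the oldest entries once they exceed the residency time.
        while self._arrival:
            signature, arrived = next(iter(self._arrival.items()))
            if now - arrived < self.residency_seconds:
                break
            self._arrival.popitem(last=False)

    def insert(self, data, now):
        """Return True if inserted, False if rejected as a duplicate."""
        self._expire(now)
        signature = hashlib.md5(data).hexdigest()
        if signature in self._arrival:
            return False  # same signature is still in the queue
        self._arrival[signature] = now
        return True


queue = ToyQueue(residency_seconds=3600)
results = [
    queue.insert(b"product A", now=0),     # first copy: inserted
    queue.insert(b"product A", now=10),    # original still queued: rejected
    queue.insert(b"product A", now=20),    # rejected again
    queue.insert(b"product A", now=4000),  # original aged out: inserted
]
print(results)  # [True, False, False, True]
```

In this model, any number of copies is rejected while the original is
still resident, and the first copy to arrive after the original has aged
out is accepted as if it were new.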

re:
> It also has to do with how the LDMs are set up.  For the purpose of
> redundancy, the LDMs are crisscrossed.  However, that means each downstream
> LDM sees twice the amount of the same files.

We have our systems set up this way (crisscrossed), and all truly duplicate
products (meaning that the MD5 signature is the same) are rejected.  We have
run in this setup for years with no problems.

re:
> We are still testing to see how we can reconfigure the LDMs to reduce the
> number of duplications.

The keys to eliminating duplicates are:

- make sure that the MD5 signatures are calculated the same way on all upstream
  machines

- make sure that the product residency time in the local LDM queue is long
  enough so that the original is still in the local queue when the duplicate
  is received
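
The first point can be illustrated with a short sketch.  It assumes, purely
for illustration, that one upstream machine computes the signature from the
product's data while another computes it from the product's identifier; the
function names and the product identifier below are hypothetical and are
not LDM APIs.

```python
import hashlib


def signature_from_data(product_id, data):
    # Signature derived from the product's bytes.
    return hashlib.md5(data).hexdigest()


def signature_from_id(product_id, data):
    # Signature derived from the product's identifier instead of its bytes.
    return hashlib.md5(product_id.encode("utf-8")).hexdigest()


product_id = "example/file_0001.dat"  # hypothetical identifier
data = b"identical payload"

host_a = signature_from_data(product_id, data)
host_b = signature_from_data(product_id, data)
host_c = signature_from_id(product_id, data)

same_method_matches = host_a == host_b   # both hosts hash the data
mixed_method_matches = host_a == host_c  # one hashes data, one hashes the ID
print(same_method_matches, mixed_method_matches)  # True False
```

When the methods differ, the same file arrives downstream with two
different signatures, so the duplicate check has nothing to match against
and both copies would be accepted.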

re: 
> I do have a question:  Is there a ldmadmin option (or a LDM utility) that
> can be used to see what files are rejected from the LDM queue due to
> duplication (same md5 checksums)?

I am not 100% sure.  I will ask Steve (the LDM developer) about this when I
talk to him this afternoon.  It is possible that if you put LDM logging into
debug mode, the duplicate product rejection will be logged.  Be aware,
however, that there are a LOT of log messages generated in debug log mode!

re:
> Thank you in advance!

No worries.

Cheers,

Tom
--
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: BMK-187963
Department: Support LDM
Priority: Normal
Status: Closed
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata 
inquiry tracking system and then made publicly available through the web.  If 
you do not want to have your interactions made available in this way, you must 
let us know in each email you send to us.