20020401: LDM performance vs scp?


The LDM has several features that add some overhead when a
product is inserted into the queue for delivery to a
downstream site. If you are
strictly looking at transferring a file, then some of
the LDM's overhead may not be of interest to you.

If you are using pqinsert to insert your products into the
data queue, then the program computes an MD5 checksum
for each product, which is used to uniquely identify
the product. The checksum is computed when the product is
inserted into the queue on the upstream machine,
so the time spent is CPU time, not network latency.
As products become larger, the time needed to compute
the checksum increases. The benefit of the MD5 checksum
is duplicate detection where you have interconnected LDMs
or data that can be inserted from more than one location
(such as we have with multiple sites ingesting the NOAAPORT
data stream and inserting into the IDD from around the
country). The MD5 checksum can also be computed
on a stream, rather than waiting to compute the value
once a large file has been composed. (pqinsert, as mentioned above,
inserts an entire file, so it computes
the checksum for the entire file at once and does not benefit from
a streaming update of the value.)
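
To illustrate the two points above (one-shot vs. streamed MD5,
and using the checksum for duplicate detection), here is a minimal
Python sketch. It is not LDM code: the function names, the 4096-byte
chunk size, and the in-memory "seen" set are assumptions made only
for illustration.

```python
import hashlib

def md5_whole(data: bytes) -> str:
    """Checksum computed in one pass over a complete product."""
    return hashlib.md5(data).hexdigest()

def md5_streamed(chunks) -> str:
    """Checksum updated incrementally as each chunk arrives."""
    h = hashlib.md5()
    for chunk in chunks:
        h.update(chunk)
    return h.hexdigest()

def insert_if_new(seen: set, data: bytes) -> bool:
    """Hypothetical duplicate check: reject a product whose MD5 was already seen."""
    signature = md5_whole(data)
    if signature in seen:
        return False
    seen.add(signature)
    return True

# Illustrative product, split into 4096-byte chunks (chunk size is arbitrary).
product = b"example product bytes " * 1000
chunks = [product[i:i + 4096] for i in range(0, len(product), 4096)]

# Both approaches give the same signature for the same bytes.
same_signature = md5_whole(product) == md5_streamed(chunks)

# The same product arriving from a second source is detected as a duplicate.
seen = set()
first_insert = insert_if_new(seen, product)
second_insert = insert_if_new(seen, product)
```

Either way the total CPU work is proportional to the product size;
streaming only spreads that work out as the data arrives.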

The process of inserting the data into the LDM product queue is
an additional bit of overhead. As Russ Rew mentioned, the LDM
product queue searches are well suited to inserting,
delivering, and deleting many products. If you are not sending
many files, some of the product queue overhead may be unnecessary.
As with the MD5 checksum above, this overhead is not network related.

Another feature of the LDM is the ability to provide access to the data
stream. Again, if you are only sending completed files, then this feature
may not be one that you need.

Lastly, the LDM product queue is memory mapped, allowing the distribution
of your product to multiple downstream sites as they are ready to receive
the product. If you are only transferring the file to a single host, then
the additional disk hits are probably not of interest to you, but if you
are relaying the file to multiple hosts, then you should see the
benefit of the product queue.
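
As a rough illustration of the idea (and not the LDM's actual queue
format), the Python sketch below writes one product to a file, maps it
into memory once, and serves the same mapped bytes to several readers.
The host names and the temporary file are hypothetical.

```python
import mmap
import tempfile

product = b"example product payload"

with tempfile.TemporaryFile() as f:
    # Write the product once, as a stand-in for inserting it into a queue file.
    f.write(product)
    f.flush()
    # Map the file read-only; every reader below shares this one mapping.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as queue:
        # Each hypothetical downstream host reads the same mapped bytes
        # without the product being written again.
        deliveries = {host: bytes(queue[:])
                      for host in ("host-a", "host-b", "host-c")}
```

With a single receiver the mapping is extra work; the benefit appears
only when the same bytes are handed to more than one reader.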

Steve Chiswell
Unidata User Support

> ------- Forwarded Message
> Date:    Fri, 29 Mar 2002 14:49:24 -0700
> From:    Joe Van Andel <vanandel@xxxxxxxxxxxx>
> To:      ldm-users@xxxxxxxxxxxxxxxx
> cc:      anne@xxxxxxxxxxxxxxxx, Russ Rew <russ@xxxxxxxxxxxxxxxx>
> Subject: LDM performance vs scp?
> This morning, I was benchmarking file transfers using LDM vs copying the
> same file with 'scp'.  On the two files I checked, (1237KB and
> 2446KB), LDM took 3 times as long to send the file compared to scp.
> Since I'm already short of bandwidth on my T-1 line, I'm quite concerned
> that I can not afford to use LDM.
> Has anyone else benchmarked LDM to determine how fast it copies files,
> vs alternatives (ftp, http, scp)?
> Any advice on how to improve LDM performance?
> --
> Joe VanAndel
> National Center for Atmospheric Research
> http://www.atd.ucar.edu/~vanandel/
> Internet: vanandel@xxxxxxxx