Re: LDM Archive Machine

Daryl et al. --

I've had similar thoughts.  When I brought up our new ldm server (P3 866,
320MB, SCSI, RH 7.1, LDM 5.1.3) I tried to make a large queue (I don't
remember how big) but found that it would not allow queues larger than 2 GB.
I don't know what I was doing wrong, since you've been able to get one to be
4 GB.  The other problem I had was that when I made a queue close to 2 GB,
the system ran for a day or so and then went into an I/O thrashing
condition which essentially jammed the system.  I haven't figured that one
out yet; perhaps I will try installing the latest kernel patch sometime
and see if that makes a difference.  I currently have a 1 GB queue
(although 1.5 GB or so works okay, I think) which holds about 3 hours of data.
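
As a rough back-of-the-envelope check, a 1 GB queue holding about 3 hours
implies an ingest rate of roughly a third of a GB per hour for our feed mix.
The Python sketch below just scales that linearly.  The function names are
mine, and it assumes a constant data rate, which real feeds won't have.
Daryl's 4 GB queue holds 39 hours on a different set of feeds, so the rate
clearly varies a lot from site to site.

```python
def ingest_rate_gb_per_hour(queue_gb, hours_of_data):
    """Average ingest rate implied by how long a queue of a given size lasts."""
    if hours_of_data <= 0:
        raise ValueError("hours_of_data must be positive")
    return queue_gb / hours_of_data


def estimated_hours(queue_gb, rate_gb_per_hour):
    """Hours of data a queue would hold, assuming a constant ingest rate."""
    if rate_gb_per_hour <= 0:
        raise ValueError("rate_gb_per_hour must be positive")
    return queue_gb / rate_gb_per_hour


if __name__ == "__main__":
    # Figures from this message: a 1 GB queue holds about 3 hours of data.
    rate = ingest_rate_gb_per_hour(1.0, 3.0)
    for size_gb in (1.0, 1.5, 2.0):
        hours = estimated_hours(size_gb, rate)
        print(f"{size_gb:.1f} GB queue -> about {hours:.1f} hours")
```

By this estimate, even a queue right at the 2 GB limit would only get us to
about 6 hours.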

Since disk/storage is so cheap, having a large queue seems to make sense
these days.  However, I would agree with Unidata's concerns that if there
were a larger-scale outage such that many sites started trying to catch up
on many hours' worth of data, this could present a networking load issue.
Hopefully this kind of situation wouldn't occur very often, though.

                                      Art.

On Tue, 18 Sep 2001, Daryl Herzmann wrote:

> Hello LDMers,
>
>          For some time, I have been frustrated by occasional crashes of
> our LDM ingestors during the middle of the night.  The issues have ranged
> from full-partitions, to kernel-faults, etc...  These problems happen to
> most everyone.  What is really frustrating is that when I bring LDM back
> up and start the data flowing, I will have missed data.  By default, LDM
> will request 1 hour's worth of data at startup.  By looking at the man
> pages, I noticed that the "-o" and "-m" flags can be passed to rpc.ldmd to
> request a longer period of data ("-o") and to then process this latent data
> ("-m").  For example, "rpc.ldmd -o 7200 -m 7201" would request 2 hours' worth
> of data and process it.  I thought this was great, but when I requested
> very old data from an upstream site, I would receive "RECLASS" messages
> meaning that the upstream queue does not go back that far.
>
>          I then started to think about having an LDM machine with a very
> large queue that could hold 12+ hours of data.  I could then fail over my
> recently crashed machine to this archive, catch up my queue and then fail
> back over to my upstream site.  I talked with UNIDATA about this idea at
> the LDM conference, but they were hesitant to support a machine like this,
> because LDM has been designed to deliver near-realtime (1 hour old) data.
> They were not sure of the ramifications of having machines on the IDD
> request very latent data and the congestion/confusion that it may cause.
>
>          I understood the concerns, but was interested in building a large
> queue as well.  I built a 4 GB queue on a Linux box here and filled it
> locally from our IDD feed.  The machine has been performing very well and
> the queue holds a lengthy archive of data.  With the help of UNIDATA, we
> have been stress testing the large queue and I have been very impressed
> with the performance.  The machine is an Athlon 1.2 GHz with 1 GB of memory,
> the hard drive is an EIDE device.  All in all, I have been very happy with
> the archive.  If the machine becomes heavily used by the community, I
> would probably install a fast SCSI drive to help out. Currently, the
> archive is holding 39 hours of data.
>
>       The LDM archive is not considered supported by Unidata.  In
> particular, if your machine is feeding downstream sites you should use
> this very carefully or not at all.  It is possible that taking the time to
> acquire and process large amounts of old data could impose enough of
> a load that your machine will not be able to deliver other products in a
> timely manner.  This could interfere with the primary goal of the IDD,
> near real time delivery of data.
>
>          For the specifics.  The machine's name is
> "ldmarchive.agron.iastate.edu" and it is currently requesting the feed
> type "UNIDATA" from motherlode.ucar.edu.  I am allowing all ".edu"
> addresses to feed from this machine, and I will keep a close eye on the logs for
> abuse.  If abuse does happen, I will be forced to be much more
> restrictive...
>
>       What does the community think about this archive?  Is it needed?
> Would people use it?  Does NEXRAD data need to be included in the archive?
> Currently, I do not request NEXRAD data, just because of the sheer volume
> of the data.  If the community would like NEXRAD data available in an
> archive like this, then I will see what can be done.  Maybe using another
> archive machine for just NEXRAD data would suffice.  I don't know.
>
>       The actual process of using this archive involves specifying the
> "-o" and "-m" options to rpc.ldmd.  This can be set by modifying the
> ldmadmin script.  It is important to undo these changes when you are done
> feeding from the archive, so that when you go back to feeding from your
> upstream site, you do not request old data.  You will also need to make
> sure that your decoders will process the old data.
>
>       If anything, this machine makes a convenient emergency failover
> for most people on the IDD.  If you intend to feed real-time data from the
> archive for a short time period, please let me know...
>
>       If you have concerns, please feel free to contact me.  I would
> rather keep some of the conversations private and then post summaries to
> the list...
>
> Later,
>       Daryl
>
>  --
>  /**
>   * Daryl Herzmann (akrherz@xxxxxxxxxxx)
>   * Program Assistant -- Iowa Environmental Mesonet
>   * http://mesonet.agron.iastate.edu
>   */
>
>

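For anyone scripting the catch-up Daryl describes, here is a small Python
sketch that turns a number of hours into the "-o" and "-m" arguments.  It
only follows the pattern in his example ("-o 7200 -m 7201" for 2 hours).
Setting "-m" one second above "-o" is copied from that example, not taken
from the man page, and the function name is made up for illustration.

```python
def catchup_args(hours):
    """Build an rpc.ldmd argument list for requesting `hours` of old data.

    Assumption: "-m" is set one second above "-o", mirroring the example
    "rpc.ldmd -o 7200 -m 7201" quoted above.
    """
    if hours <= 0:
        raise ValueError("hours must be positive")
    offset_seconds = int(round(hours * 3600))
    return ["rpc.ldmd", "-o", str(offset_seconds), "-m", str(offset_seconds + 1)]


if __name__ == "__main__":
    # Print the command line for a 2-hour catch-up.
    print(" ".join(catchup_args(2)))
```

As Daryl notes, whatever you change should be undone once you go back to
your regular upstream feed.
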
Arthur A. Person
Research Assistant, System Administrator
Penn State Department of Meteorology
email:  person@xxxxxxxxxxxxxxxxxx, phone:  814-863-1563