Daryl et al. --
I've had similar thoughts. When I brought up our new LDM server (P3 866,
320 MB, SCSI, RH 7.1, LDM 5.1.3) I tried to make a large queue (I don't
remember how big) but found that it would not allow queues larger than
2 GB. I don't know what I was doing wrong, since you've been able to get
one up to 4 GB. The other problem I had was that when I made a queue
close to 2 GB, the system ran for a day or so and then went into an I/O
thrashing condition which essentially jammed the system. I haven't
figured that one out yet; perhaps I will try installing the latest
kernel patch sometime and see if that makes a difference. I currently
have a 1 GB queue (although 1.5 GB or so works okay, I think) which
holds about 3 hours of data.
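
I suspect, though I haven't verified it, that the 2 GB ceiling is the
32-bit large-file/mmap limit of that era's kernel and glibc rather than
anything in LDM itself, since LDM memory-maps the whole queue. For
anyone building queues by hand, a minimal sketch using pqcreate (flag
details may differ between LDM versions; see pqcreate(1) first):

    # create a roughly 1 GB product queue at the default location
    pqcreate -c -s 1073741824 -q /usr/local/ldm/data/ldm.pq

The "ldmadmin mkqueue" command wraps the same operation, using the
queue size configured in the ldmadmin script.
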
Since disk/storage is so cheap, having a large queue seems to make sense
these days. However, I would agree with Unidata's concerns that if there
were a larger-scale outage such that many sites started trying to catch
up on many hours' worth of data, this could present a network load
issue. Hopefully this kind of situation wouldn't occur very often,
though.
Art.
On Tue, 18 Sep 2001, Daryl Herzmann wrote:
> Hello LDMers,
>
> 	For some time, I have been frustrated by occasional crashes of
> our LDM ingestors in the middle of the night. The issues have ranged
> from full partitions to kernel faults, etc... These problems happen
> to most everyone. What is really frustrating is that when I bring LDM
> back up and start the data flowing, I will have missed data. By
> default, LDM will request 1 hour's worth of data at startup. Looking
> at the man pages, I noticed that the "-o" and "-m" flags can be
> passed to rpc.ldmd to request a longer period of data ("-o") and then
> to process this latent data ("-m"). For example, "rpc.ldmd -o 7200
> -m 7201" would request 2 hours' worth of data and process it. I
> thought this was great, but when I requested very old data from an
> upstream site, I would receive "RECLASS" messages, meaning that the
> upstream queue does not go back that far.
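>
> To put numbers on it: the offsets are just seconds, so N hours is
> N*3600. A sketch of the invocation (treat the exact option handling
> and paths as assumptions; check the rpc.ldmd man page and your LDM
> version's default layout):
>
>     # request the last 12 hours and accept products up to that old
>     rpc.ldmd -o 43200 -m 43201 -q /usr/local/ldm/data/ldm.pq \
>         /usr/local/ldm/etc/ldmd.conf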
>
> 	I then started to think about having an LDM machine with a
> very large queue that could hold 12+ hours of data. I could then fail
> my recently crashed machine over to this archive, catch up my queue,
> and then fail back over to my upstream site. I talked with UNIDATA
> about this idea at the LDM conference, but they were hesitant to
> support a machine like this, because LDM has been designed to deliver
> near-real-time (up to 1 hour old) data. They were not sure of the
> ramifications of having machines on the IDD request very latent data
> and the congestion/confusion that it may cause.
>
> 	I understood the concerns, but was interested in building a
> large queue as well. I built a 4 GB queue on a Linux box here and
> filled it locally from our IDD feed. The machine has been performing
> very well, and the queue holds a lengthy archive of data. With the
> help of UNIDATA, we have been stress testing the large queue, and I
> have been very impressed with the performance. The machine is an
> Athlon 1.2 GHz with 1 GB of memory, and the hard drive is an EIDE
> device. All in all, I have been very happy with the archive. If the
> machine becomes heavily used by the community, I would probably
> install a fast SCSI drive to help out. Currently, the archive is
> holding 39 hours of data.
>
> The LDM archive is not considered supported by Unidata. In
> particular, if your machine is feeding downstream sites you should use
> this very carefully or not at all. It is possible that taking the
> time to acquire and process large amounts of old data could impose
> enough of a load that your machine will not be able to deliver other
> products in a timely manner. This could interfere with the primary
> goal of the IDD: near-real-time delivery of data.
>
> 	For the specifics: the machine's name is
> "ldmarchive.agron.iastate.edu", and it is currently requesting the
> feed type "UNIDATA" from motherlode.ucar.edu. I am allowing all
> ".edu" addresses to feed from this machine, and I will keep a close
> eye on the logs for abuse. If abuse does happen, I will be forced to
> be much more restrictive...
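>
> For reference, the relevant ldmd.conf entries look something like
> this (the allow pattern below is a sketch; check ldmd.conf(5) for
> the exact extended-regular-expression syntax your version expects):
>
>     # on ldmarchive: let any .edu host request the UNIDATA feed
>     allow   UNIDATA   \.edu$
>
>     # on a downstream machine: point a request at the archive
>     request UNIDATA   ".*"   ldmarchive.agron.iastate.edu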
>
> What does the community think about this archive? Is it needed?
> Would people use it? Does NEXRAD data need to be included in the archive?
> Currently, I do not request NEXRAD data, just because of the sheer volume
> of the data. If the community would like NEXRAD data available in an
> archive like this, then I will see what can be done. Maybe using
> another archive machine for just NEXRAD data would suffice. I don't
> know.
>
> 	The actual process of using this archive involves specifying
> the "-o" and "-m" options to rpc.ldmd. This can be done by modifying
> the ldmadmin script. It is important to undo these changes when you
> are done feeding from the archive, so that when you go back to
> feeding from your upstream site, you do not request old data. You
> will also need to make sure that your decoders will process the old
> data.
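>
> Roughly, the failover steps are as follows (paths and the exact
> place to edit in the ldmadmin Perl script vary by LDM version, so
> treat this as a sketch):
>
>     # 1. stop the LDM
>     ldmadmin stop
>     # 2. in etc/ldmd.conf, point your request line at the archive:
>     #      request UNIDATA ".*" ldmarchive.agron.iastate.edu
>     # 3. in the ldmadmin script, add "-o" and "-m" to the rpc.ldmd
>     #    invocation, e.g. "-o 43200 -m 43201" for 12 hours
>     # 4. start the LDM and let it catch up
>     ldmadmin start
>     # 5. once caught up, undo steps 2 and 3 and restart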
>
> If anything, this machine makes a convenient emergency failover
> for most people on the IDD. If you intend to feed real-time data from the
> archive for a short time period, please let me know...
>
> If you have concerns, please feel free to contact me. I would
> rather keep some of the conversations private and then post summaries to
> the list...
>
> Later,
> Daryl
>
> --
> /**
> * Daryl Herzmann (akrherz@xxxxxxxxxxx)
> * Program Assistant -- Iowa Environmental Mesonet
> * http://mesonet.agron.iastate.edu
> */
>
>
Arthur A. Person
Research Assistant, System Administrator
Penn State Department of Meteorology
email: person@xxxxxxxxxxxxxxxxxx, phone: 814-863-1563