On Wed, 15 Nov 2000, Tim Doggett wrote:
> We are seeing the same thing here at Texas Tech and I have been fighting
> the problem for the last six months. While I have not been able to rectify
> the situation, I think I have isolated the problem.
>
> For about 20 hours out of the day, we see 60+ minute latencies for
> IDS|DDPLUS and HDS, and acceptable latencies in NLDN, PCWS, FSL2, and
> MCIDAS. Between 4 AM and 8 AM local time all latencies are great (~ 5
> sec). Through dealing with the campus network folks, we have figured out
> that this is the time when our dorm internet usage finally falls off. At
> all other times, the campus line is saturated with student use.
One of my downstream sites exhibits this same pattern, losing data at all
hours of the day except the early morning. Before our campus internet
connection was upgraded from a dual T1 to a fractional T3, we were losing
a lot of data as well, although the situation was not quite as bad: we
lost data mainly at the times the models were coming in. After the
upgrade, disk utilization on our ldm data partition went from around
42% to 67%, which meant we had been losing at least 1/3 of the data we
were requesting.
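
For anyone who wants to run the same check on their own numbers, here is
a rough sketch of the arithmetic behind that "at least 1/3" figure. It
assumes the partition holds only ldm data, that the request set and the
scouring (retention) settings were the same before and after the upgrade,
and that disk utilization is therefore proportional to the data actually
received. The function name is just something I made up for illustration.

```python
def min_fraction_lost(util_before, util_after):
    """Lower bound on the fraction of requested data that was being lost.

    Assumes (hypothetically) that disk utilization is proportional to the
    volume of data received, with the same requests and retention in both
    periods. It is only a lower bound because some data may still be lost
    after the upgrade.
    """
    if not 0 < util_before <= util_after:
        raise ValueError("expected 0 < util_before <= util_after")
    return 1.0 - util_before / util_after


# Utilization went from about 42% to about 67% of the partition.
print(f"{min_fraction_lost(42, 67):.1%}")  # prints 37.3%
```

So under those assumptions the loss was about 37%, a bit more than 1/3.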
> To troubleshoot this I have spent a lot of time trying to identify the
> quality of upstream connections, upgraded our ldm ingestor to a SPARC-based
> machine instead of an X86, and split the data feeds as is often suggested.
> This worked a little bit, but still leaves us missing data for most of the
> day. I have also tried turning the HDS requests off completely... and this
> had no effect. Nor did splitting the feed so that HDS was separated from
> the other feeds.
Yes, all the various tricks I mentioned in my reply to Jim's query won't
help very much if you are suffering from network congestion at your campus
connection to the internet. I know this because I tried them all before
the upgrade. I sort of hinted that inadequate bandwidth might be the
problem in Jim's case, but without specific knowledge of his local
situation, my thinking was to mention these tricks anyway: if Jim tried
them and they didn't help much, then suspicion would fall heavily on
limited network bandwidth as the culprit.
> As more dorms are wired, this problem will likely get worse. I have
> complained loudly about this, but unfortunately keeping the students happy
> is more important than facilitating our data needs. I have also been told
> that short of upgrading our network connection, there is little that can be
> done to reduce these latencies.
Yes, this is entirely correct. The real problem here is that for other
users, network congestion usually only means that it takes longer for
a web page to display or a file to download, whereas ldm administrators
face actual data loss, e.g., most of the grids missing from a model run.
There is a critical difference between data LOSS and data delay, and
getting the powers that be to recognize that we are impacted much more
negatively by this situation is not easy. As it is, although our network
connection upgrade is less than a year old, as more dorms are wired,
network utilization is back up over 90% during most daytime and evening
hours. I expect that by the end of the spring semester we will be facing
the need for another upgrade.
I wonder whether Unidata could come up with some recommendations for
network capacity for various data requests. It does no good to have the
fastest computer in the world if the network data is trickling through a
very narrow pipe. But I think this would be difficult to do, since it's
not so much the Unidata IDD that requires lots of bandwidth; it's all
those users in dorms and elsewhere who are sucking up most of the
available bandwidth.
> Anyway, I can't tell you why the MCIDAS feed (or the FSL2 feed for that
> matter) is different, but we do witness the same effect. If you come up
> with an answer, I'd be glad to hear it.
I'm not sure either. It probably has to do with the fact that although the
individual products on MCIDAS are large, the total number of products per
hour is trivial compared with WMO. Evidently limited bandwidth impacts WMO
much more than the other feeds.
One other thought: Jim mentioned he would like to turn off his request for
HDS but wondered about the impact on his downstream feeds. Maybe Jim
could ask Unidata to assign his downstream feeds to another site (if
there aren't too many of them). Then as a leaf node he could filter out
as much data as he wanted without affecting other sites.
Tom
------------------------------------------------------------------------------
Tom McDermott Email: tmcderm@xxxxxxxxxxxxxxxxxxxxx
System Administrator Phone: (716) 395-5718
Earth Sciences Dept. Fax: (716) 395-2416
SUNY College at Brockport