What's New with Unidata Internet Data Distribution (IDD)
Minor IDD changes and some information. October 1, 1998
- The y axis of the total bytes chart was increased from 40 to 80 gigabytes. The
average daily IDD distribution of data is over 60 gigabytes. On average,
105-108 sites send LDM statistics every hour. A count of the unique hosts on
the FOS routing page shows that ~142 sites are being fed by the IDD. An
Executive Summary of IDD Reliability is also available.
Revoked code for Satellite data filtered out in latency charts. December 8, 1997
- After studying the data again, we concluded that the satellite data should be included.
Satellite data filtered out in latency charts. December 5, 1997
- The satellite data may have been making the IDD appear to perform
better than it really was.
IDD feed charts processing changed. Now more accurate. November 6, 1997
- The irregularity of the charts was caused by sites in catch-up mode,
restarting, etc. The new processing takes the most frequently reported number
of products received across all sites. For example, if 50 sites received 1000
DDS products and 3 sites received 1100 DDS products, the new processing would
use the 1000 DDS product figure. The old processing would use the maximum,
the 1100 DDS product figure, which is not necessarily the correct representation
of the DDS feed. Code was also added to filter out satellite data that is not
sent to any IDD sites. This data was causing the WSI feed to be overstated
by a factor of 4.
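The selection rule described above can be sketched as follows. This is only an illustration in Python, not the actual chart-processing code; the function name `representative_count` and the per-site dictionary layout are assumptions made for the example.

```python
from collections import Counter


def representative_count(counts_by_site):
    """Return the product count reported by the most sites (the mode)."""
    tally = Counter(counts_by_site.values())
    # most_common(1) yields [(value, number_of_sites_reporting_it)].
    value, _sites = tally.most_common(1)[0]
    return value


# Hypothetical hour: 50 sites report 1000 DDS products, 3 sites in
# catch-up mode report 1100.
reports = {f"site{i}": 1000 for i in range(50)}
reports.update({f"catchup{i}": 1100 for i in range(3)})

new_value = representative_count(reports)  # most frequent count
old_value = max(reports.values())          # what the old processing used
```

With these sample numbers the mode is the 1000-product figure, while the maximum picks up the three catch-up sites.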
Anomaly of Max latencies June 2, 1997.
- The anomaly of high readings in the "Max IDD daily latencies" chart was caused
by a change in the statistical data filtering. From the beginning of February
to the end of May, the data was left unaltered. The filtering
process had included the elimination of sites that sent statistical messages
of "Down" or "None". In the old statistics, which were based on percentages,
"Down" and "None" could be factored into the overall percentages of
total products being received over the entire IDD system. Since the new
statistics are based on product latencies, this information is not relevant.
Another part of the filter was the elimination of multiple messages. It's
possible to receive the statistics more than once for the same time period if
the site is falling behind in receiving data. The latest message contains the
most recent statistical data and is the one to retain. Without this filtering,
the statistics overstated the numbers in the graph for this period. Because
the raw data is only kept for a week, it was impossible to recalculate the
charts. The period from the beginning of the graph to February was calculated
using Unix command line functions that included the necessary data filtering,
so the graph numbers are a realistic representation for that period.
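The two filtering steps described above might look roughly like this. It is a sketch under stated assumptions, not the original Unix pipeline: the message fields (`site`, `period`, `received`, `status`, `latency`) are hypothetical, and the "Down"/"None" rule is simplified to dropping the individual messages.

```python
def filter_stats(messages):
    """Drop "Down"/"None" reports; keep the latest message per site and period."""
    latest = {}
    for msg in messages:
        if msg["status"] in ("Down", "None"):
            continue  # not meaningful for latency-based statistics
        key = (msg["site"], msg["period"])
        kept = latest.get(key)
        # A later arrival for the same period supersedes the earlier one.
        if kept is None or msg["received"] > kept["received"]:
            latest[key] = msg
    return list(latest.values())


# Hypothetical messages; field names are illustrative only.
messages = [
    {"site": "a.edu", "period": "12Z", "received": 1, "status": "OK", "latency": 900},
    {"site": "a.edu", "period": "12Z", "received": 2, "status": "OK", "latency": 120},
    {"site": "b.edu", "period": "12Z", "received": 1, "status": "Down", "latency": None},
]
kept = filter_stats(messages)
```

Only the second a.edu message survives: the first is superseded and the b.edu report is excluded.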
IDD status as of Friday March 28, 1997.
IDD status as of Wednesday January 29, 1997.
- Announce that LDM latencies charts are now available on the web.
The routing reports now show the latest latencies instead of the
percentage of products received for the hour.
There are charts that show the current hourly and daily latencies with
a zoom showing the number of site-feed combinations having problems.
Other charts are available for interesting time periods, e.g., the early
January period with the large internet outage. The charts with the
product percentages have been moved to a separate page because they
were becoming skewed by complex site feeding arrangements.
IDD status as of Wednesday October 16, 1996.
- Announce that LDM4 protocol feeds will cease on December 15,
1996. This is to allow testing of new LDM5 functionality.
LDM4 protocols will no longer be supported as of this date.
IDD status as of Monday April 15, 1996.
IDD status as of Friday April 5, 1996.
- We now have 120 sites using the IDD.
IDD Status as of Tuesday February 13, 1996.
- LDM5 was released to the general Unidata community today.
IDD Status as of Monday February 12, 1996.
- The physical location for all the LDM statistics and
IDD topology has been moved from under the gopher directory
structure to /home/ftp/pub/idd . This move is in anticipation
of the demise of the UPC gopher, whose information is
outdated and no longer being updated; the current information is now
available via the UPC Web server.
IDD Status as of Monday January 29, 1996.
- The disk which stores the ldmstats data was failing, and in the
process of transferring the data to a different disk, some
of the data was lost. The actual amount of data lost is between
2 and 3 hours for some sites. This resulted in the LDM statistics
reports and charts being skewed for Saturday and Sunday, the 27th
and 28th.
IDD Status as of Friday January 12, 1996.
- The FOS and McIDAS topology routing reports are
again working with the installation of LDM5 beta release 23.
The reports are not complete because some sites are using
the short hostnames and others are not sending LDM statistics
to UPC.
- The DDS, PPS, DDPLUS entries in the text reports have been
combined into one entry, DDPLUS, because it is now possible to
receive the same feedtype from different sources simultaneously.
IDD Status as of Wednesday January 10, 1996.
- Yesterday's LDM5 beta release 23 is being installed
at many relay sites today. This release has
the new facilities needed to reinstate the
routing topology tables on the Unidata
WWW server, so you should be able to determine
how data are being routed once again from
those charts.
- Changes are being made in the IDD statistics
processing system, so that sites with redundant
IDD and Ku-band ingest (e.g., LSU and thelma
at UCAR) will correctly report when 100% of
the DDS and PPS products are captured.
For some time now these sites have been
reporting less than 100% when in fact the
total of DDS, PPS and DDPLUS (From Ku band)
actually added up to all the products.
IDD Status as of Thursday December 21, 1995.
- The LDM5 beta release 22 continues to perform
well at nearly all test sites. There are still a few
reported problems of intermittent failures
on a couple of platforms. Currently the UPC
is trying to duplicate those problems on
those platforms in house in order to track
down the cause.
- There are a few things to be aware of
regarding the statistics maintained
by the UPC. The LDM 5 reports stats
differently from the LDM 4. Some of
these differences "broke" the system
which generated the chart which showed
which sites were feeding which others
and what percentage of the data were
captured at each site. We are working
on updating that system and restoring
the table on the www server, but it
may take some time.
- LSU and UCAR's thelma are now taking in
DDPLUS data from their Ku feeds as well as
DDS and PPS from the IDD. Soon Cornell
will be doing so as well. The LDM 5 rejects
duplicate products automatically so this is
a handy way to gain some added backup
for those data feeds. However, this approach
makes it appear that not all the products
are being captured when in fact they are.
Here again, we are working on a fix, but
in the meantime, just keep in mind that
those sites (and sites downstream) will
be reporting lower percentages for those
feeds when they are actually getting 100%
of the data.
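To illustrate why redundant ingest lowers the per-feed percentages even though nothing is lost, here is a toy model of first-copy-wins duplicate rejection. It is not LDM code: the `ProductQueue` class is hypothetical, and SHA-256 is used only as a convenient content hash for the sketch, since this log does not say how the LDM 5 identifies duplicates.

```python
import hashlib


class ProductQueue:
    """Toy queue that keeps only the first copy of each product."""

    def __init__(self):
        self._seen = set()
        self.accepted = {}  # source name -> number of products accepted

    def insert(self, data, source):
        signature = hashlib.sha256(data).digest()
        if signature in self._seen:
            return False  # duplicate: a copy already arrived from some source
        self._seen.add(signature)
        self.accepted[source] = self.accepted.get(source, 0) + 1
        return True


queue = ProductQueue()
products = [b"bulletin-1", b"bulletin-2", b"bulletin-3", b"bulletin-4"]
# The Ku-band copy arrives first for one product, the IDD copy for the rest.
queue.insert(products[0], "Ku")
for p in products:
    queue.insert(p, "IDD")
for p in products[1:]:
    queue.insert(p, "Ku")

idd_percent = 100 * queue.accepted["IDD"] / len(products)
total_percent = 100 * sum(queue.accepted.values()) / len(products)
```

In this made-up case the IDD feed is credited with 75% of the products, yet the two sources together account for all of them.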
IDD Status as of Wednesday December 20, 1995.
- Yesterday was the first full day running the
LDM at Alden (and at a few of the relay sites)
with Glenn's revised code for handling the
product queue. The results showed a dramatic
improvement in data delivery--especially in
reduced delays in the system. It looks
like the major difficulty with the LDM 5 has
been addressed. Latencies today continue to
be less than a few minutes at the sites
we have been monitoring.
- Thanks to all the sites participating in the
LDM 5 test and for the patience of those
who endured the difficulties of the
period. Special thanks go to Harry Edmon
at Washington, Peter Neilley at RAP, and
Jim Cowie at COMET who provided Russ Rew
with the evidence needed to pinpoint the
problem. Glenn's quick solution seems
to have done the trick.
- At the UPC, we plan to utilize what we learned
from this test to improve future testing if
possible, but unfortunately some of the
problems simply don't show up except under
live conditions in the field.
IDD Status as of Monday December 18, 1995.
- The LDM 5 beta test continues. Observations
by Harry Edmon at Washington, Jim Cowie at
UCAR COMET, and Peter Neilley at NCAR RAP
indicate that the LDM 5 is slower at inserting
products into the queue as the queue grows
larger. Russ Rew has found areas where that
part of the LDM 5 code might be made more efficient.
While it may take some time to implement and test
the needed changes internally, it is being worked on at
high priority within the UPC.
- A few sites are providing www files which show the
latencies of products arriving at their LDM.
Rob Cermak makes available the latency information from the
latest Rutgers stats file.
At Hawaii, Steve Chiswell created a history of
Hawaii's latency tables and charts.
At Cornell, Bill Noon has a series of latency charts
with the date incorporated into the name of the
URL. Take a look at today's Cornell latencies.
All these charts and tables are updated
at least hourly. Most are updated every few
minutes.
IDD Status as of Thursday, December 14, 1995.
A field test of the beta version of the LDM 5 has been underway
at some of the relay sites as well as at Alden and SSEC for several
weeks now. At this point most of the top tier relays are
running the LDM-beta20 version of the code. The following
observations can be made at this point:
- The LDM 5 works differently from the LDM 4 in
that the LDM 4 would reset to the head of the
product queue when it got 1 hour behind in
delivering data. Thus the downstream
nodes would lose an hour's data. The default behavior for
LDM 5 is that some products are lost
after it gets behind an hour, but it does
not reset to the head of the queue. The
system does the best it can until it catches up.
- Delays in getting products out to IDD nodes
appear to have increased during the test
period. Some of this may be due to the
different behavior of the LDM 5, some
of it due to network problems, some due
to increased load on relay machines, and we
continue to investigate the LDM 5 itself
to determine what effect the new architecture
is having on product latency.
- Two problems with the LDM 5 were uncovered
in the field test and fixed.
- Several sites are actively working on tests
to clarify the situation. Bill Noon at
Cornell has done comparisons of LDM 4 and
LDM 5 at his site. Dave Wilensky at LSU
is using a new feature available in the
LDM 5 which allows him to ingest data from
the IDD and from his Ku satellite and automatically
use the first copy of a product to arrive.
Harry Edmon is actively checking into the
performance of the underlying Internet.
David Wojtowicz at Illinois is helping to sort
out problems with the HP port of the beta
code.
NCAR RAP and COMET staff are investigating
the LDM 5 in an environment where the
underlying network bandwidth is very high.
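The difference between the LDM 4 and LDM 5 catch-up behavior noted in the first observation above can be pictured with a small model. This is a loose interpretation for illustration only: the function names, the backlog representation, and the exact cutoff rule are assumptions, not the LDM's actual logic.

```python
ONE_HOUR = 3600  # seconds


def ldm4_style(backlog, now, max_lag=ONE_HOUR):
    """Reset to the head of the queue: once the oldest unsent product is
    more than max_lag old, the whole backlog is skipped."""
    if backlog and now - backlog[0][0] > max_lag:
        return []
    return list(backlog)


def ldm5_style(backlog, now, max_lag=ONE_HOUR):
    """No reset: only products older than max_lag are given up on."""
    return [(t, name) for t, name in backlog if now - t <= max_lag]


# Hypothetical backlog of (insert_time_seconds, product_name), oldest first.
backlog = [(0, "A"), (1800, "B"), (4000, "C")]
now = 4500

sent_ldm4 = ldm4_style(backlog, now)
sent_ldm5 = ldm5_style(backlog, now)
```

With this backlog the reset-style sender delivers nothing from the backlog, while the other still delivers products B and C.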
This field test was preceded by extensive internal testing.
Some of the problems simply don't show up until the system
is placed in the field. Of course that requires patience
on the part of the community using the data. We understand
your needs for reliable
and timely data delivery and continue
to pursue that high priority objective.
--Ben
IDD Status as of Monday, November 27, 1995.
- IDD participation is up to 104 sites. Sites have been
contacted and are responding to the plan for the transition to total
(including the Unidata/Wisconsin satellite broadcast) Internet Data
Distribution by Dec 31, 1995.
IDD Status as of Thursday, November 9, 1995.
- E-mail reminders were sent this week to sites not yet participating in
the IDD.
- Several responses have been received bringing the total
participation to 101 sites. See Deployment Status
IDD Status as of Wednesday, October 25, 1995.
- We have been in the process of reconfiguring the IDD topology. That
effort has now been put on hold. The reasons are a shortage of second
tier relay sites and continuing problems on the internet backbone.
The latter is causing connectivity problems with various IDD nodes,
requiring feed changes to remedy the situation. This is causing mass
confusion and any attempt to reconfigure the topology on a large
scale is almost impossible. Once the internet backbone issues settle
down and other sites become available for upgrading to relays, we
will go back to the reconfiguration effort.
IDD Status as of Friday, September 8, 1995.
- A start has been made on a comprehensive
Site List. This list is under construction
and will be completed as the topology reconfiguration is implemented. At
present, it only has information on the data source sites and the first
tier relay nodes.
- The topology reconfiguration is proceeding, although it is taking
longer than expected. Beyond the testing and evaluation of potential
sites, there are problems that have to be corrected that only show up
as the reconfiguration is put in place. This is delaying the overall
schedule, although we are making continual progress. So far, the
first tier sites are in place and we are in the process of moving the
leaf nodes that get fed from them on to second tier sites.
IDD Status as of Wednesday, August 30, 1995.
IDD Status as of Friday, August 25, 1995.
- Topology Reconfiguration
- Oklahoma has been moved to the top tier
- Millersville has been moved to the 2nd tier
- Comet and Unidata have been moved off of the top tier
IDD Status as of Tuesday, August 15, 1995.
All hostnames are converted to lower case for all LDM statistics reports and charts. There is a new long term report for the period 11/01/94 to 08/15/95.
IDD Status as of Monday, August 14, 1995.
- Currently 95 sites participating in the IDD.
IDD Status as of Wednesday, June 28, 1995.
- Obtained status from 20 sites. IDD Deployment status and Unidata
sites lists have been revised.
IDD Status as of Friday, June 16, 1995.
- To date contacted 12 sites to obtain status and plans for IDD
participation and provide notification of the cessation of the
Unidata/McIDAS satellite broadcast on December 31, 1995.
IDD Status as of Friday, May 26, 1995.
- We have created a strategy for contacting
non-participating IDD sites.
IDD Status as of Tuesday, May 23, 1995.
- We have created a plan and timeline for implementing the
IDD topology reconfiguration.
With the NSFnet transition completed, it is now time to start
this process.
IDD Status as of Friday, May 19, 1995.
- Consolidated the web pages that have links to the various IDD site
logs into a single web page.
IDD Status As of Friday, May 12, 1995.
- Alden changed their system for feeding data from the
higher speed HRS feed into their LDM and this appears
to have fixed the problems with missing HRS data
that surfaced at the switchover to the higher
speed last week.
- We have determined that the radical change in
late April in the WWW chart showing the daily percentages
of data that sites receive is due to a change in
the way the sites not reporting data are handled.
This corrects a problem that was introduced in
January. The current system accurately shows the
actual percentages sites are receiving on a daily
basis.
IDD Status As of Friday, May 5, 1995.
- The feed number and byte charts have been modified to
only contain the designated feeds in the html tag. Also the
WSI(NIDS) feed was given its own charts because of the size
of the feed. The long term Cumulative and Daily Percentages(text)
report was modified to report more accurately.
- Alden determined that a cable in the system
receiving the HRS stream was not able to handle
the new 57.6 kb/s stream and some of the products
were being lost and truncated. A new cable was
installed this morning.
IDD Status As of Tuesday, May 2, 1995.
- Alden switched to high speed (57.6 kb/s) feed for
HRS data stream for the IDD. The old 19.2 stream is
going out over the Ku band satellite feed, so there is
a difference between the two streams as of today.
In particular, none of the RUC products are going
out on the Ku satellite feed.
- Modified the ldmstats script to display the number of
sites that are not receiving data. The last code change expanded
NO_DATA to the actual feeds the sites were receiving, and therefore
the NO_DATA sites were not counted.
IDD Status As of Sunday, April 30, 1995.
- Today is the deadline for the switchover from the old
NSFNet backbone. If everything went according to plan,
there should be no data flowing on the NSFNet backbone
as of tomorrow. There have been problems with IDD
reliability over the last several months, but the system
has been improving gradually since the low point in
February. Of course, we have to continue to monitor
it closely as the commercial Internet service providers
adjust and modify their networks. It seems the new
tools at IDD sites will be a help in this regard.
- Another reminder that IDD nodes should consider applying
for NSF Equipment grants. The announcement appeared in our
most recent newsletter and is referenced in the WWW copy of
the newsletter with a pointer under "Other News" on the
Unidata homepage. Note that contributions to the IDD,
such as serving the community as a relay node, alternate
source site, or in some other capacity are among the
criteria for evaluation. If your IDD hardware is
overburdened, get in your proposal to NSF by the
June 1 target date.
IDD Status As of Friday, April 28, 1995.
- Added the UTC time stamp to all the Feed Routing Reports.
- Added a pre ldmstats script to expand the NO_DATA feedtype to the actual feeds.
IDD Status As of Wednesday, April 26, 1995.
- Added a page for the IDD Site Netcheck Logs.
IDD Status As of Thursday, April 13, 1995.
- Added DIFAX Routing report with reception percentages.
IDD Status As of Tuesday, April 11, 1995.
- The FOS, McIDAS, and NLDN Routing reports now have the reception
percentages appended to the site name. The sites with no stats
reports are tagged with "unknown %". Work is in progress to create
a DIFAX routing report with the same information.
IDD Status As of Monday, April 10, 1995.
- The hourly text reports now process the DIFAX feed type
percentages on a per-site basis as well as the overall percentages
at the end of each hour. The Feed charts, both hourly and daily, will
now resume capturing the DIFAX product numbers and bytes. This
is the result of sites installing the latest LDM release 4.1.42, which
provides the byte information in the ldmstats reports. The list
of active sites changed: glory.tamu.edu is now inactive, and
cheshire.cat.syr.edu is now an active level 2.
IDD Status As of Wednesday, April 5, 1995.
- You may have noticed that the IDD performance
charts on our WWW are not being updated. Unfortunately
Robb is out sick and we have been unable to figure
out why the scripts which generated the plots are
failing.
IDD Status As of Tuesday, April 4, 1995.
- IDD performance over the weekend suffered
in certain regions because two relay nodes went
down. As of yesterday, Monday, morning, those
relays were back up and the IDD has continued
to be reasonably reliable since then.
- We got quite a bit of feedback from the
ldm-usrs mailing list regarding the possibility
of having groups of sites work in teams to
help each other out in monitoring their IDD
systems. Several sites apparently are already
beginning to work together in that fashion. Others
have to deal with departmental and security issues
before they could participate. There were some other
suggestions for alternative mechanisms that would
improve LDM administration. Some of these have
been incorporated into our UPC development plans.
IDD Status As of Thursday, March 30, 1995.
IDD Status As of Saturday, March 25, 1995.
- "It was the best of times; it was the worst
of times" as Charles Dickens so aptly put it.
Or was it Shakespeare? Geraldo???
Whatever!!! After setting performance
records on Wednesday, the system stumbled
badly Friday due to serious problems in
MCI backbone routers that lasted for
several hours. The failures were such
that the LDM at Alden could not get data
to top-level relays for over an hour at
a time so the automatic time out and
queue reset occurred several times.
This was a widespread outage so it's not
clear whether having an alternate source
site (one of the sites receiving data
via Ku, for example) would have helped
much. However, we're still looking for
sites who receive data via Ku to act
as alternate FOS source sites when there
are network outages near Alden.
IDD Status As of Wednesday, March 22, 1995.
- Yesterday was the best day ever in terms of
the IDD getting more than 99% of the products
to a large number of sites. This occurred
in spite of one case of an ingester dying
on the Alden machine. We anticipate
additional improvements when we fix the problem
causing intermittent restarts at Alden.
However, it would not be surprising to encounter
additional network problems now and then since
the deadline for completing the switch away from
the NSFNET backbone is the end of April.
IDD Status As of Tuesday, March 21, 1995.
- Overnight the IDD performed as reliably as it did in
the good ol' days of last December. As usual though,
it'll be important to keep an eye on the system
as the network load increases during the day.
- We'll be updating our IDD site lists to include
some of the new relays, so you'll probably see
an increase in the number of relays in the charts
and possibly a decrease in the number of relays
getting 100% reliable delivery, since some of the
newer relays are not completely reliable yet.
IDD Status As of Monday, March 20, 1995.
- Several of last week's problems have been
taken care of, but a few new ones popped up
over the weekend.
- Data flow into Rutgers
appeared to be solid on Friday and over the
weekend. Rob Cermak is back and will investigate
why things went bad and also why they improved
on Friday.
- Texas A&M went out over the weekend. A post
mortem will take place there after the system is
restarted.
- At 15 UTC, all the relays are back up and
running. Now we'll see how things progress as
the Internet load increases later in the day.
IDD Status As of Saint Patrick's Day, March 17, 1995.
- Please, everyone knock on wood this morning.
Not that we're superstitious, but the IDD did settle
down overnight. The machine switchover at Illinois
is complete. It also appears that delivery to the
relays at Rutgers and LSU has improved, but it remains
to be seen whether that holds throughout the typical
Friday afternoon network congestion.
- Sure enough, the delivery problems to LSU are
beginning to resurface as the day progresses.
IDD Status As of Thursday, March 16, 1995.
- Many thanks to Steve Finley, Dave Wilensky, and
other Ku-band sites who offered to make their
data available to sites who had missed the data
because of the IDD outage.
- The system at Alden settled down after we backed out
of the debugging code we had in there. There were no
subsequent restarts there overnight, and we are
investigating ways to create a stress test environment
without using Alden. Karen is also working on getting
their systems up to the Solaris operating system as
soon as possible.
- However, there were problems with data delivery
overnight. They were mainly due to an outage at
the University of Washington relay and apparently
a failure of some downstream nodes to reconfigure
to use data.atmos.uiuc instead of
- Continuing network problems between Michigan/Merit
and Rutgers as well as between Alden and LSU are being
pursued with the network providers in those areas.
IDD Status As of Wednesday, March 15, 1995.
- Last night starting at 3:37 UTC, we experienced the worst
Family of Services (FOS) data outage since we started
the IDD deployment. A number of factors contributed to
the problem, but the main reason that the outage lasted
so long was a decision by yours truly, Ben Domenico,
to leave some of the Alden IDD monitoring and restart
systems turned off so we could track down a problem
that was requiring LDM restarts with some data loss
every few days. This was a mistake which unfortunately
caused major problems for Unidata IDD users. My
sincere apologies.
- The University of Illinois switched machines from
wx3.atmos.uiuc.edu to data.atmos.uiuc.edu. Since this
is a top-level relay, it's taking a while for all
the downstream nodes to adapt and for the statistics
programs to gather the data on reliability of the
newly configured system.
IDD Status As of Tuesday, March 14, 1995.
Three additional relays were added to the list which is
used to create the charts which show reliability of the
relays. As it turns out these were some of the less
reliable relays recently, so those charts may show
worse overall performance than before.
IDD Status As of Thursday, March 9, 1995.
Robb made several significant changes to the performance
statistics reporting. Some of these will affect the
appearance of the long term charts.
- Some sites using the new LDM to capture data
off the Alden Ku-band system had been included in
the IDD statistics tables and charts. These are no
longer included, so it will appear that fewer sites
are participating.
- The numbers of DIFAX and WSI NIDS products are now
included in the tabular charts. Since we don't have
a mechanism in place for determining how many products
were sent from the source, no percentages can be
calculated yet. Also note that the routing topology
is different for these products than for the FOS and
McIDAS.
- The long term charts were becoming unreadable
because there were too many days to include in one
chart. These are being broken up into quarterly
increments, so you'll have to view more than one
chart to get a picture of IDD performance since
the beginning.
Brief updates on problems we are tracking:
- The statistics gathering program at Alden
continues to "hang" every now and then. The LDM
continues to run fine and deliver data, so we've
instrumented pqbinstats at Alden and will continue
monitoring it.
- Transmission between Alden and LSU is failing
fairly often. This is being tracked, but it may
require a change in routing.
- A more sophisticated warning system at Alden is
resulting in far fewer LDM restarts and a more
reliable IDD overall.
IDD Status As of Tuesday, March 7, 1995.
There was a major network outage at NEARNET this afternoon
between about 17:30 UTC and 18:45 UTC. Since it cut
off Alden completely from the rest of the Internet
for over an hour, all IDD sites lost FOS data
during the outage. On the other hand, the LDM at Alden
behaved as expected. It tried for an hour to send the
data in the queue, gave up at that point, but then
started sending data again as soon as the sites reconnected
after the network connection was reestablished.
IDD Status As of Monday, March 6, 1995.
- Robb and Mitch have been working with Karen
Wallace at Alden to troubleshoot the system there
which has many ingesters bringing data in
and many downstream nodes receiving the feeds.
One positive change resulting from this is a
better mechanism for letting the Alden
operations staff know when to restart the system.
Also, the restarts now force a deletion of the
queue which takes care of problems with
the queue being corrupted. This has resulted
in a more stable system at Alden, but still
leaves us attempting to determine the cause of
ingester failures and corrupted queues.
- The UPC systems staff has set up a test
IDD at Unidata with many ingesters to try and
duplicate the infrequent, but troublesome problems
with corrupted queues and failed ingesters.
Hopefully the test system will allow us to track
down the cause of those problems without
interfering with the operational IDD node at
Alden.
IDD Status As of Wednesday, March 1, 1995.
Alden changed their LDM configuration file so the queue
gets deleted when the LDM is restarted. This should
do away with the problems of queue corruption after
restarts. It will be useful to monitor the delivery
statistics to see if this makes a difference in the
effectiveness of delivery to the top-level relays.
IDD Status As of Friday, February 24, 1995.
- Over the last several weeks, we've noticed
intermittent degradation of the IDD reliability. At the
time, we ascribed it to major changes in the Internet
structure. After delving more deeply, however, it appears
that there is more to it than that. We're continuing to
diagnose the problem and it's slow going because it shows
up at random times and places in the topology. But, if we
can detect a pattern, we may be asking some of the relay
nodes to gather more verbose logging information and
possibly put in other diagnostic probes to help us localize
and solve the problems.
- On further investigation, one cause of the
difficulties was uncovered today. The Alden LDM
was periodically overrunning its queue. Apparently
the increased data flow requires a larger queue.
They've upped the queue size and restarted the LDM
this morning. Administrators at relay nodes should
also be checking to make sure their LDM queue sizes
are adequate. An email note will go out to all
IDD sites indicating how to monitor queue usage.
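As a rough feel for how queue size scales with data flow, the arithmetic below estimates the bytes needed to hold a fixed window of a single constant-rate feed. It is a back-of-the-envelope sketch, not a sizing recommendation: the helper `queue_bytes_needed` is hypothetical, the one-hour window is an assumption echoing the one-hour timeout mentioned elsewhere in this log, and the 19.2 and 57.6 kb/s rates are the HRS stream speeds from other entries, used here only as sample inputs (taking 1 kb as 1000 bits and ignoring protocol overhead and all other feeds).

```python
def queue_bytes_needed(feed_bits_per_second, retention_seconds=3600):
    """Bytes required to hold retention_seconds of a feed at a constant rate."""
    return feed_bits_per_second * retention_seconds // 8


slow = queue_bytes_needed(19_200)  # 19.2 kb/s stream: 8,640,000 bytes
fast = queue_bytes_needed(57_600)  # 57.6 kb/s stream: 25,920,000 bytes
```

Tripling the feed rate triples the space needed for the same window, which is the kind of growth that can overrun a queue sized for the older data flow.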
IDD Status As of Monday, February 6, 1995.
Several significant problems cropped up for
the IDD during the last week. There were
a number of causes:
- a portmapper problem at UCLA whose
cause is still unknown but under investigation.
- a corrupted queue on the LDM at Alden which
has been fixed.
- changes in the Internet to accomplish
the transition away from the NSFNET backbone.
Last week NEARNET was attempting to switch to
a new carrier.
- difficulties related to the
storm itself--mainly the fact that NEARNET
still depends on microwave links for some
of their network connections. These links
degrade during severe weather and affect the
data transmission out of Alden and WSI. NEARNET
has a plan for systematically replacing these
microwave links, but it won't happen immediately.
In the meantime, we are working to establish an
alternative FOS source site from among those
sites who receive the data via Ku band.
- several of our relay sites also run
WWW weather servers. The demand on these
servers increases dramatically during
severe storm situations. There is some
suspicion this may adversely affect the
delivery of the raw data. It may be
necessary to work together as a community
to increase the number of such weather
servers so the demand is spread out more
evenly.
In summary, there are still substantial difficulties
to work out for the data distribution system, but they
are being worked on.
IDD Status As of Wednesday, February 1, 1995.
- NEARNET is in the process of switching over
from the ANS (NSFNet backbone) connection
to MCI service. This affects service out
of both Alden and WSI and may be responsible
for overall degradation in performance over
the past week.
- Data losses at the Washington relay caused
most of the West Coast sites to miss some
data overnight last night. The problem
apparently corrected itself and has not
recurred. The cause of the problem is
still unknown.
IDD Status As of Tuesday, January 24, 1995.
- In re-examining the system which gathers and
archives the IDD performance statistics, Robb
discovered that the daily counting system
changed somewhere in the latter half of December,
so the early December statistics show more sites
getting 100% of the data than in more recent days.
Unfortunately we don't have a way to recover
the more accurate information for the earlier period.
We are going to leave the charts as they are because
they do correctly show how the number of sites increased
over the implementation period.
- It's worth noting that the IDD reliably delivered data
from the Internet sources to the Unidata exhibit over
the temporary network connection set up for the
AMS convention at the Anatole hotel in Dallas.
IDD Status As of Wednesday, January 18, 1995.
- The main server machine at the Unidata
Program Center failed this morning. Consequently
some of our statistics plots are not available.
Please bear with us while Mike Schmidt flies
back from the AMS meeting to help get our
server back up and running. Note that this
is not affecting the performance of the IDD
itself. The data are still flowing, but some
of the reliability plots are not available.
IDD Status As of Monday, January 16, 1995.
- Not much change in the IDD performance over the
last week. The extended holiday weekend does
not seem to be having a significant effect
either.
- Many of the UPC staff are down in Dallas for
the AMS convention. If you look into the
routing topology files, you'll see the
extra system, ssec-ams2.amsmeeting.org.
That's the LDM in the exhibit booth
at the convention.
- Robb continues to improve the graphical displays
of system performance. Providing a concise and
accurate representation of how well the IDD is
doing has turned out to be a more challenging
task than anticipated.
IDD Status As of Monday, January 9, 1995.
- The IDD appears to have reached a "steady state"
with about 50 sites participating and somewhere
between 40-45 sites getting every data product in
any given hour.
- Besides having worked out the major kinks in
LDM software and adjusted to the idiosyncrasies
of the changing Internet, the IDD participants
have developed a set of operational procedures
for adapting to scheduled and other interruptions
in service from upstream sites.
- Given everything we have learned about the LDM,
the Internet, and the participating relay sites,
we are now planning to revise the routing topology
and then begin bringing online some of the sites
which postponed implementation for various reasons.
IDD Status As of Tuesday, January 3, 1995.
- Robb has put together a set of programs which give a
graphical snapshot of IDD performance. These
charts now provide hourly and daily information regarding
how many sites are running and how reliable the
delivery is.
- Last 24 Hours Histogram
- Last 24 Hours Percentage
- Daily Histogram
- Daily Percentage
- As the charts show, delivery over the holiday period
continued to be very reliable for about 40 sites.
Power and air conditioning problems at Texas A&M kept
their machine offline for a significant period.
- The feed into New Mexico Tech still seems to be
less reliable than most others--even with reduced
network traffic over the holiday. This warrants
a specific check.
IDD Status As of Monday, December 19, 1994.
60 sites are running and reporting statistics
7 are running, but not reporting stats yet.
20 have installations underway (3 are OS/2 only).
- We've gotten quite a bit of feedback from our test
and deployment sites, giving their impressions of
the IDD as it is now running. The email messages
are logged in Comments
from IDD User Sites. Overall the comments
are positive and we continue to work on the
problems noted in the messages.
- Be aware that the counts are now individual sites.
We're attempting to count sites rather than machines
even though some departments are running the LDM and
reporting statistics for several machines in the
same department. If you're interested in the
status of an individual site, you can check in
the IDD Site Deployment
Status.
IDD Status As of Monday, December 12, 1994.
57 sites are running and reporting statistics
9 are running, but not reporting stats yet.
21 have installations underway (3 are OS/2 only).
- Be aware that the counts are now individual sites.
We're attempting to count sites rather than machines
even though some departments are running the LDM and
reporting statistics for several machines in the
same department. If you're interested in the
status of an individual site, you can check in
the IDD Site Deployment
Status.
- Four of the sites in the above list are source sites.
That means that 62 sites are now receiving FOS data
via the IDD. Hopefully we can get the other
21 running by the end of the year which would
give us a total of 83 sites receiving data.
- Note that most of these sites are receiving nearly
twice the volume of data they had been receiving via
C-band satellite because they are now acquiring
the HRS datastream.
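The site totals quoted in this entry can be checked with a few lines of arithmetic. This is only a sketch: the numbers are copied from the December 12 summary above, and the variable names are illustrative rather than anything used by the UPC.

```python
# Counts copied from the December 12, 1994 summary above.
reporting = 57       # running and reporting statistics
not_reporting = 9    # running, but not reporting stats yet
installing = 21      # installations underway
source_sites = 4     # sites that originate data rather than receive it

running = reporting + not_reporting          # all sites currently running
receiving_now = running - source_sites       # sites receiving FOS data via the IDD
receiving_goal = receiving_now + installing  # total if the remaining sites come up

print(running, receiving_now, receiving_goal)
```

The last two values match the 62 and 83 sites quoted in the entry.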
IDD Status As of Monday, December 5, 1994.
89 IDD machines we are trying to get running at 74 LDM sites.
62 are running reasonably reliably.
51 are reporting statistics.
- Rather than try to keep track of sites in terms of which
specific queues they were in for last month's rapid
deployment, we are now just listing the numbers of nodes
we are trying to get running, the number running and
the number reporting statistics. Note that these
are LDM sites. The OS/2-only sites are being handled
separately.
- So far we haven't identified any sites which are
in dire need of data that are missing out because
their IDD system is not running yet. If there are
such sites, they should contact support@unidata.ucar.edu
as soon as possible.
- On the OS/2 front, there is some extremely encouraging
news. Steve and Tom have managed to get a special OS/2
LDM module running that enables an OS/2 system to
capture the McIDAS data from an upstream site which
is relaying the data using the LDM. With this OS/2
module, there is no need for the upstream node to
run any special software other than the LDM.
Currently the module only runs on OS/2 systems
with the IBM TCP/IP package. We are working with
FTP, Inc. to get an upgrade to their package so
the module will run with their software as well.
In the meantime OS/2 sites using FTP, Inc. software
continue to be fed reliably using mctingest.
- Linda Miller has put a complete Unidata site list into
the Unidata Web server in the IDD part of the documentation.
- Robb has a new package that creates a text version of the
IDD routing topology--updated by the new statistics
gathering package. As soon as enough sites implement
the new stats package, the resulting routing topology
files will be made available on the Unidata Web server.
- Two major problems over the weekend caused the statistics
reporting to fail. One was a severe network outage at
ANS sites. This lasted about 2 hours and cut off
our data sources at Alden and WSI. Also the statistics
reporting program failed on the Alden computer so
many hours of statistics were lost for the Family of
Services even though the data were flowing reliably.
NCB Day. THIS IS IT!! (or is it?)
Status Green: Thursday, December 1, 1994.
IDD Deployment Site Status Summary:
20 reporting statistics.
7 running, but NOT delivering statistics.
15 working on LDM/IDD or Mctingest implementation.
6 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
0 awaiting UPC installation.
2 haven't responded to email and phone calls.
- Well. True to form the big target date arrives and the
statistics reporting system fails at our primary data source
site. That will be rectified and we'll get the
statistics reports updated later in the day.
- The updated hourly statistics reports are summarizing
how reliably the overall system is performing at the
end of each hour. One problem we have is that nearly
20 sites that are up and running (or were at one time)
are not sending in statistics reports. It would be
very helpful if those sites would get their statistics
reporting systems running or let us know that the
systems are off line so we can remove them from
the reports.
- Independent hourly checks of the machines which are
supposed to be running the LDM as part of the IDD
indicate that about 65 machines are now up and
running. This is up from about 50 a week or so
ago. Due to network problems and the fact that
some sites are still configuring their systems,
the number fluctuates from hour to hour, but
that's a pretty accurate measure of how many
LDMs are gathering data via the IDD.
- This report will soon evolve to a different form--perhaps
updated less often and treating the system as a whole
rather than focusing so much on the sites in the
rapid deployment queues.
NCB Day minus 0
Status Green: Wednesday, November 30, 1994.
IDD Deployment Site Status Summary:
20 reporting statistics.
5 running, but NOT delivering statistics.
17 working on LDM/IDD or Mctingest implementation.
6 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
0 awaiting UPC installation.
2 haven't responded to email and phone calls.
- Groucho, the main network server machine at Unidata
was down most of the night. As a consequence most
of the statistics reports were missing this morning.
If Robb's stats program works as planned,
this will correct itself as the day goes on and the
backed up mail is delivered to groucho.
Thanks for your patience.
- Early reports from the New York State sites indicate
that the reconfigured routing in that region is resulting
in much more reliable data delivery to all those sites.
NCB Day minus 1
Status Green: Tuesday, November 29, 1994.
IDD Deployment Site Status Summary:
18 reporting statistics.
4 running, but NOT delivering statistics.
19 working on LDM/IDD or Mctingest implementation.
6 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
0 awaiting UPC installation.
3 haven't responded to email and phone calls.
- Not much change in the overall system. Performance
went down somewhat yesterday due to network congestion
returning after the holidays. As we continue to work
around weak points in the network, these problems
should gradually lessen.
- As noted above, there are still quite a few sites
not running yet, but we have been in touch with most
of them and they are working on it.
- Substantial rerouting is being done for the
New York State sites today. We hope this will result
in significantly better reliability.
NCB Day minus 2
Status Green: Monday, November 28, 1994.
IDD Deployment Site Status Summary:
17 reporting statistics.
4 running, but NOT delivering statistics.
18 working on LDM/IDD or Mctingest implementation.
6 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
1 awaiting UPC installation.
4 we haven't heard from yet.
- Not much change in the overall system.
The sites reporting statistics received 99.5%
of the products on the average. This was followed
by two days at 98.8%. Today we'll be back to
more typical weekday network congestion, so
we'll see what results.
NCB Day minus 3
Status Green: Sunday, November 27, 1994.
IDD Deployment Site Status Summary:
17 reporting statistics.
4 running, but NOT delivering statistics.
18 working on LDM/IDD or Mctingest implementation.
6 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
1 awaiting UPC installation.
4 we haven't heard from yet.
- Not much change in the overall system from
yesterday. The CDC
relay came back online and its downstream
nodes are back as well.
- UCLA statistics reports indicate a data delivery
problem there.
NCB Day minus 4
Status Green: Saturday, November 26, 1994.
IDD Deployment Site Status Summary:
17 reporting statistics.
4 running, but NOT delivering statistics.
18 working on LDM/IDD or Mctingest implementation.
6 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
1 awaiting UPC installation.
4 we haven't heard from yet.
- Yesterday a few LDM systems dropped out of the statistics,
but the remaining 55 again reported receiving
99.5% of the products sent from the sources on
the average.
- It appears that the relay node LDM at the
NOAA Climate Diagnostics Center went down and
took the downstream nodes at CU Boulder and
the University of Wyoming with it.
NCB Day minus 5
Status Green: Friday, November 25, 1994.
IDD Deployment Site Status Summary:
17 reporting statistics.
4 running, but NOT delivering statistics.
18 working on LDM/IDD or Mctingest implementation, including
4 recent additions.
6 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
1 awaiting UPC installation.
4 we haven't heard from yet.
- While most of you were off enjoying a Thanksgiving
celebration, 58 selfless, hardworking
LDM systems around the
country were delivering weather data over
uncongested IDD network connections. On the
average, sites reporting statistics received
99.5% of the products sent from the sources.
NCB Day minus 6, Have a Happy Holiday
Status Green: Thanksgiving, November 24, 1994.
IDD Deployment Site Status Summary:
17 reporting statistics.
4 running, but NOT delivering statistics.
18 working on LDM/IDD or Mctingest implementation, including
4 recent additions.
6 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
1 awaiting UPC installation.
4 we haven't heard from yet.
- Happy Thanksgiving. So far things are running
well with the "holiday staff" in charge.
- Alternate routing seems to be working--even
during the afternoon crunch yesterday--for LSU
and for the New York area.
NCB Day minus 7
Status Green: Wednesday, November 23, 1994.
IDD Deployment Site Status Summary:
16 reporting statistics.
5 running, but NOT delivering statistics.
16 working on LDM/IDD or Mctingest implementation, including
two recent additions.
6 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
2 more added to queues today.
1 awaiting UPC installation.
4 we haven't heard from yet.
- Well, the usual Monday degradation in the network
happened on Tuesday this week. Must be due to
the upcoming holidays. We have tracked down
several of the problems that arose and have found
ways around them.
- Additional sites contact us every few days to be
added to the implementation queues.
- There are now nearly sixty separate machines running
LDM software as part of the IDD. In several cases,
there are multiple machines at individual sites, but
that still is a substantial number up and running.
NCB Day minus 8
Status Green: Tuesday, November 22, 1994.
IDD Deployment Site Status Summary:
16 reporting statistics.
3 running, but NOT delivering statistics.
18 working on LDM/IDD or Mctingest implementation, including
two recent additions.
6 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
1 awaiting UPC installation.
4 we haven't heard from yet.
- Yesterday was the best Monday for IDD reliability
in some time--in spite of problems with the relay
at Texas A&M.
- We are working with LSU to resolve degraded
mid-day delivery to their site.
NCB Day minus 9
Status Green: Monday, November 21, 1994.
IDD Deployment Site Status Summary:
16 reporting statistics.
3 running, but NOT delivering statistics.
16 working on LDM/IDD or Mctingest implementation.
6 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
1 awaiting UPC installation.
2 added to queues today.
4 we haven't heard from yet.
- Data delivery continued reliably after problems
at Washington relay were fixed.
- Due to Thanksgiving holidays, we really only
have about a week to get everyone running
before the Alden C-band transmission stops.
Please contact support and let us know where
you stand with the implementation.
- Two additional sites, Calvin College and Oklahoma
State, have sent in their IDD applications and
have been added to the queues.
NCB Day minus 10
Status Green: Sunday, November 20, 1994.
IDD Deployment Site Status Summary:
16 reporting statistics.
3 running, but NOT delivering statistics.
16 working on LDM/IDD or Mctingest implementation.
4 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
2 awaiting UPC installation.
5 we haven't yet contacted.
- Apparent problems at the Northwest relay caused
poor data delivery to most western sites.
The problem was corrected at about 21 GMT.
NCB Day minus 11
Status Green: Saturday, November 19, 1994.
IDD Deployment Site Status Summary:
16 reporting statistics.
3 running, but NOT delivering statistics.
16 working on LDM/IDD or Mctingest implementation.
4 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
2 awaiting UPC installation.
5 we haven't yet contacted.
- Overall the IDD system seems to have stabilized. Not many
new sites have appeared in the statistics reports. At the UPC,
we plan to contact all the sites we haven't heard from yet
as well as those who have indicated they are working on
the LDM/IDD implementation, but haven't appeared in the
logs.
- The statistics reports seem to indicate there are a number of
recurring network problems. We are addressing the worst of
those which is the New York State area. It appears that
changing the routing will alleviate that problem. As time
permits, the UPC staff will work with sites in other areas
to find solutions to their problems. In the meantime
our highest priority is to get all the sites who applied
for the IDD up and running with the LDM 4.1 before
the Alden C Band transmission ceases on December 1.
NCB Day minus 12
Status Green: Friday, November 18, 1994.
IDD Deployment Site Status Summary:
16 reporting statistics.
3 running, but NOT delivering statistics.
16 working on LDM/IDD or Mctingest implementation.
4 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
2 awaiting UPC installation.
5 we haven't yet contacted.
- Robb has made available to the IDD sites
a package for monitoring the local product
capture in much more detail than the previous
mechanisms.
- The substitute relay machine at Illinois
ran reliably yesterday and overnight.
NCB Day minus 13
Status Green: Thursday, November 17, 1994.
IDD Deployment Site Status Summary:
16 reporting statistics.
3 running, but NOT delivering statistics.
16 working on LDM/IDD or Mctingest implementation.
4 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
2 awaiting UPC installation.
5 we haven't yet contacted.
- John Kemp found a temporary replacement for
the relay machine that was experiencing
hardware problems at Illinois. The downstream
nodes appear to be receiving data reliably
once again.
NCB Day minus 15
Status Green: Tuesday, November 15, 1994.
IDD Deployment Site Status Summary:
14 reporting statistics.
3 running, but NOT delivering statistics.
18 working on LDM/IDD or Mctingest implementation.
4 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
2 awaiting UPC installation.
5 we haven't yet contacted.
- Hardware problems at Illinois have caused
data delivery failures to sites downstream.
The offending disk system is scheduled
for replacement today.
NCB Day minus 16
Status Green: Monday, November 14, 1994.
IDD Deployment Site Status Summary:
13 reporting statistics.
3 running, but NOT delivering statistics.
19 working on LDM/IDD or Mctingest implementation.
4 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
2 awaiting UPC installation.
5 we haven't yet contacted.
- A graph of the daily statistics summaries
shows that Mondays have been the poorest
days for IDD reliability. Today appears
to have been consistent with that trend.
As yet we haven't come up with an
explanation for this phenomenon.
NCB Day minus 17
Status Green: Sunday, November 13, 1994.
IDD Deployment Site Status Summary:
13 reporting statistics.
3 running, but NOT delivering statistics.
19 working on LDM/IDD or Mctingest implementation.
4 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
2 awaiting UPC installation.
5 we haven't yet contacted.
- The top level relay node at Illinois failed
overnight due to hardware problems on the wx2
machine there. John Kemp has notified the
other IDD sites that he got it up and
running again at about 21 GMT.
- Reliability problems continue at the Naval
Postgraduate School and unfortunately the
failure at Illinois interfered with our
experiment in comparing two different
delivery routes into the network-troubled
SUNY sites. Weekdays are a better time for
those comparisons in any event, so Monday's
statistics for redwood and citation will
be interesting.
- The other sites apparently are receiving the data
reliably over the weekend.
NCB Day minus 18
Status Green: Saturday, November 12, 1994.
IDD Deployment Site Status Summary:
13 reporting statistics.
3 running, but NOT delivering statistics.
19 working on LDM/IDD or Mctingest implementation.
4 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
2 awaiting UPC installation.
5 we haven't yet contacted.
- We've now got about one third of the sites
in the deployment queues up and running.
- Experiments yesterday feeding SUNY
Albany from Cornell instead of directly from
Alden are promising. Even in the face of
Friday afternoon network congestion, the
data delivery from Cornell was reliable.
This may be a solution for the known severe
Internet problems in that area.
- We discovered an incorrect recommendation
in the LDM documentation which was leading some
sites to put in crontab entries which mailed in
statistics reports at 5 minutes past the hour.
The optimum time for mailing in the reports is
35 minutes after the hour.
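As a rough illustration of the crontab correction described above, the sketch below reads the minute field of a simple five-field crontab entry and compares it with the recommended minute of 35. The command name send-ldm-stats is hypothetical (the log does not name the actual reporting command), and the parser only handles a plain numeric minute field.

```python
RECOMMENDED_MINUTE = 35  # minutes past the hour, per the note above


def stats_minute(crontab_line: str) -> int:
    """Return the minute field of a simple five-field crontab entry."""
    fields = crontab_line.split(None, 5)
    if len(fields) < 6:
        raise ValueError("expected five time fields followed by a command")
    return int(fields[0])  # assumes a plain number, not '*/5' or a list


# 'send-ldm-stats' is a placeholder command name, not a real LDM program.
old_entry = "5 * * * * send-ldm-stats"
new_entry = "35 * * * * send-ldm-stats"

for entry in (old_entry, new_entry):
    ok = stats_minute(entry) == RECOMMENDED_MINUTE
    print(f"{entry!r}: {'ok' if ok else 'move to minute 35'}")
```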
NCB Day minus 20
Status Green: Thursday, November 10, 1994.
After a round of phone calls to the queued sites by Linda Miller:
11 reporting statistics.
3 running, but NOT delivering statistics.
20 working on LDM/IDD or Mctingest implementation.
4 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
2 awaiting UPC installation.
6 we haven't yet contacted.
- As you can see in the list above, Linda Miller's
phone calls have provided a more complete picture of
where the queued sites stand regarding IDD
implementation.
- Miami network problems did not resurface during
the day yesterday.
- Network outages on the Rutgers campus resulted in
lost data there yesterday. Overnight, the data
delivery was fine. We'll see what happens during
the crunch today.
- Data into the new IDD site at Cornell gives us hope
for alternate routing into the New York state region
to avoid the reliability problems at the SUNY sites.
- Message from Alden: The ldm process group was
restarted on lightning when the hrs ingester died.
There will be no hrs data between 22:13 EST on Nov
9th and 00:15 on Nov
10th.
NCB Day minus 21
Status Green: Wednesday, November 9, 1994.
After a round of phone calls to the queued sites by Linda Miller:
8 reporting statistics.
3 running, but NOT delivering statistics.
23 working on LDM/IDD or Mctingest implementation.
4 postponing IDD, getting data via Ku.
1 postponing IDD, using IEIS servers until LDM hardware arrives.
2 awaiting UPC installation.
6 we haven't yet contacted.
- A National Weather Service computer problem caused a 30 minute
outage of PPS, IDS, and HRS data at about 18 GMT.
- Statistics reporting failed at Alden site
around 07 GMT this morning. A random check
of the products being delivered indicates
that the data were still being delivered
reliably.
10:10 AM MST: Alden restarted the statistics
program, but, as a result all the percentages
for last night's reliability statistics are
now incorrect. This problem should sort
itself out in the statistics reports starting
at 16 GMT.
- It appears Angel Li at Miami tracked down the
network problems in that region and the data
are flowing reliably into their LDM this
morning.
- There are still quite a few sites we haven't heard from
in the deployment queues. We're contacting them
individually as soon as possible.
NCB Day minus 22
Status Green: Tuesday, November 8, 1994.
Entering our second full week of IDD deployment:
7 new sites reporting statistics.
16 new sites still working on LDM/IDD or Mctingest implementation.
2 postponing IDD because they are getting data via Ku.
1 postponing IDD due to late delivery of needed hardware.
2 awaiting UPC installation.
20 new sites unaccounted for.
- Not much new to report this morning. Sites that
were running are still running. Others continue
with implementation of LDM and statistics
gathering system.
- A special effort is needed to track down problems
getting data to Miami. It's the only site where
problems persist overnight and over weekends.
4:00 PM MST update. It appears that Angel Li
has localized the problem to a SURANet T1
connection and they are working on it. The
transmission improved for a couple hours
then degraded again.
- It's good to see two alma mater institutions appear
in the statistics: Colorado and now Yale. (Please
forgive me but one has to do what it takes to add
at least a minimal personal touch to what otherwise
might be a very dry and ....)
- University of Virginia will not receive the workstation
on which they were to run the LDM until later in the
year. In the meantime, they are using processed data
products available on IEIS servers around the
Unidata community.
NCB Day minus 23
Status Green: Monday, November 7, 1994.
23 New Sites In Progress. 2 Postponing IDD for Ku.
- Overall good weekend performance. The exceptions
are ongoing difficulties at Miami and the Naval
Postgraduate School. Intermittent poor reliability
into San Francisco State appears to have come up
again, but the network connections there seem
solid.
- Additional site working on IDD implementation:
NCB Day minus 24
Status Green: Sunday, November 6, 1994.
22 New Sites In Progress. 2 Postponing IDD for Ku.
- Overall good performance continues. Problems
continue at Miami. Naval Postgraduate School
stats show intermittent difficulties. Alden
stats reporting failed for several hours
but data continued to flow. Illinois
reported problems at 6 GMT and missed statistics
report at 4 GMT, but downstream delivery seems
consistently reliable.
- Two sites in the deployment queues have indicated
they are postponing IDD implementation because
they are getting Ku-band delivery this month.
That reduces the implementation pressure somewhat.
The two sites delaying IDD implementation are:
NCB Day minus 25
Status Green: Saturday, November 5, 1994.
First Deployment Week: 22 New Sites In Progress.
- After one partial (started Tuesday) week of deployment,
we've heard from nearly half the sites in the
deployment queues who are at various stages of
implementing the latest LDM and becoming part of
the IDD. So far things seem to be going
well. We continue to be amazed at the wide variety
of problems that can come up, but, so far, we have
been able to solve most of them and have identified
courses of action that will take care of the others.
In other words, there has been very significant
progress and several new sites are up and running, but
there is a lot of work left to be done. So far
we haven't run into any "showstoppers."
- Keep an eye on the
Hourly IDD Performance Statistics Summary
to watch for new sites as they come online.
- The IDD statistics at Miami are poor--especially for a
weekend. Simple "ping" tests indicate there may
be problems with the network in that region.
NCB Day minus 26
Status Green: Friday, November 4, 1994.
First Deployment Week: 22 New Sites In Progress.
- Oregon State gets the prize for being
the first new IDD deployment site to
report data reception statistics.
Way to go, Dean!!!!
- Saint Louis U. was a close second.
To paraphrase the words of the famous Dr. Peggy Bruehl,
Gold (or is it silver for second place?) stars
to Eric, Jim, and the crew at SLU.
- We now have heard from 22 sites from the
deployment queues who are
implementing the LDM 4.1 or connecting to
a Mctingest feed (OS/2 only sites) and becoming
part of the IDD. The additional sites we've interacted
with today are:
- University of Denver (OS/2 only, is RUNNING)
- University of Colorado (CNS)
- Cornell University
- Millersville University
- LSU
- Jackson State (OS/2 only)
- University of Missouri-Columbia (OS/2 only)
- Central Michigan (OS/2 only)
- Mississippi State (OS/2 only)
- New Mexico State
- Auburn University
- University of British Columbia
- University of Saskatchewan
- Southwest Missouri State
- SUNY Brockport overnight statistics
reporting more than 100% of products.
The symptom is similar to that seen
at MIT earlier.
- Ben is revisiting the Internet problem in the
New York State region with ANS and NYSERnet.
NCB Day minus 27
Status Green: Thursday, November 3, 1994.
Deployment Continues.
- The feeds from Alden became unstable
during the hour 1994110222 last
night. All relays except Illinois
recovered during the next hour.
Illinois no longer captured the FOS
data after that until it resumed
at 1994110303. John Kemp followed up
with email explaining this one.
- Additional sites from the deployment queues are
implementing the LDM 4.1 and becoming part of the IDD:
- SUNY Oswego
- Oregon State University reports getting data.
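The hour stamps in the Alden item above (1994110222 and 1994110303) appear to follow a YYYYMMDDHH layout. Assuming that reading is right, a short sketch like the following turns them into timestamps and measures the span between them. The function name is illustrative.

```python
from datetime import datetime


def parse_hour_stamp(stamp: str) -> datetime:
    """Parse a ten-digit hour stamp, assumed to be YYYYMMDDHH."""
    return datetime.strptime(stamp, "%Y%m%d%H")


unstable = parse_hour_stamp("1994110222")  # hour the Alden feeds became unstable
resumed = parse_hour_stamp("1994110303")   # hour Illinois resumed capturing FOS data

gap_hours = (resumed - unstable).total_seconds() / 3600
print(unstable, resumed, gap_hours)
```

Under that assumption the two stamps are five hours apart.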
NCB Day minus 28
Status Green: Wednesday, November 2, 1994.
Full Deployment Is Underway.
- Several sites from the deployment queues are
already implementing the
LDM 4.1 and becoming part of the IDD:
- University of Colorado (CIRES)
- McGill University
- Utah University
- Yale University
- University of Nebraska
- Saint Louis University
- Oregon State University
- Steve Finley from CSU is putting his Ku-Band
reception system in place and will work with
us on establishing it as an alternate source
site for the FOS (Family of Services)
in the event that Alden can't
get the data out from their site via the IDD
(e.g. network failures in their region).
- McIDAS source failure reported yesterday was
actually scheduled downtime.
- Performance continues to be quite good.
- Some anomalies showed up in the statistics
reports Monday afternoon. Top level relay
sites showed some data missing while the
sites below them were receiving 100% of
products. This occurred during single
hours for a few of the relays and is being
investigated.
NCB Day minus 29
Status Green: Tuesday, November 1, 1994.
Full Deployment Starts Today.
- McIDAS source feed failed around 06 GMT last night.
- Monday IDD performance was very good.
- Full Deployment Starts Today. If
you're in the deployment queue, please try to
get the system running as quickly as possible,
so we don't end up with everyone trying to
get started at the end of the month. Thanks.
NCB Day minus 30
Status Green: Monday, October 31, 1994
- Weekend performance was very encouraging.
- Tomorrow, November 1, 1994 still looks good
for starting full deployment.
- Binary distributions of the LDM 4.1.36
are being built at the UPC
for tomorrow's general release.
- The IDD system with all relays and many
leaf nodes upgraded to
LDM 4.1.36 has held up well through the
Monday network congestion so far (through
5 PM on the East Coast).
NCB Day minus 31
Status Green: October 30, 1994
- Saturday IDD performance was very good. Full
deployment begins on Tuesday. Hopefully all
the test sites can be converted to LDM 4.1.36
(except for Ultrix and LINUX machines) on
Monday. That would allow the UPC staff
to focus on getting the new sites going
rather than on converting the test sites.
- HRS feed appears to have died at Alden around 1600 GMT.
No word from Alden operations as to what might
be happening.
- From the statistics, it looks like Alden got the
HRS running again by restarting the LDM or
maybe the computer itself. In any case, two
hours of HRS were missed and a few products
were lost during the restart, but the
daily summary shows 99.7% of the products
sent from Alden were delivered to the test
sites succesfully.
NCB Day minus 32
Status Green: October 29, 1994
- Overnight statistics look good. While the
weekend performance isn't a good indicator
of how the system will react during network
congestion, it does show that no
fundamental problems were introduced with
LDM 4.1.36.
- Checks of logs at several source and relay nodes
revealed no LDM breakdowns during the Friday
afternoon Internet crunch.
- Florida State seems to be experiencing duplicate
product retransmissions (The symptom is hourly statistics
showing more than 100%). This is not a major problem,
but it's worth checking into because it's unusual for
sites running LDM 4.1.
- Please note that nothing has been mentioned about
college football games in this log. That shows
admirable restraint with the "game of the millennium"
at hand.
- 1:30 PM MDT: The statistics coming out of the Lincoln, Nebraska
site are not good at all.
- November 1, 1994 still looks good for starting deployment. We don't
have much time before the Alden C-Band broadcast is shut down.
- Be aware that Daylight Saving Time ends in many
regions tonight. These changes can cause crontab
entries to fail on some systems.
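A rough illustration of the crontab warning above: when clocks fall back, one hour of local wall-clock time occurs twice, so an entry keyed to a time in that hour can match twice (or be skipped, depending on how a given cron handles it). The sketch below is not taken from any cron implementation. It assumes US Mountain time with the change at 2:00 AM MDT on October 30, 1994 (08:00 UTC), and the function names and the 01:30 schedule are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: US Mountain time. DST ends at 2:00 AM MDT on Oct 30, 1994,
# which is 08:00 UTC. Before that instant local time is UTC-6, after it UTC-7.
FALL_BACK_UTC = datetime(1994, 10, 30, 8, 0, tzinfo=timezone.utc)
MDT = timezone(timedelta(hours=-6), "MDT")
MST = timezone(timedelta(hours=-7), "MST")


def local_wall_clock(utc_time):
    """Return the local wall-clock time for a UTC instant."""
    zone = MDT if utc_time < FALL_BACK_UTC else MST
    return utc_time.astimezone(zone)


def firings(hour, minute, start_utc, hours_to_scan):
    """List the UTC instants at which a naive 'HH:MM local' schedule matches."""
    hits = []
    t = start_utc
    end = start_utc + timedelta(hours=hours_to_scan)
    while t < end:
        local = local_wall_clock(t)
        if local.hour == hour and local.minute == minute:
            hits.append(t)
        t += timedelta(minutes=1)  # check once per minute, as cron does
    return hits


# Scan 24 hours of real time starting at local midnight (MDT) on Oct 30.
start = datetime(1994, 10, 30, 6, 0, tzinfo=timezone.utc)
for hit in firings(1, 30, start, 24):
    print(hit.isoformat(), "->", local_wall_clock(hit).strftime("%H:%M %Z"))
```

The hypothetical 01:30 entry matches at two different UTC instants one hour apart, once in MDT and once in MST. Scheduling jobs outside the repeated hour, or in GMT, avoids the ambiguity.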
NCB Day minus 33
Status Unknown: October 28, 1994; 8:00 AM
Status Yellow-Green: October 28, 1994; 9:45 AM
Status Green: October 28, 1994; 4:55 PM
- IDD performance statistics inaccessible because
of file system changes on UPC network last night.
Performance of IDD won't be known until file
system problems are fully corrected and IDD statistics
processing is resumed. Both these things are
being worked on right now.
- 8:00 AM MDT: The file system difficulties also made it impossible
to reach the UPC gopher and www servers last night.
FTP access apparently was working, but the default
directory was different from what was expected.
This should be back in working order now.
- 9:45 AM MDT: Problems with www, gopher, and ftp access should
be fixed now. The IDD statistics processing is
working again. Let us know if you see other
things that aren't working yet.
- Note that many of the test sites are upgrading to
LDM 4.1.36, starting last night. As the upgrades
occur, there may be some lost products.
- 4:55 PM MDT: Things are settling in now with 16 test
sites up and running with 4.1.36. Way to go, gang!!!
So far, we haven't seen evidence of new problems
with this release. We'll see how things go over
the weekend.
- Thank you, thank you, thank you to the test sites.
NCB Day minus 34
Status Green: October 27, 1994
- IDD performance statistics looked reasonably good yesterday.
- November 1, 1994 still looks good for starting deployment.
- Testing is underway on the next release of the LDM (4.1.35).
A decision will be made soon to determine whether the new
release is solid enough to be used in production.
- Mitch sent out a note to the deployment sites reminding them to familiarize
themselves with the LDM 4.1
and the pre-installation manual in particular, so they'll be ready to go
when full deployment starts next week.
- A message from WestNet indicates their relay node should be
running later this week.
- A new problem cropped up with the mail delivery for
the statistics reporting system itself. The net result
is that the stats for some sites do not appear in the
summaries even though the sites are reporting. Robb
has identified a fix and is implementing it.
NCB Day minus 35
Status Green: October 26, 1994
- Rats! Alden statistics gathering failed overnight. However,
"manual" comparison of product numbers indicates data
delivery continued without special problems. Note that
when the Alden stats come back on line, there may
be a jump above 100% for many sites because Alden
will report only part of the products for that hour.
- Mitch and Texas A&M staff found a way around the firewall
problems there, so we have another top level relay node
running. Data delivery appears reliable.
- LSU queue questions resolved. DuPage contacted.
- Note that LSU has agreed to serve as an additional
data recovery site once they get up and running.
- Data reception problems cropped up at U of Washington
during 1994102616
and 1994102617. This may be another case of a process
locking up that Harry has reported on his machine. These
problems are intermittent and often occur in processes
that are not part of the LDM at all.
- November 1, 1994 still looks good for starting deployment.
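The note above about Alden stats jumping past 100% can be made concrete with a small sketch. It assumes the hourly figure is simply products received at a site divided by products the source reported sending. That formula, the function name, and the product counts are illustrative assumptions, not the actual statistics code.

```python
def percent_received(received, sent_by_source):
    """Hourly figure: products received as a percentage of the products
    the source reported sending (assumed formula, see note above)."""
    if sent_by_source <= 0:
        return None  # the source reported nothing, so no figure is possible
    return 100.0 * received / sent_by_source


# Hypothetical hour in which the source really sent 1000 products.
site_received = 990

# Normal case: the source statistics cover the whole hour.
print(percent_received(site_received, 1000))

# Source statistics restarted partway through the hour and counted only
# 400 of the 1000 products, so the denominator is too small.
print(percent_received(site_received, 400))
```

Under the same assumed formula, duplicate retransmissions push the figure over 100% from the other side, by inflating the received count rather than shrinking the sent count.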
NCB Day minus 36
Status Green: October 25, 1994
- November 1, 1994 still looks good for starting deployment.
- Alden LDM restart Monday morning caused some
data losses. Most sites still were above
90% during that period.
- Stats problems continue at MIT, NPGS, COMET, Arizona (HRS)
- Georgia Tech FOS reception dropped out.
- NC State problems showing up again. Also check whether
Alabama Huntsville has switched feed away from NC State yet.
- DuPage and LSU have questions about place in queue.
NCB Day minus 37
Status Green: October 24, 1994
- IDD performance over the weekend was excellent.
- Need to check MIT, NPGS, COMET, Arizona (HRS)
- November 1, 1994 still looks good for starting deployment.
- Monday afternoon network congestion will provide
good test of revised routing topology with new
version of LDM at most test sites.
NCB Day minus 38
Status Green: October 23, 1994
- November 1, 1994 still looks good for starting deployment.
- Overnight statistics look very good, but that's
to be expected on Sunday morning when network
traffic has been light.
- Naval Postgraduate School IDD reliability stats seem to be
inaccurate in manner similar to MIT.
NCB Day minus 39
Status Green: October 22, 1994
- Still on track to begin full deployment November 1, 1994.
- Overnight statistics look very good with most sites
converted to LDM 4.1.34 and to the new broader fanout
routing topology.
- Areas that need attention:
- Ben has to come up with
a reliable, well-connected relay site in Northeast.
- MIT (DEC Ultrix) and Rutgers
(LINUX) can't convert to LDM 4.0. MIT statistics
gathering is faulty. Higher priority is to find a
way to upgrade MIT to LDM 4.1 with new stats package.
Rutgers is examining other versions of Unix for
Intel computers; also tracking LINUX updates.
- Arizona intermittently shows a few missing HRS; need to
check with Mike to see if he's requesting everything.
- Found out Texas A&M LDM machine is behind a security
firewall; Mitch is working with them to allow
RPCs on the well-known port to get through or to move machine
outside firewall.
- True test will come on Monday during heavy network
load. That will provide good baseline for performance
of new topology and with LDM 4.1.34.
NCB Day minus 40
Status Green: October 21, 1994
- Decision holds to begin full deployment November 1, 1994.
- Mctingest experiment worked well feeding several internal
(UPC) and one external OS/2 sites overnight from
an LDM relay node.
- No mechanism at present for OS/2-only sites to be included in
general performance statistics listing. Stats
addition will be investigated for next phase
which is the OS/2 LDM ingestor module.
- IDD sites ran reliably overnight although Illinois missed
a few products during two hour period. Some stats reports
were incomplete due to source site reporting problems.
- Need to investigate intermittent pqbinstats failure
at Alden causing lost stats while data
seems to keep flowing fine.
- Attempt to fix stats problem at Alden blew out LDM
itself temporarily at about 18 GMT. We're attributing
this one to pilot error. --The Pilot
- Mitch and Robb will continue to work with Alden
on improved WARNING and Problem Tracking for Operations
Staff. Parts of system will be usable at other
source sites and at relays.
- Matt will work on better documentation for data
recovery sites (first priority, however, is final version of
LDM Manual).
- Need to find out from Alden whether the NWS headers
will be used for the DIFAX products.
NCB Day minus 41
Status Green: October 20, 1994
- Decision made to begin full deployment November 1, 1994.
- Conversion to fanout of 8 seems to be working well with
LDM 4.1.34.
- Mitch will work out revised routing topology based on
broader fanout.
- SUNY Albany out of relay loop until New York state internet
problems are resolved.
- NC State out of relay loop until OS is upgraded to consistent
state.
- Ben will even more actively pursue a relay at NEARnet for
Northeast area.
- DIFAX feedtype will be included before full deployment release.
- Tom and Steve will begin testing Mctingest as "backend"
to LDM for feeding OS/2 only sites. Immediate priority
is to get sites online using the Mctingest approach, but
next priority is to get LDM ingest module for McIDAS
running on OS/2 (rather than further refinements and
porting of Mctingest).
- Contact established with Guy Almes, VP at ANS--very
helpful both for guidance in NSFnet transition and
in troubleshooting network problems, e.g.,
difficulties at SFSU.
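One way to see what a broader fanout buys in the routing topology: the more downstream sites each node feeds, the fewer relay tiers a product crosses before reaching the last site. The sketch below is only an idealized count, assuming every node feeds exactly the same number of downstream sites. The function name and the 100-site figure are hypothetical, and the fanout values other than 8 are there only for comparison.

```python
def tiers_needed(num_sites, fanout):
    """Number of tiers below the source needed to reach num_sites when
    every node feeds at most `fanout` downstream sites (idealized tree)."""
    if num_sites < 0 or fanout < 1:
        raise ValueError("num_sites must be >= 0 and fanout >= 1")
    tiers = 0
    reached = 0
    tier_width = 1  # start from the single source node
    while reached < num_sites:
        tier_width *= fanout   # each node in the previous tier feeds `fanout` more
        reached += tier_width  # running total of sites served so far
        tiers += 1
    return tiers


# Hypothetical network of 100 receiving sites.
for fanout in (2, 4, 8):
    print("fanout", fanout, "->", tiers_needed(100, fanout), "tiers")
```

Fewer tiers means fewer store-and-forward hops, at the cost of more outbound connections per relay. Real topologies also depend on where well-connected relay sites are available.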
support@unidata.ucar.edu