Real-time, self-managing data flows -- Unidata will foster and support the existence of real-time data flows that encompass a broad range of Earth-system phenomena, can be accessed with ease by all constituents, and are self managing in respect to changing contents and user needs.
--A goal of Unidata 2008: Shaping the Future of Data Use in the Geosciences
Highlights of the 6.4.x releases:
The top-level IDD relay node operated by the UPC, idd.unidata.ucar.edu, is now the cluster that was reported on in the Spring 2005 status report.
Live stress testing of the cluster demonstrated that the limiting factor for data relay was the local network bandwidth, which is 1 Gbps at UCAR! The cluster was operated at a sustained 500 Mbps output for a three-day period in June. Tests were limited to this level because we ran out of downstream machines to which we could send data. During the stress testing, the cluster data backend machines were essentially idling.
The developers involved in the cluster effort are:
  John Stokes      cluster design and implementation
  Steve Emmerson   LDM-6 development
  Mike Schmidt     cluster design and system administration
  Steve Chiswell   IDD design and monitoring
  Tom Yoksas       configuration and stress testing
The design of our top-level IDD relay node, idd.unidata.ucar.edu, is briefly described below (updated since the Spring 2005 status report).
In addition to atm.geo.nsf.gov, the UPC operates the top-level IDD relay node idd.unidata.ucar.edu. Instead of being a single machine, idd.unidata is part of a cluster composed of directors (machines that direct IDD feed requests to other machines) and data server backends (machines that receive those requests from a director and service them). We are using the IP Virtual Server (IPVS) available in current versions of Linux to forward feed requests from directors to data servers.
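The forwarding described above can be expressed with the standard ipvsadm utility roughly as follows. This is an illustrative sketch, not the UPC's actual configuration: the virtual IP address is a placeholder, and the scheduler and timeout values are examples only.

```shell
# Define a virtual service on the director: TCP port 388 (LDM),
# scheduled by least-connection ("lc"), with 60-second persistence
# so a reconnecting downstream host keeps hitting the same backend.
ipvsadm -A -t 192.0.2.1:388 -s lc -p 60

# Register the data server backends using direct routing ("-g"),
# so replies bypass the director; the backends carry the virtual
# address but do not ARP for it.
ipvsadm -a -t 192.0.2.1:388 -r uni1.unidata.ucar.edu -g
ipvsadm -a -t 192.0.2.1:388 -r uni2.unidata.ucar.edu -g
ipvsadm -a -t 192.0.2.1:388 -r uni4.unidata.ucar.edu -g

# Taking one data server out of service for maintenance:
ipvsadm -d -t 192.0.2.1:388 -r uni2.unidata.ucar.edu
```

With direct routing, only inbound feed requests pass through the director; the (much larger) outbound data streams go straight from the backends to the downstream hosts.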
Our cluster data backends currently run Fedora Core 3 64-bit Linux on identically configured Sun SunFire V20Z 1U rackmount servers:
Sun SunFire V20Z configuration
The cluster director is currently a Dell PowerEdge 2850 rackmount server. The 2850 is configured as follows:
Dell PowerEdge 2850 configuration
The SunFire V20Z machines have proved to be stellar performers for IDD relay when running Fedora Core 3 64-bit Linux. We tested three 64-bit operating systems side-by-side before settling on FC3: Fedora Core 3, FreeBSD, and Solaris x86 10. In our testing FC3 emerged as the clear winner; FreeBSD was second; and Solaris x86 10 was a distant third (this was very surprising). RedHat Enterprise WS 4 is FC3 with full RedHat support.
We will be testing the latest Fedora Core Linux release, version 4, for data backend use in the near future.
The following is a schematic view of idd.unidata.ucar.edu:
                          |<-- director(s) -->|

                               +-------+
                               |   ^
                               V   |
                         +---------------+
 idd.unidata.ucar.edu -> |  LDM  | IPVS  |
                         +---------------+
                          /      |      \
                         /       |       \
                        /        |        \
       +---------------+  +---------------+  +---------------+
       |  'uni1' LDM   |  |  'uni2' LDM   |  |  'uni4' LDM   |
       +---------------+  +---------------+  +---------------+
    uni1.unidata.ucar.edu uni2.unidata.ucar.edu uni4.unidata.ucar.edu

       |<--------------------- data servers --------------------->|
The top level indicates one director machine, idd.unidata.ucar.edu. This machine runs IPVS, with LDM 6.3.0 configured on a second interface (IP). The IPVS director software forwards port 388 requests received on an interface configured as idd.unidata.ucar.edu on one machine and as thelma.ucar.edu on the other. The set of data server backends is the same for both directors (at present).
When an IDD feed request is received by idd.unidata.ucar.edu, it is relayed by the IPVS software to one of the data servers. Those machines are also configured internally as idd.unidata.ucar.edu, but they do not ARP for that address, so they are not seen by the outside world/routers. The IPVS software keeps track of how many connections are on each of the data servers and forwards ("load levels") based on connection counts (we will be changing this metric as we learn more about the setup). The data servers are all configured identically: same RAM, same LDM queue size (8 GB currently), same ldmd.conf contents, etc.
All connections from a downstream machine will always be sent to the same data server as long as its last connection did not die more than one minute ago. This allows downstream LDMs to send an "are you alive" query to a server that they have not received data from in a while. Once there have been no IDD request connections from a downstream host for one minute, a new request will be forwarded to the data server that is least loaded.
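The scheduling behavior described above can be sketched as a small model. This Python sketch is illustrative only (the class and host names are invented); the real work is done by IPVS in the kernel:

```python
import time

AFFINITY_SECONDS = 60  # a host reconnecting within this window stays put


class Director:
    """Toy model of the scheduling described above: connections from a
    recently seen downstream host stick to the same data server for up
    to one minute; otherwise the least-loaded server is chosen."""

    def __init__(self, servers):
        self.connections = {s: 0 for s in servers}  # active connection counts
        self.affinity = {}                          # host -> (server, last seen)

    def route(self, host, now=None):
        now = time.time() if now is None else now
        entry = self.affinity.get(host)
        if entry and now - entry[1] <= AFFINITY_SECONDS:
            server = entry[0]   # sticky: same server as the last connection
        else:
            # "least connections": pick the server with the fewest feeds
            server = min(self.connections, key=self.connections.get)
        self.affinity[host] = (server, now)
        self.connections[server] += 1
        return server


d = Director(["uni1", "uni2", "uni4"])
s1 = d.route("downstream.edu", now=0)   # least-loaded pick
s2 = d.route("downstream.edu", now=30)  # within 60 s: same server again
```

A second request from the same host within the one-minute window lands on the same backend; a request from a new host goes to whichever backend currently has the fewest connections.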
The design of the cluster allows any data server to be taken out of service for maintenance without interrupting data relay. When a data server goes out of service, the IPVS director is informed that the server is no longer available, and all downstream feed requests are sent to the data servers that remain up.
LDM 6.3.0 was developed to allow running the LDM on a particular interface (IP). We are using this feature to run an LDM on the same box that is running the IPVS director: the IPVS listens on one interface (IP) and the LDM runs on another. The alternate interface does not need to be a separate Ethernet device; it can be a virtual interface configured in software. The ability to run LDMs on specific interfaces (IPs) allows us to run LDMs as either data collectors or as additional data servers on the same box running the director. (A data collector is an LDM that has multiple ldmd.conf requests that bring data to the cluster; e.g., CONDUIT from atm and/or UIUC, NEXRAD2 from Purdue, HDS from here, IDS|DDPLUS from there, etc.) The data server LDMs request data redundantly from data collector LDMs. There is currently no director redundancy; that will be added in the future.
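Binding a server to one interface rather than all of them is, at bottom, standard socket practice; a minimal sketch of the idea (the address and helper name are illustrative, not LDM code):

```python
import socket


def listen_on(ip, port):
    """Create a TCP listening socket bound to a single address, so a
    second service on the same host can own a different address on
    the same port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Binding to "" would mean all interfaces; a specific IP pins the
    # listener to that one address only.
    s.bind((ip, port))
    s.listen(5)
    return s


# Bind to loopback for illustration; port 0 lets the OS pick a free port.
srv = listen_on("127.0.0.1", 0)
```

On the director, the same principle lets IPVS own the public address while an LDM listens on a second (possibly virtual) interface of the same machine.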
The cluster setup is still new. Configurations will be modified as more is learned about how well the system performs. Stress tests run at the UPC demonstrated that one SunFire V20Z was able to handle 50% more downstream connections than the old SunFire 480R thelma.ucar.edu without introducing latency. With three data servers, it is believed that the cluster could field literally every IDD feed request in the world if needed, making the cluster the ultimate failover site. If the load on the existing data servers ever becomes too high, more can easily be added. The ultimate limiting factor in this setup will be the routers and network bandwidth at UCAR.
This cluster currently relays an average of 140 Mbps (~1.4 TB/day) to approximately 250 downstream connections. Peak rates routinely exceed 260 Mbps (~2.6 TB/day).
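The Mbps-to-TB/day figures quoted above follow from straightforward arithmetic (treating a terabyte as 2^40 bytes); a quick check:

```python
def mbps_to_tb_per_day(mbps):
    """Convert a sustained rate in megabits/second to terabytes/day
    (binary terabytes, 2**40 bytes)."""
    bytes_per_day = mbps * 1e6 / 8 * 86400  # bits/s -> bytes -> per day
    return bytes_per_day / 2**40


avg = mbps_to_tb_per_day(140)   # ~1.4 TB/day sustained
peak = mbps_to_tb_per_day(260)  # ~2.6 TB/day at peak
```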
Steve Emmerson and Anne Wilson have been tasked with exploring the form and features of an ideal data relay system. Although still in draft form, an internal white paper has evolved to survey the possible benefits of other relevant technologies and to make recommendations regarding where Unidata should be in five years with respect to data delivery. The paper also outlines how the IDD could be transitioned from one protocol to another.
NB: In order to correctly gauge real-time status of the IDD, it is important that all participating sites accurately maintain their system clocks. This is easily done through use of a Network Time Protocol daemon run on the local machine.
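A minimal ntpd configuration is enough for this purpose. The sketch below is an example only; the server names are generic placeholders, and sites should substitute their institution's preferred NTP servers:

```
# /etc/ntp.conf -- minimal example configuration
server 0.pool.ntp.org
server 1.pool.ntp.org
server 2.pool.ntp.org

# Record the clock's frequency error so ntpd restarts with a good estimate
driftfile /var/lib/ntp/drift
```

Once the daemon is running, `ntpq -p` lists the peers being tracked along with the current offset and jitter, which makes it easy to confirm the clock is actually synchronized.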