IDD Operational Site Requirements

Mitch Baltuch

June 6, 1995


Overview:

This document is a guide for Unidata member sites that will be implementing the Internet Data Distribution (IDD) system, whether as source, relay, or leaf nodes. It discusses what will be needed to implement and operate an IDD connection.

The document is broken down by type of requirement: network, hardware, software, and personnel.

Where the requirements differ by node class, these sections are further divided into data source nodes, relay nodes, and leaf nodes.

The document ends with a section detailing some of the skills needed by a site administrator to install, tune, and maintain the IDD.

Network Requirements:

The IDD will represent a significant addition to a site's current bandwidth requirements, regardless of what type of involvement is being considered (source, relay, or leaf). Sites trying to determine what size Internet pipe is needed must first understand their existing network loads.

Data Source Nodes - A site planning to inject data into the IDD system will need available bandwidth equal to n times the average throughput of the data stream being distributed, where n is equal to the number of downstream nodes being fed by the site. This may be in addition to any necessary bandwidth to receive other data streams for which the site may be a leaf node.

Relay Nodes - A site planning to relay data to others will need available bandwidth equal to n+1 times the average throughput of the data stream(s) being relayed, where n is the number of downstream nodes being fed; the extra 1 accounts for the copy the relay itself receives. Again, this may be in addition to the bandwidth necessary to receive other data streams for which it is a leaf node. At current data rates, a site relaying both the FOS and Unidata/Wisconsin streams will need roughly 20Kb/sec (94Kb/sec at most) for each downstream node, plus the same for itself.

Leaf Nodes - The bandwidth needed only to receive data is equal to the average throughput of the data stream(s) being received. At current data rates that is about 20Kb/sec, and 94Kb/sec at most.
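
The three rules above reduce to simple arithmetic. The following Python sketch illustrates them, using the approximate 20Kb/sec average and 94Kb/sec maximum quoted in this section. The function and constant names are hypothetical and are not part of the LDM. Applying the 94Kb/sec figure to every copy of the stream is an assumption made here for worst-case sizing.

```python
# Illustrative bandwidth estimate for an IDD node.
# All names here are hypothetical; they are not part of the LDM software.
AVERAGE_KB_PER_SEC = 20   # approximate average rate quoted in the text
PEAK_KB_PER_SEC = 94      # approximate maximum rate quoted in the text


def required_bandwidth(node_class, downstream_nodes=0,
                       stream_rate=AVERAGE_KB_PER_SEC):
    """Return the bandwidth needed for the IDD, in the units of stream_rate.

    source: n copies sent out
    relay:  1 copy received plus n copies sent out
    leaf:   1 copy received
    """
    if downstream_nodes < 0:
        raise ValueError("downstream_nodes must not be negative")
    if node_class == "source":
        copies = downstream_nodes
    elif node_class == "relay":
        copies = downstream_nodes + 1
    elif node_class == "leaf":
        copies = 1
    else:
        raise ValueError(f"unknown node class: {node_class!r}")
    return copies * stream_rate


if __name__ == "__main__":
    # Average and worst-case figures for a few example configurations.
    for node_class, n in (("source", 3), ("relay", 3), ("leaf", 0)):
        average = required_bandwidth(node_class, n)
        worst = required_bandwidth(node_class, n, PEAK_KB_PER_SEC)
        print(node_class, n, average, worst)
```

The result covers only the IDD streams themselves. A site's existing network load, and any other streams it receives as a leaf node, must be added on top.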

Hardware:

It is impossible to specify what hardware configuration will work for all nodes in a given class--the load on the hardware from other programs and uses varies too widely. Therefore, the information provided below is only a guideline.

Regardless of site class, the IDD host must be a Unix workstation equivalent in power to a Sun Microsystems SparcStation 2, with a minimum of 32MB of RAM, 64MB of swap space, and sufficient local disk to hold a 50MB product queue. It is preferable that the RAM and swap space be increased to 64MB and 165MB, respectively.

If the workstation is going to be used for other tasks, in addition to being the IDD host machine, these requirements will increase depending on the extra load imposed by the other applications.
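
As a rough illustration, the minimum and preferred figures above can be written as a small check. This Python sketch assumes a dedicated IDD host. It does not measure processor power or the load from other applications, and all names in it are hypothetical.

```python
# Memory and disk figures taken from the text above; names are hypothetical.
MINIMUM = {"ram_mb": 32, "swap_mb": 64, "queue_disk_mb": 50}
PREFERRED = {"ram_mb": 64, "swap_mb": 165, "queue_disk_mb": 50}


def classify_host(ram_mb, swap_mb, free_disk_mb):
    """Return 'preferred', 'minimum', or 'insufficient' for a dedicated IDD host."""
    host = {"ram_mb": ram_mb, "swap_mb": swap_mb, "queue_disk_mb": free_disk_mb}
    if all(host[key] >= PREFERRED[key] for key in PREFERRED):
        return "preferred"
    if all(host[key] >= MINIMUM[key] for key in MINIMUM):
        return "minimum"
    return "insufficient"
```

A host with 64MB of RAM but only 100MB of swap, for example, is classified as "minimum" because it falls short of the preferred swap figure.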

Software:

Software includes not only the LDM4 software, but operating systems and compilers as well. Because the LDM is run as a group of daemons, user interface software, such as windowing systems, is not an issue. In general, software requirements are the same regardless of what class of IDD node is involved.

LDM Software - To make things easier for LDM users, the UPC releases the LDM both as source code and as a binary distribution. For sites that do not wish to go through the trouble of building the LDM, the binary distribution for any of (currently) 6 platforms may be installed. It is essential that data source nodes and relay nodes maintain their LDM software at the most current production release level.

Operating Systems - The UPC develops the LDM on the most current release of each operating system that it supports. This means that sites that use the binary distribution of the LDM can only be assured that it will work if they also maintain a current OS release. Sites that do not will probably find that they have to build the LDM from source code.

Compilers - Sites that wish to build the LDM from source code must have an ANSI C compiler. Manufacturers usually make a range of compilers for their platforms, but the one that comes with your machine may not be an ANSI C compiler. For example, Sun's default compiler for SunOS 4.1.3 is non-ANSI. In this case, the UPC uses Sun's unbundled compiler.

Personnel:

A great deal of the information in this section comes from comments by the site administrators at the current IDD test sites. They were asked to estimate, based on their experience with the IDD, the time that was needed to maintain the LDM and their connection to the IDD. Two issues have come to light as a result of their responses.

First is the issue of technical knowledge. Sites with experienced UNIX computer administrators needed far less time to maintain the LDM/IDD setup than did those with less experienced administrators. Second, sites using their computers for more than just the IDD reported needing less time than those with dedicated IDD machines. We suspect that this is because the multi-use sites do not consider the UNIX system administration time they spend as part of the IDD effort.

Personnel time needed for the LDM/IDD effort varies depending on what tasks need to be done. To build, install, configure, and tune the LDM for the first time can take from 1 day to a couple of weeks, depending on the abilities of the site administrator. We have found that getting a novice site with little or no UNIX experience up and running can be quite a challenge. Once the system is operational, the time needed decreases greatly, but it depends on the reliability of the site's network connection and its hardware. In general, direct IDD time is about 2-4 hours a week. For sites that only use their systems for the IDD, UNIX system administration time must also be figured in. This can range from 2 hours a day to a full-time job, depending on how many computers are used, what they are used for, and the experience level of the site administrator.

Data Source Nodes - Site administrators for these nodes need to have a high level of UNIX and networking expertise. This is critical to maintain a reliable flow of data. The site administrator must be able to effectively and quickly troubleshoot a variety of potential problems ranging from software bugs to operating system problems and network outages. Because these sites have to be responsive to their downstream nodes, the time required is increased due to the increased interaction with others and the corresponding increase in problems that have to be dealt with.

Relay Nodes - The system administration requirements for relay nodes are the same as for the data source nodes. Because they are feeding downstream nodes, they are the logical point of contact for any problems the downstream sites experience with IDD connectivity.

Leaf Nodes - The system administration requirements are much less stringent for these nodes. Problems that might occur at these sites, or with their network connectivity, affect only them and no other sites. Also, since the only data flowing into their site is issued by the IDD, the time spent on monitoring the system is correspondingly less.

Site Administration Skills:

This section contains a discussion of the types of skills a site administrator will need in order to install, tune, and maintain an IDD site. It is broken down into 3 skill areas:

UNIX system administration - First and foremost, an IDD site administrator needs basic knowledge of UNIX system administration. This includes knowing:

All of these skills are directly related to the LDM4 software. In addition to the above, if the site administrator is also the UNIX system administrator, then that person must also have knowledge of the operating system and how to maintain, tune, and upgrade it.

Without these skills, the task of working with the LDM4 is greatly complicated. Most of these skills can be learned by reading the system manuals, as well as available books on UNIX system administration. In addition, courses for novices are offered both by computer manufacturers and by private training organizations.

Network skills - A basic understanding of TCP/IP protocols and Remote Procedure Calls (RPCs) is necessary to properly install the LDM4 and to troubleshoot problems. Beyond that, for source and relay nodes, a fuller understanding of the Internet, its protocols, and its underlying infrastructure is critical. These sites need to be able to monitor the status of their network connection and the connection to their downstream nodes. A familiarity with utilities such as netstat, ping, and rpcinfo is needed to help diagnose network and RPC problems.

Many of these skills can be learned from the system manuals that come with a site's computer. In addition, there are many books on the subject, as well as a number of courses offered both by computer manufacturers and private training organizations.
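
To make the role of netstat, ping, and rpcinfo concrete, the Python sketch below builds and runs one command line for each. It is an illustration only. The flag spellings shown (`ping -c 3`, `rpcinfo -p`, `netstat -i`) are common forms but vary between UNIX variants, so treat them as assumptions and check the local manual pages. The function names and the host name in the usage note are hypothetical.

```python
import subprocess


def diagnostic_commands(host):
    """Build command lines for the utilities named in the text.

    Flag spellings differ between UNIX variants; these are common forms
    and are assumptions, not guaranteed for every platform.
    """
    return {
        "ping": ["ping", "-c", "3", host],    # is the host reachable?
        "rpcinfo": ["rpcinfo", "-p", host],   # RPC programs registered on host
        "netstat": ["netstat", "-i"],         # local interface statistics
    }


def run_diagnostics(host, timeout=30):
    """Run each command; return {name: (exit_status, combined_output)}.

    exit_status is None when the command is missing or times out.
    """
    results = {}
    for name, argv in diagnostic_commands(host).items():
        try:
            done = subprocess.run(argv, capture_output=True, text=True,
                                  timeout=timeout)
            results[name] = (done.returncode, done.stdout + done.stderr)
        except (OSError, subprocess.TimeoutExpired) as exc:
            results[name] = (None, str(exc))
    return results
```

Calling `run_diagnostics("downstream.example.edu")` with a real downstream host name returns the exit status and output of each utility. These can be compared against a known-good run when a feed stops arriving.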

Software skills - Whatever the other skills needed by an IDD administrator at a particular site, all sites must have the ability to handle the LDM4. This is the software package that makes the IDD work. While most sites will have the option of installing binary releases, some may opt to build the LDM4 from source code, either because they need specific customizations or because they are using a computing platform that is not supported by the UPC.

Certain software skills will be needed to build the LDM4 from source:

These skills can be obtained from the manuals that come with the C compiler, as well as many books and classes that are available.

And, of course, the site administrator needs to know how to build and maintain the LDM4. Full build and installation instructions are included with both the binary and source distributions.