[jwlong@xxxxxxxxxxxxxxxxxxxx: ]

Hi,

I'm forwarding this summary of the Langley meeting on the "National Grid
Project" that occurred during our workshop, for background information.

--Russ


Introduction:

On May 1, 1992 there was a meeting at NASA Langley to discuss possible
choices of a library and application programming interface for
reading/writing/exchanging scientific data for use in the National Grid
Project at MSU.  In this case, the area of interest was mesh information
and variables on a mesh.  Another purpose of this meeting was to
participate in and contribute to the ongoing efforts to establish a de facto
scientific database library.  The people attending were:

Stewart Brown   LLNL - PDBLib
Linnea Cook     LLNL
Mike Folk       NCSA - netCDF/HDF merge
Adam Gaither    NGP
Chris Houck     NCSA - netCDF/HDF merge
Robert Jackson  NARL
Jeff Long       LLNL - SILO
Michael McLay   NIST
Bob Weston      NASA Langley

I (Linnea Cook) agreed to write up the results of this meeting.  The
results and action items are in the section with that title which follows
later.  I have also included the following background section to define all
the acronyms used and to give some context for the decisions which were
made.  Most of this background information was covered during the Friday
meeting at Langley.

Background:

For some time there has been interest and work in the scientific community
for having a standard library and application programming interface for
reading/writing/exchanging scientific data.  Three established de facto
standards for this are:

        o       HDF (Hierarchical Data Format) from NCSA (National Center
                for Supercomputing Applications),

        o       netCDF (network Common Data Form) from Unidata
                Program Center and 

        o       CDF (Common Data Form) from NASA Goddard.  

CDF and netCDF are very similar; in fact, netCDF is a spin-off from CDF.

Several recent important steps have occurred to further the establishment
of a single, more widely used I/O library for scientific data.  NSF has
recently decided to fund NCSA to put the netCDF interface on top of its HDF
library and to convert all of their public domain tools (Image, Layout,
etc.) to use the netCDF interface.  NCSA is in touch with Unidata
regarding this merge project.  This should help to unify two widely used
standards.

The Earth Observing System (EOS) project is expected to receive $3 billion
in funding over the next decade.  The Earth Observing System Data and
Information System (EOSDIS) part of the EOS project has selected NCSA's
netCDF/HDF merge product as its scientific data I/O library.  NCSA does not
know how many people currently use HDF.  However, when the latest version
of HDF was released to its users, 2500 people downloaded a copy of this
library within the first month.

As part of this netCDF/HDF merge project, Mike Folk (who is in charge of
HDF and the netCDF/HDF merge project) is also interested in some work which
has been done by Jeff Long at Lawrence Livermore National Laboratory (LLNL)
on the SILO library.  SILO is a library which implements an application
program interface for reading and writing scientific data.  SILO uses the
calling sequence of the netCDF library but has made two extensions to the
netCDF interface - objects and directories.  Objects allow a set of related
variables and other data to be grouped together.  Directories in SILO allow
the user to structure a database into a hierarchy that is analogous to a
UNIX file system.  It is these two extensions to netCDF (objects and
directories) which NCSA is interested in including in its netCDF/HDF merge.
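
As a rough illustration of how the two extensions fit together, here is a
toy in-memory model in Python.  It is not the SILO interface: the class,
its method names and the example variable names are all hypothetical
stand-ins, chosen only to show directories scoping names and an object
grouping related variables.

```python
# Toy in-memory model, NOT the SILO API: every name here is hypothetical.

class ToyDatabase:
    """Variables live in UNIX-like directories; objects group related names."""

    def __init__(self):
        self.variables = {}   # absolute path -> value
        self.objects = {}     # absolute path -> {component name: variable path}
        self.cwd = "/"

    def _abs(self, name):
        # Resolve a name against the current directory.
        if name.startswith("/"):
            return name
        return self.cwd.rstrip("/") + "/" + name

    def set_dir(self, path):
        self.cwd = self._abs(path)

    def write_var(self, name, value):
        self.variables[self._abs(name)] = value

    def write_object(self, name, components):
        # An object only records which variables belong together.
        self.objects[self._abs(name)] = {
            comp: self._abs(var) for comp, var in components.items()
        }

    def read_object(self, name):
        # Follow each component to its variable and return them as one group.
        refs = self.objects[self._abs(name)]
        return {comp: self.variables[path] for comp, path in refs.items()}


db = ToyDatabase()
db.set_dir("/cycle_0001")
db.write_var("x", [0.0, 1.0, 2.0])
db.write_var("pressure", [101.3, 101.1, 100.9])
db.write_object("state", {"coords": "x", "values": "pressure"})
state = db.read_object("/cycle_0001/state")
```

Here the object stores references to variables rather than copies of them,
which is one possible design; the text above does not say how SILO itself
stores an object.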

At a meeting between NCSA and LLNL it was determined that the SILO
extensions appear to be very compatible with the NCSA merge of netCDF and
HDF.  NCSA wants to add these extensions and will do so provided their user
community approves and provided they have funding to do this work.  NCSA
estimated that it will take them one month to finish the prototype version
of the netCDF/HDF merge work.  They would want to allow six months to do
the SILO extensions once this work is started.

Russ Rew (who leads the netCDF project at Unidata) and Jeff Long (the
author of SILO) are currently corresponding to refine the SILO extensions
to netCDF.  They hope to agree upon these extensions and cooperate with
NCSA and Unidata so that the same extensions are put into both the
netCDF/HDF merge and into netCDF.

Two other topics were mentioned but not resolved at the NCSA / LLNL
meeting.  These topics were the `standard' definition of some objects and
the use of a socket library interface for reading data across a network.
The SILO object extension (by itself) allows users to define their own
objects but does not assign meaning to the objects.  SLIDE (a companion
library to SILO) has, however, defined objects for mesh data commonly found
in physics simulations.  An example is a `quadmesh' (quadrilateral mesh) -
this object must include the dimension and coordinate data and also
typically includes the mesh's labelling and unit information.  NCSA and
LLNL thought it was desirable to use these mesh object definitions as a
starting point for the scientific community to define `standard' mesh
objects for use in the netCDF/HDF merge product.  However, since NCSA is
not yet funded to do the two primitive extensions (objects and directories)
it is somewhat premature to plan this.
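
To make the idea of a mesh object with required and optional parts
concrete, here is a small Python sketch.  It is not the SLIDE definition
of a quadmesh: the field names are hypothetical, and it assumes the
simplest case of one coordinate array per dimension.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class QuadMesh:
    # Required: dimension and coordinate data.
    dims: Tuple[int, ...]
    coords: Tuple[Sequence[float], ...]
    # Typically present, but optional: labelling and unit information.
    labels: Optional[Tuple[str, ...]] = None
    units: Optional[Tuple[str, ...]] = None

    def __post_init__(self):
        # Assumption: one coordinate array per dimension, of matching length.
        if len(self.coords) != len(self.dims):
            raise ValueError("need one coordinate array per dimension")
        for n, axis in zip(self.dims, self.coords):
            if len(axis) != n:
                raise ValueError("coordinate array length must match dimension")


mesh = QuadMesh(
    dims=(3, 2),
    coords=([0.0, 0.5, 1.0], [0.0, 1.0]),
    labels=("x", "y"),
    units=("cm", "cm"),
)
```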

HDF currently has a socket library interface for reading HDF data across a
network.  SILO has a similar interface for reading SILO objects across a
network.  In both cases the code which uses HDF or SILO does not know
whether the data is coming from a disk file or a network connection.  NCSA
and LLNL thought it might be possible to combine the two socket libraries
some time in the future, since each addresses a different data type, but
that it was premature to evaluate this now.
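
The point that calling code need not know where its data comes from can be
sketched in Python.  This does not use the HDF or SILO socket interfaces;
the length-prefixed record format and the function names are assumptions
made up for the example.  The same read_record() call is used on an
in-memory stand-in for a disk file and on a socket.

```python
import io
import socket
import struct


def make_record(payload):
    # Hypothetical wire format: 4-byte big-endian length, then the payload.
    return struct.pack(">I", len(payload)) + payload


def read_record(stream):
    """Read one length-prefixed record from any binary file-like object."""
    header = stream.read(4)
    (length,) = struct.unpack(">I", header)
    return stream.read(length)


record = make_record(b"mesh data")

# Source 1: bytes standing in for a disk file.
from_file = read_record(io.BytesIO(record))

# Source 2: a connected socket pair standing in for a network link.
sender, receiver = socket.socketpair()
sender.sendall(record)
sender.close()
with receiver.makefile("rb") as net_stream:
    from_network = read_record(net_stream)
receiver.close()
```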

The PDBLib (Portable Database Library) scientific database library is of
considerable interest to the National Grid Project because of its speed and
flexibility.  PDBLib is similar to HDF in that both the library and the
file it produces are portable.  One major difference between PDBLib and HDF
is that PDBLib allows the user to define C-like structures, then read and
write these structures in one operation.  The structures can contain
primitive data, pointers to primitive data, other structures, and pointers
to other structures.  PDBLib also has a more general conversion model - it
can write a file in native format and then read it on any other machine,
or it can create a file on one machine in any other machine's format.  HDF
can read/write data in a machine's native format, but such a file cannot be
moved to a machine which uses a different format.  HDF can also read/write
IEEE format on any machine - this IEEE format file is portable to any
computer.  PDBLib was developed at LLNL by Stewart Brown.  The SILO
interface is currently implemented on top of PDBLib.
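
The conversion model described above (write in the writer's own format or
in a chosen target format, convert on read) can be sketched with Python's
struct module.  This is not PDBLib: the record layout, the one-byte tag
and the function names are hypothetical, and only byte order is handled,
whereas machines can also differ in floating point format.

```python
import struct
import sys

# Hypothetical record layout: one int32 id followed by two float64 values.
FIELDS = "idd"
ORDER_CODES = {"little": "<", "big": ">"}


def pack_node(node, byteorder=sys.byteorder):
    """Write the whole record in one call, tagged with the byte order used."""
    tag = b"L" if byteorder == "little" else b"B"
    body = struct.pack(ORDER_CODES[byteorder] + FIELDS, *node)
    return tag + body


def unpack_node(blob):
    """Read the record on any machine, decoding according to the tag."""
    byteorder = "little" if blob[:1] == b"L" else "big"
    return struct.unpack(ORDER_CODES[byteorder] + FIELDS, blob[1:])


node = (7, 0.5, -1.25)
native_blob = pack_node(node)           # the writer's own byte order
foreign_blob = pack_node(node, "big")   # a chosen target byte order
```

Either blob decodes to the same record with unpack_node(), because the
reader consults the tag instead of assuming its own byte order.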

Results and Action Items from the May 1 Meeting with the National Grid Project:

The following consensus was achieved during this meeting:

        1. The National Grid Project would like to use the netCDF/HDF merge
product being developed at NCSA and is also interested in the SILO
extensions and SLIDE object definitions.

        2. The attendees agreed to send Mike Folk requirements for HDF -
some of this will be based on experience with the current HDF library.
Also, there was concern that there is functionality in PDBLib which is not
in HDF.  Stewart Brown will be sending out the PDBLib manual to interested
parties.  Those not familiar with PDBLib said they would examine its
functionality and provide Mike with feedback on which PDBLib capabilities
they would like to see as additions.  All of this should be written input.

        3. Since NCSA is interested in adding the SILO extensions to their
netCDF/HDF merge product, the people at this meeting agreed to examine the
SILO extensions and get back to Mike Folk with their feedback on these
extensions (opinions, changes, additions).  These two SILO extensions are
the directory and object primitives.  This should be written input.

        4. Mike Folk will send the SILO extensions out to his user
community for feedback.

        5. Part of the SILO document includes the definition of higher
level mesh objects on top of the netCDF interface (with the two SILO
primitives).  We agreed to look at these objects as a possible starting
point for defining 'standard' mesh objects for use across many sites.

        6. Mike Folk agreed to send us copies of all written input sent to
him by this group (possibly with some editing).  He also agreed to
put together a list of additional requirements for the netCDF/HDF merge
product and send them to us.

        7. The next step which was mentioned was that NCSA would need
funding to do additional work beyond the netCDF/HDF merge.

        8. The netCDF/HDF merge product will be available as a beta product
by the end of July and as a fully released product by May 1993.  The
National Grid Project will need a library sooner than that.  The
SILO/netCDF interface on top of PDBLib may be used as an interim solution
allowing the netCDF/HDF merge library to replace it with little or no
change to the application programming interface.

Other Information:

One desire expressed at this meeting was for a way of writing data to disk
without going through the translation to IEEE floating point format, while
still being able to read this data and translate it later, if necessary.

Another desire was to be able to write a code's internal data structures
directly to disk.  Some subsequent discussion indicated that being able to
write data quickly and with little overhead (little extra information
written to disk) was the basic requirement.  Another part of this
requirement seems to be the ability to write any data to disk without first
getting an 'approved' tag or data type implemented.  This was for use
during the development stage.  All agreed that eventually the tags would be
officially requested, granted and documented.  Since the issue of writing a
code's internal data structures directly to disk received considerable
comment, we should specifically address this in our feedback to Mike Folk.
Related to this is the question of whether it is important to be able to
use other tools (such as graphics codes) to read (and display or do other
operations on) this data.

Adam Gaither from NGP volunteered to be a contact point for disseminating
information to this group.  NGP is another good forum for pushing a
standard scientific database library and Adam wants NGP to help with and
participate in this process.

Michael McLay from NIST talked about several other related standards in the
scientific community (STEP, IGES, OMG, OSI).  He will be forwarding
information on relevant standards to the attendees.

Unless requested otherwise, all attendees were added to NCSA's mailing list
for the discussion of the netCDF/HDF merge project.  Send mail to
hdf-netcdf-request@xxxxxxxxxxxxx if you wish to be removed.

If you have any questions, corrections or additions, please call me at the
phone listed below.  Or you can send email to me through Jeff Long.  I will
send out my own email address as soon as I have one.  My name and address
were left off the list which Adam Gaither sent out, so if you would please
add my name and address to your copy of the list I would appreciate it.

--------------------------------------
Linnea Cook
Lawrence Livermore National Laboratory
B Group Leader
P. O. Box 808,   L-35
Livermore, California   94550

Desk:   510-422-1686
FAX:    510-422-3389
E-Mail: (you can reach me through Jeff Long's  E-Mail) jwlong@xxxxxxxx