NetCDF Status

Russ Rew
September 19, 2003

NetCDF and Unidata

Work on maintaining, supporting, and developing the netCDF data model and software falls under Endeavor 6, "Improved scientific data access infrastructure," of the Unidata 2008 proposal. NetCDF has become a key infrastructure element for producers and consumers of atmospheric science data, as well as data in other geosciences.

Recent netCDF development, both at Unidata and at other institutions, aims at improving interoperability with other representations for scientific data, making the netCDF interface more suitable for use on high-end parallel platforms with high-resolution models, and providing netCDF software on a wider range of desktop platforms.

NetCDF/HDF5 Merger

After receiving the award letter in May for the Unidata/NCSA proposal "Merging the NetCDF and HDF5 Libraries to Achieve Gains in Performance and Interoperability," we began working with the NASA-AIST program and developers at NCSA to carry out the two-year development project. We began recruiting staff and corresponding on HDF5 format details for a Java prototype. We also set up a netcdf-hdf mailing list for developers and discussed the implications of new parallel netCDF work from an Argonne/Northwestern research group. Although we requested that the start date be delayed until we could hire the needed staff, NASA determined that the project start date, relative to which all milestones are set, would be May 15, 2003.

In June, we reviewed 40 applications, conducted 11 phone interviews and 6 in-person interviews, and hired Ed Hartnett as software engineer and Robert Lee as student assistant. Robert McGrath of NCSA presented a project briefing to NCSA developers. Russ Rew and Kent Yang (NCSA) met for project discussions before a WRF workshop. Russ prepared an initial project chart and technology readiness level assessment for the NASA/ESTO project administration site.

In July, Unidata held a project kickoff meeting and new Unidata staff began work on the project. Ed created a netCDF-4 web site and requirements list, which led to email discussion and refinements. We discussed a draft netCDF-to-HDF5 mapping during a joint teleconference with NCSA, and Unidata developed a prototype to demonstrate a proposed approach to backward compatibility.

In August, we completed the file open/create/close API and Ed proposed an architecture for netCDF-4, mapping shared dimensions, error handling, grouping, attributes, and coordinate variables to HDF5 objects. A version-controlled source repository was created and we began development.

In September, Russ and Ed attended an HDF workshop in Silver Spring, MD, to get up to speed quickly on advanced HDF5 topics and to present a poster on the project plans and status. We are also recruiting a new student assistant for development at the UPC.

Parallel netCDF

In July, a group of researchers at Northwestern University and Argonne National Laboratory (Jianwei Li, Wei-keng Liao, Alok Choudhary, Robert Ross, Rajeev Thakur, William Gropp, and Rob Latham) announced a preliminary version of a parallel interface for writing and reading netCDF data, tailored for use on high-performance platforms with parallel I/O. The implementation builds on the MPI-IO interface, which makes it portable to most platforms in use and lets users benefit from the many optimizations built into MPI-IO implementations. We helped to publicize the software by announcing its availability on our portal and netCDF web site, which brought immediate help in testing the software on a variety of new parallel platforms. A technical report describes impressive performance gains in using the software, even in comparison with parallel HDF5.

NetCDF-3, NcML, and netCDF for Java development

A new release of the NcML Coordinate Systems XML schema was announced for community input and review. NcML is an XML representation of netCDF metadata (roughly, the header information one gets from a netCDF file with the "ncdump -h" command). NcML is similar to the netCDF CDL (network Common data form Description Language), except that it uses XML syntax.
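
To make the comparison between CDL and NcML concrete, here is a minimal Python sketch that renders the same small header both ways. It is an illustration only: the dataset, the "header" dictionary, and the function names are hypothetical, and the XML element and attribute names (netcdf, dimension, variable, attribute) are assumptions modeled on common NcML usage, not taken from the schema release described above, which may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical header for a tiny dataset; all names here are made up.
header = {
    "dimensions": {"time": 2, "lat": 3},
    "variables": {
        "T": {
            "type": "double",
            "shape": ["time", "lat"],
            "attributes": {"units": "degC"},
        },
    },
}


def to_cdl(name, header):
    """Render the header as CDL text, roughly what "ncdump -h" prints."""
    lines = [f"netcdf {name} {{", "dimensions:"]
    for dim, length in header["dimensions"].items():
        lines.append(f"\t{dim} = {length} ;")
    lines.append("variables:")
    for var, info in header["variables"].items():
        lines.append(f"\t{info['type']} {var}({', '.join(info['shape'])}) ;")
        for att, value in info["attributes"].items():
            lines.append(f'\t\t{var}:{att} = "{value}" ;')
    lines.append("}")
    return "\n".join(lines)


def to_ncml_like(header):
    """Build an NcML-like XML tree (element names are assumptions)."""
    root = ET.Element("netcdf")
    for dim, length in header["dimensions"].items():
        ET.SubElement(root, "dimension", name=dim, length=str(length))
    for var, info in header["variables"].items():
        var_el = ET.SubElement(
            root, "variable",
            name=var, type=info["type"], shape=" ".join(info["shape"]),
        )
        for att, value in info["attributes"].items():
            ET.SubElement(var_el, "attribute", name=att, value=value)
    return root


if __name__ == "__main__":
    print(to_cdl("example", header))
    print(ET.tostring(to_ncml_like(header), encoding="unicode"))
```

Both outputs carry the same dimensions, variables, and attributes; only the syntax differs, which is what lets generic XML tools work with the metadata.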

Version 2.1 of netCDF for Java was released in June. The release included new NcML support.

In addition, version 2.1 significantly improved access to OPeNDAP datasets, allowing full access through an extended netCDF API. Caching strategies were also added to improve performance.

Three new beta releases of netCDF version 3.5.1 were announced. Among other fixes, these addressed a significant performance problem that was first noticed on the NEC SX6 platform but turned out not to be platform-specific.