NetCDF

Status Report: October 2011 - March 2012

Russ Rew, Dennis Heimbigner, Ward Fisher

Strategic Focus Areas

The netCDF group's work supports the following goals from the new Unidata strategic plan:

  1. Enable widespread, efficient access to geoscience data
    by developing netCDF and related innovative cyberinfrastructure solutions to facilitate local and remote access to scientific data.
  2. Develop and provide open-source tools for effective use of geoscience data
    by supporting the use of netCDF and related technologies for analyzing, integrating, and visualizing multidimensional geoscience data; enabling visualization and effective use of very large data sets; and accessing, managing, and sharing collections of heterogeneous data from diverse sources.
  3. Provide cyberinfrastructure leadership in data discovery, access, and use
    by developing useful data models, frameworks, and protocols for geoscience data; advancing geoscience data and metadata standards and conventions; and evaluating emerging cyberinfrastructure trends and technologies, providing information and guidance to community members.
  4. Build, support, and advocate for the diverse geoscience community
    by providing expertise in designing and implementing effective data management, conducting annual training workshops, responding to support questions, maintaining comprehensive documentation, maintaining example programs and files, and keeping online FAQs, best practices, and the netCDF web site up to date; fostering interactions between community members; and presenting community perspectives at scientific meetings, conferences, and other venues.

Activities Since the Last Status Report

Personnel Changes

In October 2011, Ed Hartnett left Unidata for a position at the University of Colorado's Laboratory for Atmospheric and Space Physics (LASP). During the last eight years, Ed was largely responsible for the development of netCDF-4, libCF, and GRIDSPEC.

We recently hired Ward Fisher to replace Ed in the netCDF group. Ward brings general software engineering and computer science skills, as well as special expertise in Windows interoperability and scientific data compression. His initial focus will be helping to improve netCDF support for Windows platforms. We hope to report success in making netCDF more useful to Windows users and developers over the next few months.

Project and Issue Tracking

The netCDF C-based project continues to use the Jira project tracker tool to manage bug reports, track issues, plan releases, and make our development process more transparent to users. Between September 2011 and March 2012, we created over 40 issues and resolved over 40 issues; 55 issues currently remain open. (Note: Jira issues vary greatly in size and in the effort required to resolve them, so the number of issues is not a useful measure of the amount of work remaining.)

Releases

Version 4.2 of the C-based netCDF software was released in March 2012, the first new release in 10 months. This release includes important performance enhancements, bug fixes, new features, and internal refactoring. The Release Notes include links to detailed descriptions of the changes in the Jira system.

In previous releases, the C, C++, and Fortran libraries have been bundled into a single release package. Beginning with version 4.2, the three libraries are being released as separate packages. Separating the libraries allows the development team to update any of the three independently of the others; this in turn will allow us to make language-specific updates without requiring users to update their environments for features they may not use. Additionally, the separation reduces our maintenance costs by making our build systems simpler.

Another change in this release is the migration of most netCDF documentation to the Doxygen system, which keeps the documentation source with the code and automates some tasks, such as generating linked HTML documents for the web site.

A new contributed implementation based on the Fortran 2003 standard for calling C libraries from Fortran solves previously irksome portability problems for developers and users of the Fortran netCDF libraries, allowing different Fortran compilers to be used with the same C library. We have made available a new beta release of the netCDF Fortran library that incorporates these changes.

A diskless (in-memory) netCDF implementation has been completed but not yet released. It will provide high performance for operations on netCDF data that fit in memory, such as changing the shape of multidimensional tiles for improved access.


Planned Activities

Ongoing Activities

We will continue the following activities:

  • Respond to C- and Fortran-based netCDF user questions and run netCDF workshops.
  • Incorporate successful features of netCDF-Java into C-based libraries.
  • Improve support for evolving Climate and Forecast (CF) conventions.
  • Improve support for netCDF on various platforms.
  • Address the needs of a growing user community for representing observational data, satellite products, and geoinformatics data.

New Activities

In the short term (six months or less), we will concentrate on improving Windows support, improving compression to be competitive with GRIB2, finishing the conversion of documentation to Doxygen, releasing the Fortran-2003 netCDF-4 software, and turning netCDF into an open-source collaboration by moving to a distributed repository and a continuous integration system.

We are beginning to enter long-term netCDF development plans into the netCDF-C project in Jira, where users and other developers can follow tasks, issues, and progress transparently. We now regularly refer users to specific Jira tickets so they can track the resolution of bugs or issues that interest them.

Relevant Metrics

During the last year, there were about 85,000 downloads from 117 countries of the C-based netCDF software from Unidata, in addition to downloads from mirror sites, package management systems, and incorporation into other software packages. More detailed metrics are available.

Other metrics that may be useful include:

  • Number of open-source software packages that can use netCDF data: 98
  • Number of commercial or licensed software packages that can use netCDF data: 21
  • Number of Google hits for "netcdf": 3,490,000
  • Number of Google images for "netcdf": 60,800
  • Number of Google Scholar entries for "netcdf": 7,480
  • Number of blog mentions of netCDF: 15,200
  • Google count of books containing the term "netcdf": 6,390
  • Amazon count of books relevant to the subject "netcdf": 23
  • Number of videos tagged as netCDF-relevant: 837
  • Number of netcdf license plates in Colorado: 1