
Re: NetCDF development



Hi Bert,

> Hi, I'm a programmer at the Brain Imaging Centre of the Montreal
> Neurological Institute.
> 
> As you may know, we make extensive use of NetCDF as the basis of our
> "MINC" (Medical Imaging NetCDF) data format.  We have a whole suite of
> tools for manipulating and viewing neuroimaging data in this format.

I'm aware of and impressed by your MINC format and tools.

> We're interested in defining some major new features, which may require a
> number of changes or additions to the underlying format.  Some of these
> features require support for huge files (approaching a terabyte), sparse
> volumes, data compression, and block structured data.

There is currently limited support for large files, which permits
terabyte-size netCDF files as long as some specific constraints are
met:

  http://www.unidata.ucar.edu/packages/netcdf/faq.html#lfs

I'd be interested in more details of your requirements for sparse
volumes and block structured data.

> I've read the 1997 document on the UCAR website which describes the status
> and plans for NetCDF 4.0.  Can you tell me anything about the state of
> development of NetCDF 4 today?
> 
> Is there any provision to make beta software available?

The latest beta release is netCDF-3.5.1-beta10 from

  ftp://ftp.unidata.ucar.edu/pub/netcdf/netcdf-3.5.1-beta10.tar.Z

The 1997 plans were postponed due to higher priorities, and we still
haven't gotten funding or resources for netCDF 4.  Our latest attempt
to get the necessary resources is a proposal to NASA for a joint
Unidata/NCSA collaboration to build netCDF4 on HDF5:

  http://www.unidata.ucar.edu/proposals/NASA-NRA-2002russ/description.pdf

NASA has said they would announce selections for the solicitation for
which that proposal was written "by mid-February", but I still haven't
heard anything.

We have also recently submitted a five-year proposal to the National
Science Foundation for Unidata support that includes the appended
section on netCDF development (as well as lots of other development
projects).  I'll be attending a review panel for this in April.  If it
all gets funded, we should be able to resume netCDF development soon.

Note that these proposal excerpts should probably be considered 
confidential, at least until we learn whether they get awarded.

We currently have 0.25 FTE assigned to netCDF support and development,
which is just enough to cover around 450 support questions/year and a
glacial pace of development and documentation improvements.

--Russ

_____________________________________________________________________

Russ Rew                                         UCAR Unidata Program
address@hidden                     http://www.unidata.ucar.edu


Endeavor 6: Improved scientific data access infrastructure 

Whether the datasets are delivered to local systems via the IDD or
accessed from remote servers using THREDDS, a key enabling component
is the data access interface. One of the most commonly used data
interfaces is Unidata's netCDF. The UPC has also developed expertise
with many other interfaces and formats for a wide variety of new data
types, by providing software to convert data from new sources into
forms that are easy for applications to analyze and visualize, and by
providing new technologies for remote data access.

Extending current activities 

A key Unidata effort has been developing and nurturing netCDF, a data
model, data format, and set of libraries for access to scientific data
and metadata by data providers and application developers. NetCDF data
is self-describing, platform-independent, directly accessible,
efficiently appendable, and shareable. NetCDF libraries for C, C++,
Fortran77, Fortran90, Perl, MATLAB, Java, and Python support access to
data in numerous open-source and commercial software packages for
analyzing and visualizing scientific data. Various research projects
in the geosciences have adopted netCDF as a standard for data access
and archives, and the recent translation of the netCDF User Guides
into Japanese at Kyoto University indicates its international
reach. Unidata is uniquely qualified to continue to evolve and support
this software for representing and accessing scientific data as one of
the most fundamental components of cyberinfrastructure.

New activities augmenting and enhancing the program 

Users of netCDF on high-end parallel platforms and with
high-resolution models have begun to encounter several limitations of
the software, which, given the pace of advances in computing, will
soon be limitations for desktop users as well. These include dataset
sizes permitted by netCDF, I/O bottlenecks in programs on parallel
computers, and difficulties interoperating with other data interfaces
and formats. Given netCDF's status as a widely used standard for data
access in the geosciences, we must overcome these limitations.

Toward that goal, we propose to advance netCDF with: 

 - Better library support for transparent, flexible packing of
   limited-resolution data, so datasets may be stored compactly for
   rapid access

 - The use of parallel I/O on multiprocessors, so that data access is
   not the primary bottleneck preventing advances in modeling and
   visualization

 - The implementation of a netCDF interface over an alternate format
   (such as HDF5), to remove some limitations with the current format

 - Further development of netCDF server technology, so remote data
   access becomes almost as simple as local data access and so that
   retrieving small subsets of large remote datasets is practical

 - Standard XML representations for netCDF data aggregations, added
   metadata, and derived data, to support third-party metadata and
   virtual datasets

 - Efficient mechanisms to append new data to existing datasets along
   multiple dimensions
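
As an illustration of the first item, packing limited-resolution data
can be sketched with the scale_factor and add_offset attribute
conventions, where a floating-point value is stored as a small integer
and recovered as packed * scale_factor + add_offset. The Python below
is only a minimal sketch under that assumption, not the library's
implementation; the function names are hypothetical and the 16-bit
width is an arbitrary choice.

```python
def compute_packing(vmin, vmax, nbits=16):
    """Return (scale_factor, add_offset) that map [vmin, vmax] onto
    signed nbits-wide integers. Hypothetical helper for illustration."""
    if vmax < vmin:
        raise ValueError("vmax must be >= vmin")
    # Use 2**nbits - 2 steps so the most negative integer stays free
    # for use as a missing-value marker.
    n_levels = 2 ** nbits - 2
    scale_factor = (vmax - vmin) / n_levels if vmax > vmin else 1.0
    add_offset = (vmax + vmin) / 2.0
    return scale_factor, add_offset


def pack(values, scale_factor, add_offset):
    """Convert floating-point values to packed integers."""
    return [int(round((v - add_offset) / scale_factor)) for v in values]


def unpack(packed, scale_factor, add_offset):
    """Recover approximate floating-point values from packed integers."""
    return [p * scale_factor + add_offset for p in packed]


# Example: temperatures in kelvin known to lie between 250 and 310.
temps = [250.0, 273.15, 310.0]
scale_factor, add_offset = compute_packing(250.0, 310.0)
packed = pack(temps, scale_factor, add_offset)
restored = unpack(packed, scale_factor, add_offset)
```

Each restored value differs from the original by at most half of
scale_factor, which is the resolution given up in exchange for storing
two bytes per value instead of four or eight.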

In addition to these improvements, users need access to higher-level
data objects with richer semantics than simple typed multidimensional
arrays. For example, VisAD's data model is richer (and more complex)
than netCDF's, representing arbitrary finite samples of continuous
functions. Recent advances in the use of databases for efficiently
storing and manipulating gridded data promise benefits for scientific
applications. We propose to enhance the next-generation data access
infrastructure available to Unidata applications to provide:

 - High-level object representations for Grid, Image, Profile, Point
   Observation, and Sequence

 - The ability to directly represent metadata currently encoded in
   file format conventions, so that applications may use metadata
   without reference to specific conventions

 - More complete XML representations, an advantage for
   interoperability with the growing set of useful web services

 - The ability to represent GIS structures, enabling use of natural
   science data in GIS applications

 - An implementation allowing efficient access to more flexibly
   specified subsets of data, to support user-level data queries

 - Improved interoperability with other representations for scientific
   data, so that applications can be independent of data format