NetCDF-4.0 Requirements
These requirements represent our understanding of
the netCDF-4.0 library. They are subject to change without notice.
Comments are welcome, and should be sent to the netcdf-hdf
mailing list: netcdf-hdf@unidata.ucar.edu
|
|
|
|
|
Backward Compatibility
|
- The entire netCDF-3.x API is supported. It can
read/write/modify netcdf-4/HDF5 data files.
- The entire netCDF-2.x API is also
fully supported.
- The netCDF-3 Fortran 77, Fortran 90, and C++ APIs can read and
write data in classic, 64-bit offset, or simple netCDF-4
formats. These APIs cannot yet access new netCDF-4 functionality.
|
|
Support for Large Files
|
- The 2 GB limits of the classic file format are relaxed on systems
which support large files. Such systems are detected at install time,
and on them the netCDF library is built with the _FILE_OFFSET_BITS=64
macro, if required.
- Users on systems that don't support large files (LFS) are warned at
install time.
|
|
Use of HDF5 for Storage
|
- NetCDF-4 uses HDF5 to store data.
- NetCDF-4 provides a complete facade. No HDF5 artifacts are used in
the public netCDF-4 API. No mixing of the APIs takes place.
|
|
Backward File Format Compatibility
|
- NetCDF-4 uses HDF5 as its storage layer. It produces valid HDF5
files, which can be read without using the netCDF interface. HDF5
files produced with netCDF-4 should be modified using the netCDF-4
interface.
- NetCDF-4 can also create/read/modify files created with previous
versions of netCDF, using the original netCDF data format.
- If the user opens an old netCDF file, and attempts to modify it,
netCDF-4 will stick with the (old) netCDF file format. New API
features (like adding groups) won't work on these files.
- Files created in netCDF-4 can be restricted to strict netCDF-3.x
functionality at file creation time. Users of these files will not be
allowed to use additional netCDF-4 features, like multiple unlimited
dimensions, groups, new types, etc.
- There is a way for users to copy an old netCDF file into the new
HDF5 data format.
|
|
Backward API Compatibility
|
- Programs written for a 3.x version of netCDF can use the netCDF-4
library by relinking.
- NetCDF-3.x error codes remain unchanged. NetCDF-4 adds some new
error codes.
- For netCDF-4 files, nc_redef and nc_enddef are automatically
called as needed.
|
|
Parallel I/O
|
- Parallel reading from and writing to netCDF files is supported,
using the HDF5 parallel I/O features.
- The installer specifies whether parallel netCDF is to be used at
install time.
- The parallel I/O features require that the MPI library be
installed.
- The netCDF-4 library (like the HDF5 library) documents its
functions as collective and independent.
- Collective HDF5 calls are not made during independent netCDF-4
operations.
- NetCDF-4 demonstrates performance gains (over netCDF-3) in modeling
contexts on advanced architectures.
|
|
Multiple Unlimited Dimensions
|
- Variables may use multiple unlimited dimensions.
- Unlimited dimensions need not be shared. That is, different
variables can have different unlimited dimensions.
- The call nc_inq_unlimdim returns the first unlimited dimension. An
additional function returns an array which contains the full list of
unlimited dims.
- Chunking is required in any dataset with one or more unlimited
dimensions in NetCDF-4.
|
|
Variable/Dataset Creation Options
|
- Chunking is required in any dataset with one or more unlimited
dimensions in HDF5. NetCDF-4 supports setting chunk parameters at
variable creation. The user can optionally select a chunking
algorithm by setting chunkalg to NC_CHUNK_SEQ (to optimize for
sequential access) or NC_CHUNK_SUB (for chunk sizes set to favor
subsetting equally in any dimension).
- When the (netCDF-3) function nc_def_var is used, a sequential
chunking algorithm will be used (just as if the variable had been
created with NC_CHUNK_SEQ).
- The sequential chunking algorithm sets a chunksize of 1 for all
unlimited dimensions, and all other chunksizes to the
size of that dimension, unless the resulting chunksize is greater than
250 KB, in which case subsequent dimensions will be set to 1 until the
chunksize is less than 250 KB (one quarter of the default chunk cache
size).
- The subsetting chunking algorithm sets the chunksize in each
dimension to the nth root of (desired chunksize/product of n
dimsizes).
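
The sequential chunking rule above can be sketched in C as follows.
This is only an illustration under stated assumptions, not the
library's implementation: seq_chunksizes is a hypothetical helper (not
part of the netCDF API), "250 KB" is taken as 250 * 1024 bytes, and
"subsequent dimensions" is read as "starting from the first
dimension", which the requirement does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: "250 KB" means 250 * 1024 bytes. */
#define CHUNK_LIMIT_BYTES ((size_t)250 * 1024)

/* Hypothetical helper, not part of the netCDF API.
 *   dimlens[i]   length of dimension i (ignored if unlimited)
 *   unlimited[i] nonzero if dimension i is unlimited
 *   type_size    size in bytes of one element
 *   chunks[i]    receives the chunk length chosen for dimension i
 * Returns the size in bytes of the resulting chunk. */
size_t seq_chunksizes(int ndims, const size_t *dimlens,
                      const int *unlimited, size_t type_size,
                      size_t *chunks)
{
    assert(ndims > 0 && type_size > 0);

    /* Step 1: 1 for unlimited dims, full length for all others. */
    size_t bytes = type_size;
    for (int i = 0; i < ndims; i++) {
        chunks[i] = unlimited[i] ? 1 : dimlens[i];
        bytes *= chunks[i];
    }

    /* Step 2: while the chunk is too large, collapse dimensions to 1,
     * starting with the first one (an assumption about the order). */
    for (int i = 0; i < ndims && bytes > CHUNK_LIMIT_BYTES; i++) {
        bytes /= chunks[i];
        chunks[i] = 1;
    }
    return bytes;
}
```

For a 4-byte variable with dimensions (unlimited, 1000, 1000), this
sketch yields chunk lengths (1, 1, 1000), i.e. a 4000-byte chunk.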
|
|
Data Types
|
- The following new atomic data types are supported: unsigned int8,
unsigned int16, unsigned int32, signed and unsigned int64.
- Attempting to use any of the new types with a netCDF-3 Classic or
64-bit Offset format file will return an error.
- Enums are supported.
- Users can define structs composed of atomic types, or previously
defined types. We call these compound types.
- Compound types have a name, which is written to the data file, and
a typeid, which is assigned when the file is read, or the type is
created. Unlike atomic types, the typeid for a compound type may
change when the file is later reopened.
- Users can read files with unknown compound types, and use netCDF-4
functions to learn about the unknown compound types.
- Users can retrieve arrays of the compound type, and also arrays of
any element of the compound type.
- The usual var/var1/vara/vars/varm functions are available for
compound types.
- Automatic data conversion for compound types is not attempted.
- Compound types may be defined in a file, even if they are not used
for any variables.
- A string data type is supported.
- Strings are stored in UTF-8 Unicode.
- String data is stored without being interpreted by the library, but an
encoding for Unicode strings may be specified with a separate
attribute (e.g. "_Encoding"). A global or group attribute could be
used to specify the encoding of all strings in a file or group.
- A variable length (vlen) type can be used to hold ragged arrays.
- Automatic data conversion for vlens is not attempted.
- The user can create named opaque types, with a fixed size.
- Automatic data conversion for opaque types is not attempted.
|
|
Hierarchical Grouping of Data
|
- NetCDF-4 users can further organize their data file items
(i.e. variables and attributes) into groups.
- Groups can be nested.
- An item can belong to only one group.
- The ncid used by netCDF functions refers to both the file and the
group within the file. Two groups in the same file will have different
ncids.
- Names are unique within a group. All netCDF-4 names (including
group names) have maximum length NC_MAX_NAME (not including the
terminating null character).
- Users can inquire about objects in a group.
- Users can iterate through the groups of multiple simultaneously
open files.
- Attributes can be attached to groups.
- Users can create vars and dims in groups.
- Dims are scoped such that dims in parent, grandparent, etc.,
groups are available to be used as dimensions.
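
The dimension scoping rule can be pictured as a lookup that walks from
a group up through its ancestors. The sketch below uses hypothetical
struct dim and struct group types and a hypothetical find_dim function;
none of these are netCDF-4 API. It also assumes that a dimension in a
nearer group hides one of the same name further up, which the
requirement above does not specify.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory model, for illustration only. */
struct dim {
    const char *name;
    size_t len;
};

struct group {
    const struct group *parent;  /* NULL for the root group */
    const struct dim *dims;      /* dimensions defined in this group */
    int ndims;
};

/* Look for a dimension by name in grp, then its parent, grandparent,
 * and so on. Returns NULL if no enclosing group defines it. */
const struct dim *find_dim(const struct group *grp, const char *name)
{
    assert(name != NULL);
    for (; grp != NULL; grp = grp->parent) {
        for (int i = 0; i < grp->ndims; i++) {
            if (strcmp(grp->dims[i].name, name) == 0)
                return &grp->dims[i];
        }
    }
    return NULL;
}
```

With a root group defining "time" and a child group defining nothing
by that name, find_dim(&child, "time") returns the root's dimension,
so a variable in the child group can use it.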
|
|
Limited Interoperability with HDF5
|
- NetCDF-4 produces valid HDF5 files, with no special netCDF-4
artifacts.
- NetCDF-4 can read and edit HDF5 files which meet the following
conditions:
- Dimension scales must be used, and all dimensions of every dataset
must have an attached dimension scale (except for the extra, private,
dimension of a VLEN type).
- Group organization must be strictly hierarchical. No circular
group structure is allowed.
- Only HDF5 atomic types which have a clear correspondence with a
netCDF-4 type are supported.
- As long as they don't use a forbidden atomic type, compound,
vlen, and opaque types are interoperable.
- Object names must be valid netCDF names (i.e. alphanumeric or
"_", ".", "+", or "-"). No spaces!
|
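
The object-name condition above amounts to a per-character check. A
minimal sketch, assuming only the character list given in that
condition: is_valid_object_name is a hypothetical helper, not a
netCDF-4 function. It does not check the NC_MAX_NAME length limit, and
rejecting empty names is an added assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper, not part of the netCDF API. Returns 1 if every
 * character of name is alphanumeric or one of "_", ".", "+", "-",
 * and 0 otherwise. Empty names are rejected (an assumption). */
int is_valid_object_name(const char *name)
{
    assert(name != NULL);
    if (*name == '\0')
        return 0;
    for (const char *p = name; *p != '\0'; p++) {
        unsigned char ch = (unsigned char)*p;
        if (!isalnum(ch) && strchr("_.+-", ch) == NULL)
            return 0;  /* spaces, slashes, etc. are not allowed */
    }
    return 1;
}
```

Under this sketch "sea_surface.temp-2" is accepted, while "bad name"
(contains a space) is rejected.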
|
Compression and Other Filters
|
- HDF5 deflate and compress filters are supported.
- Compression can be applied on a per-variable basis.
|
|
Private Dimensions
|
- A dimension can be marked private to one variable. Thereafter, it
can only be used by that variable.
|
|
Documentation
|
- A new version of the netCDF documentation includes updates to
cover all new features.
- Documents are available on the web (HTML), as PDF files, as dvi,
and as postscript files.
- Each language interface is described separately. That is, the
Fortran and C manuals are not mixed together.
- Each language interface document contains examples in native
language. For example, the C manual has C examples, C++ has C++
examples, etc.
- Each manual contains a good index which allows users to quickly
find any function or concept.
- The web site contains a search engine that will allow users to
search any subset of netcdf documents.
|
|
Distribution and Installation
|
- The netCDF-4 library is distributed separately from HDF5, by
Unidata.
- It is possible to configure the installation so that netcdf-4 is
not built into the library. In this case, only netcdf-3 code is built,
and netcdf-4 format files cannot be created or opened.
- After the release, binary distributions are supplied for tier 1
test platforms.
- Cross-compiling is not supported.
- An installation document, always available on the website, and
distributed with netCDF, describes the installation process and lists
the supported platforms.
- Configure and build output for tier 1 platforms are available
to help troubleshoot installation problems.
- The netCDF-4 library requires that HDF5 (version 1.8.0 or greater)
be installed.
- The netCDF-4 library can coexist peacefully with the netCDF-3
library, but both cannot be used in the same program due to name
clashes.
- Unix binary users get a tarball containing the library .a file for
their platform, the man pages, and binary executables of ncdump and
ncgen. The library includes the Fortran interface and, for tier 1
platforms, the (currently implemented) C++ or F90 interfaces.
- Unix users build and install netCDF-4 in one pass through the usual
configure/make test/make install cycle.
- The configure script searches the user's path for compilers,
preferring multi-platform commercial compilers, then platform-specific
commercial compilers, then GNU compilers. (Based on the assumption
that if they've paid for a compiler, and included it in their path,
they want to use it).
- The configure script attempts to correctly set flags like
CPPFLAGS, CFLAGS, etc., in accordance with the needs of the
platform. If the user has set CFLAGS, FFLAGS (for F77), FCFLAGS (for
Fortran), or CXXFLAGS, the configure script will not override these
settings. (Note that autoconf does not always follow these
conventions, and we are not going to try to make it do so.) A
configure option allows the user to turn off all netCDF-4 attempts to
change or set any flags.
- If no CFLAGS are specified, -g is used. (Setting CFLAGS to null
means no CFLAGS will be used.)
- By default, configure builds F77, F90, and C++ APIs, if it can
find a compiler to do so. The user can optionally disable these APIs.
- Windows binary users download a setup.exe file from Unidata and
launch the GUI installer for Windows.
- Windows source code users download a source code winzip file, and
find windows dependent files and IDE configurations under the win32
directory.
- NetCDF is buildable from MS Visual Studio, with the two
most recent releases of VC++. (Version 6.0 and 7.0, for the purposes
of the netCDF-4 project).
- Windows users can also use cygwin tools to build netcdf with gcc.
- Not supported: building with MinGW, and using the configure script
to find and use MS VC.
|
|
Upgrades to ncdump/CDL to Reflect New Features
|
- New data types, including structs, are supported in CDL and in
ncdump.
- NcML is supported in ncdump.
|
|
Testing
|
- All API functions are tested.
- All tests can run on any supported target platform.
- All tests are automated and can be run from makefile targets.
- Some tests may take excessive time. "make test" may skip very
lengthy tests (i.e. tests that take tens of minutes on the average
Linux workstation), and the user can use "make slowtests" to run them.
- Running "make test" or "make check" clearly indicates success or
failure at the end of the output. The return code from make indicates
success of all tests (0) or failure of any test (2).
- All supported language interfaces are tested. These include C,
C++, Fortran 77 and Fortran 90.
- For release purposes we identify the following two tiers of
testing:
- Tier 1: Portland Group compiler on Linux, one vendor compiler and
gcc/g77/g++ on AIX, Linux, HP-UX (w/o C++, F90), MacOS, and Solaris
(32-bit SPARC and i386, and 64-bit SPARC mode), and vendor compiler
(i.e. VC++ 7.0, a.k.a. VC++.NET) on Windows.
- Tier 2: FreeBSD, Irix64, OSF1 vendor and GNU compilers. Cygwin.
- We add the following for parallel programming support:
- AIX 5.1 and higher, Linux Cluster/MPICH
- Test programs do not accept or require command line options.
- Test programs provide feedback in accordance with netCDF test look
and feel.
|
|
Performance
|
- The performance of netCDF-4 is not more than 10% slower than
netCDF-3 for large contiguous data writes. It is not more than 100%
slower for any other operation.
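
One way to turn this requirement into a pass/fail check on measured
timings is sketched below. The names op_kind and
meets_perf_requirement are hypothetical, and "N% slower" is read as
"elapsed time at most (1 + N/100) times the netCDF-3 time", which is
an interpretation rather than something the requirement spells out.

```c
#include <assert.h>

/* Hypothetical classification of the operation being timed. */
enum op_kind { OP_LARGE_CONTIGUOUS_WRITE, OP_OTHER };

/* Returns 1 if the netCDF-4 time t_nc4 is within the allowed slowdown
 * relative to the netCDF-3 time t_nc3 (both in the same unit):
 * 10% for large contiguous writes, 100% for everything else. */
int meets_perf_requirement(enum op_kind kind, double t_nc3, double t_nc4)
{
    assert(t_nc3 > 0.0 && t_nc4 >= 0.0);
    double allowed_ratio =
        (kind == OP_LARGE_CONTIGUOUS_WRITE) ? 1.10 : 2.00;
    return t_nc4 <= allowed_ratio * t_nc3;
}
```

For example, a large contiguous write that takes 1.00 s with netCDF-3
and 1.05 s with netCDF-4 passes, while one that takes 1.20 s does not.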
|
|
Examples
|
- Example programs will demonstrate many of the features of netCDF.
- Each example will consist of a single code file, which the user
can easily paste from the web page into their netCDF development
environment.
- Examples include reasonable error handling.
- Examples are as brief as good coding style permits.
- Example code is well-commented.
- All examples can be built and run as one make target ("make
check").
- Examples are distributed with netCDF and compile and run on all
supported netCDF systems.
- Corresponding examples will be provided in C, F77, F90, and C++.
- Examples do not depend on each other, but may depend on a common
real data file.
- Some examples illustrate netCDF features in realistic ways as used
by the Earth Science community.
- Examples are written for programmers who are familiar with the
target language. The C examples are written for C programmers, not
people who don't know C. The same applies to the other languages.
|