Next: , Previous: NetCDF Classic Format, Up: NetCDF Classic Format


The Format in Detail

     netcdf_file  = header  data
     header       = magic  numrecs  dim_list  gatt_list  var_list
     magic        = 'C'  'D'  'F'  VERSION
     VERSION      = \x01 |                      // classic format
                    \x02                        // 64-bit offset format
     numrecs      = NON_NEG | STREAMING         // length of record dimension
     dim_list     = ABSENT | NC_DIMENSION  nelems  [dim ...]
     gatt_list    = att_list                    // global attributes
     att_list     = ABSENT | NC_ATTRIBUTE  nelems  [attr ...]
     var_list     = ABSENT | NC_VARIABLE   nelems  [var ...]
     ABSENT       = ZERO  ZERO                  // Means list is not present
     ZERO         = \x00 \x00 \x00 \x00         // 32-bit zero
     NC_DIMENSION = \x00 \x00 \x00 \x0A         // tag for list of dimensions
     NC_VARIABLE  = \x00 \x00 \x00 \x0B         // tag for list of variables
     NC_ATTRIBUTE = \x00 \x00 \x00 \x0C         // tag for list of attributes
     nelems       = NON_NEG       // number of elements in following sequence
     dim          = name  dim_length
     name         = nelems  namestring
                         // Names a dimension, variable, or attribute.
                         // Names should match the regular expression
                         // ([a-zA-Z0-9_]|{MUTF8})([^\x00-\x1F/\x7F-\xFF]|{MUTF8})*
                         // For other constraints, see "Note on names", below.
     namestring   = ID1 [IDN ...] padding
     ID1          = alphanumeric | '_'
     IDN          = alphanumeric | special1 | special2
     alphanumeric = lowercase | uppercase | numeric | MUTF8
     lowercase    = 'a'|'b'|'c'|'d'|'e'|'f'|'g'|'h'|'i'|'j'|'k'|'l'|'m'|
                    'n'|'o'|'p'|'q'|'r'|'s'|'t'|'u'|'v'|'w'|'x'|'y'|'z'
     uppercase    = 'A'|'B'|'C'|'D'|'E'|'F'|'G'|'H'|'I'|'J'|'K'|'L'|'M'|
                    'N'|'O'|'P'|'Q'|'R'|'S'|'T'|'U'|'V'|'W'|'X'|'Y'|'Z'
     numeric      = '0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9'
                                  // special1 chars have traditionally been
                                  // permitted in netCDF names.
     special1     = '_'|'.'|'@'|'+'|'-'
                                  // special2 chars are recently permitted in
                                  // names (and require escaping in CDL).
                                  // Note: '/' is not permitted.
     special2     = ' ' | '!' | '"' | '#'  | '$' | '%' | '&' | '\'' |
                    '(' | ')' | '*' | ','  | ':' | ';' | '<' | '='  |
                    '>' | '?' | '[' | '\\' | ']' | '^' | '`' | '{'  |
                    '|' | '}' | '~'
     MUTF8        = <multibyte UTF-8 encoded, NFC-normalized Unicode character>
     dim_length   = NON_NEG       // If zero, this is the record dimension.
                                  // There can be at most one record dimension.
     attr         = name  nc_type  nelems  [values ...]
     nc_type      = NC_BYTE   |
                    NC_CHAR   |
                    NC_SHORT  |
                    NC_INT    |
                    NC_FLOAT  |
                    NC_DOUBLE
     var          = name  nelems  [dimid ...]  vatt_list  nc_type  vsize  begin
                                  // nelems is the dimensionality (rank) of the
                                  // variable: 0 for scalar, 1 for vector, 2
                                  // for matrix, ...
     dimid        = NON_NEG       // Dimension ID (index into dim_list) for
                                  // variable shape.  We say this is a "record
                                  // variable" if and only if the first
                                  // dimension is the record dimension.
     vatt_list    = att_list      // Variable-specific attributes
     vsize        = NON_NEG       // Variable size.  If not a record variable,
                                  // the amount of space in bytes allocated to
                                  // the variable's data.  If a record variable,
                                  // the amount of space per record.  See "Note
                                  // on vsize", below.
     begin        = OFFSET        // Variable start location.  The offset in
                                  // bytes (seek index) in the file of the
                                  // beginning of data for this variable.
     data         = non_recs  recs
     non_recs     = [vardata ...] // The data for all non-record variables,
                                  // stored contiguously for each variable, in
                                  // the same order the variables occur in the
                                  // header.
     vardata      = [values ...]  // All data for a non-record variable, as a
                                  // block of values of the same type as the
                                  // variable, in row-major order (last
                                  // dimension varying fastest).
     recs         = [record ...]  // The data for all record variables are
                                  // stored interleaved at the end of the
                                  // file.
     record       = [varslab ...] // Each record consists of the n-th slab
                                  // from each record variable, for example
                                  // x[n,...], y[n,...], z[n,...] where the
                                  // first index is the record number, which
                                  // is the unlimited dimension index.
     varslab      = [values ...]  // One record of data for a variable, a
                                  // block of values all of the same type as
                                  // the variable in row-major order (last
                                  // index varying fastest).
     values       = bytes | chars | shorts | ints | floats | doubles
     string       = nelems  [chars]
     bytes        = [BYTE ...]  padding
     chars        = [CHAR ...]  padding
     shorts       = [SHORT ...]  padding
     ints         = [INT ...]
     floats       = [FLOAT ...]
     doubles      = [DOUBLE ...]
     padding      = <0, 1, 2, or 3 bytes to next 4-byte boundary>
                                  // Header padding uses null (\x00) bytes.  In
                                  // data, padding uses variable's fill value.
                                  // See "Note on padding", below, for a special
                                  // case.
     NON_NEG      = <non-negative INT>
     STREAMING    = \xFF \xFF \xFF \xFF   // Indicates indeterminate record
                                          // count, allows streaming data
     OFFSET       = <non-negative INT> |  // For classic format or
                    <non-negative INT64>  // for 64-bit offset format
     BYTE         = <8-bit byte>          // See "Note on byte data", below.
     CHAR         = <8-bit byte>          // See "Note on char data", below.
     SHORT        = <16-bit signed integer, Bigendian, two's complement>
     INT          = <32-bit signed integer, Bigendian, two's complement>
     INT64        = <64-bit signed integer, Bigendian, two's complement>
     FLOAT        = <32-bit IEEE single-precision float, Bigendian>
     DOUBLE       = <64-bit IEEE double-precision float, Bigendian>
                                  // following type tags are 32-bit integers
     NC_BYTE      = \x00 \x00 \x00 \x01       // 8-bit signed integers
     NC_CHAR      = \x00 \x00 \x00 \x02       // text characters
     NC_SHORT     = \x00 \x00 \x00 \x03       // 16-bit signed integers
     NC_INT       = \x00 \x00 \x00 \x04       // 32-bit signed integers
     NC_FLOAT     = \x00 \x00 \x00 \x05       // IEEE single precision floats
     NC_DOUBLE    = \x00 \x00 \x00 \x06       // IEEE double precision floats
                                  // Default fill values for each type, may be
                                  // overridden by variable attribute named
                                  // `_FillValue'. See "Note on fill values",
                                  // below.
     FILL_CHAR    = \x00                      // null byte
     FILL_BYTE    = \x81                      // (signed char) -127
     FILL_SHORT   = \x80 \x01                 // (short) -32767
     FILL_INT     = \x80 \x00 \x00 \x01       // (int) -2147483647
     FILL_FLOAT   = \x7C \xF0 \x00 \x00       // (float) 9.9692099683868690e+36
     FILL_DOUBLE  = \x47 \x9E \x00 \x00 \x00 \x00 //(double)9.9692099683868690e+36

Note on vsize: This number is the product of the dimension lengths (omitting the record dimension) and the number of bytes per value (determined from the type), increased to the next multiple of 4, for each variable. If a record variable, this is the amount of space per record (except that, for backward compatibility, it always includes padding to the next multiple of 4 bytes, even in the exceptional case noted below under “Note on padding”). The netCDF “record size” is calculated as the sum of the vsize's of all the record variables.

The vsize field is actually redundant, because its value may be computed from other information in the header. The 32-bit vsize field is not large enough to contain the size of variables that require more than 2^32 - 4 bytes, so 2^32 - 1 is used in the vsize field for such variables.

Note on names: Earlier versions of the netCDF C-library reference implementation enforced a more restricted set of characters in creating new names, but permitted reading names containing arbitrary bytes. This specification extends the permitted characters in names to include multi-byte UTF-8 encoded Unicode and additional printing characters from the US-ASCII alphabet. The first character of a name must be alphanumeric, a multi-byte UTF-8 character, or '_' (reserved for special names with meaning to implementations, such as the “_FillValue” attribute). Subsequent characters may also include printing special characters, except for '/' which is not allowed in names. Names that have trailing space characters are also not permitted.

Implementations of the netCDF classic and 64-bit offset format must ensure that names are normalized according to Unicode NFC normalization rules during encoding as UTF-8 for storing in the file header. This is necessary to ensure that gratuitous differences in the representation of Unicode names do not cause anomalies in comparing files and querying data objects by name.

Note on streaming data: The largest possible record count, 2^32 - 1, is reserved to indicate an indeterminate number of records. This means that the number of records in the file must be determined by other means, such as reading them or computing the current number of records from the file length and other information in the header. It also means that the numrecs field in the header will not be updated as records are added to the file. [This feature is not yet implemented].

Note on padding: In the special case when there is only one record variable and it is of type character, byte, or short, no padding is used between record slabs, so records after the first record do not necessarily start on four-byte boundaries. However, as noted above under “Note on vsize”, the vsize field is computed to include padding to the next multiple of 4 bytes. In this case, readers should ignore vsize and assume no padding. Writers should store vsize as if padding were included.

Note on byte data: It is possible to interpret byte data as either signed (-128 to 127) or unsigned (0 to 255). When reading byte data through an interface that converts it into another numeric type, the default interpretation is signed. There are various attribute conventions for specifying whether bytes represent signed or unsigned data, but no standard convention has been established. The variable attribute “_Unsigned” is reserved for this purpose in future implementations.

Note on char data: Although the characters used in netCDF names must be encoded as UTF-8, character data may use other encodings. The variable attribute “_Encoding” is reserved for this purpose in future implementations.

Note on fill values: Because data variables may be created before their values are written, and because values need not be written sequentially in a netCDF file, default “fill values” are defined for each type, for initializing data values before they are explicitly written. This makes it possible to detect reading values that were never written. The variable attribute “_FillValue”, if present, overrides the default fill value for a variable. If _FillValue is defined then it should be scalar and of the same type as the variable.

Fill values are not required, however, because netCDF libraries have traditionally supported a “no fill” mode when writing, omitting the initialization of variable values with fill values. This makes the creation of large files faster, but also eliminates the possibility of detecting the inadvertent reading of values that haven't been written.