NetCDF  4.4.1.1
NetCDF Utilities

CDL Guide

CDL Syntax

Below is an example of CDL, describing a netCDF classic format file with several named dimensions (lat, lon, time), variables (z, t, p, rh, lat, lon, time), variable attributes (units, _FillValue, valid_range), and some data.

netcdf foo { // example netCDF specification in CDL
dimensions:
    lat = 10, lon = 5, time = unlimited;
variables:
    int lat(lat), lon(lon), time(time);
    float z(time,lat,lon), t(time,lat,lon);
    double p(time,lat,lon);
    int rh(time,lat,lon);
    lat:units = "degrees_north";
    lon:units = "degrees_east";
    time:units = "seconds";
    z:units = "meters";
    z:valid_range = 0., 5000.;
    p:_FillValue = -9999.;
    rh:_FillValue = -1;
data:
    lat = 0, 10, 20, 30, 40, 50, 60, 70, 80, 90;
    lon = -140, -118, -96, -84, -52;
}

All CDL statements are terminated by a semicolon. Spaces, tabs, and newlines can be used freely for readability. Comments may follow the double slash characters '//' on any line.

A CDL description for a classic model file consists of three optional parts: dimensions, variables, and data. The variable part may contain variable declarations and attribute assignments. For the enhanced model supported by netCDF-4, a CDL description may also include groups, subgroups, and user-defined types.

A dimension is used to define the shape of one or more of the multidimensional variables described by the CDL description. A dimension has a name and a length. At most one dimension in a classic CDL description can have the unlimited length, which means a variable using this dimension can grow to any length (like a record number in a file). Any number of dimensions can be declared of unlimited length in CDL for an enhanced model file.

A variable represents a multidimensional array of values of the same type. A variable has a name, a data type, and a shape described by its list of dimensions. Each variable may also have associated attributes (see below) as well as data values. The name, data type, and shape of a variable are specified by its declaration in the variable section of a CDL description. A variable may have the same name as a dimension; by convention such a variable contains coordinates of the dimension it names.

An attribute contains information about a variable or about the whole netCDF dataset or containing group. Attributes may be used to specify such properties as units, special values, maximum and minimum valid values, and packing parameters. Attribute information is represented by single values or one-dimensional arrays of values. For example, “units” might be an attribute represented by a string such as “celsius”. An attribute has an associated variable, a name, a data type, a length, and a value. In contrast to variables that are intended for data, attributes are intended for ancillary data or metadata (data about data).

In CDL, an attribute is designated by a variable and attribute name, separated by a colon (':'). It is possible to assign global attributes to the netCDF dataset as a whole by omitting the variable name and beginning the attribute name with a colon (':'). The data type of an attribute in CDL, if not explicitly specified, is derived from the type of the value assigned to it. In the netCDF-4 enhanced model, attributes may be declared to be of user-defined type, like variables.

The length of an attribute is the number of data values assigned to it. Multiple values are assigned to non-character attributes by separating the values with commas (','). All values assigned to an attribute must be of the same type. In the classic data model, character arrays are used for textual information. The length of a character attribute is the number of bytes, and an array of character values can be represented in string notation. In the enhanced data model of netCDF-4, variable-length strings are available as a primitive type, and the length of a string attribute is the number of string values assigned to it.

In CDL, just as for netCDF, the names of dimensions, variables and attributes (and, in netCDF-4 files, groups, user-defined types, compound member names, and enumeration symbols) consist of arbitrary sequences of alphanumeric characters, underscore '_', period '.', plus '+', hyphen '-', or at sign '@', but beginning with a letter or underscore. However names commencing with underscore are reserved for system use. Case is significant in netCDF names. A zero-length name is not allowed. Some widely used conventions restrict names to only alphanumeric characters or underscores. Names that have trailing space characters are also not permitted.
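The basic character rule in this paragraph can be sketched as a regular expression. The following Python fragment is an illustration only, not code from the netCDF library: the helper name is_basic_cdl_name is hypothetical, and it covers only the characters listed above, not escaped special characters or UTF-8 names.

```python
import re

# Basic rule from the paragraph above: a letter or underscore first, then
# alphanumerics, '_', '.', '+', '-', or '@'. Hypothetical helper, not a
# netCDF API; escaped and UTF-8 names are out of scope here.
_BASIC_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.+@-]*")

def is_basic_cdl_name(name: str) -> bool:
    """Return True if name matches the basic character rule."""
    return _BASIC_NAME.fullmatch(name) is not None

print(is_basic_cdl_name("valid_range"))   # True
print(is_basic_cdl_name("2m_temp"))       # False: starts with a digit
print(is_basic_cdl_name(""))              # False: zero-length name
```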

Beginning with versions 3.6.3 and 4.0, names may also include UTF-8 encoded Unicode characters as well as other special characters, except for the character '/', which may not appear in a name (because it is reserved for path names of nested groups). In CDL, most special characters are escaped with a backslash '\' character, but that character is not actually part of the netCDF name. The special characters that do not need to be escaped in CDL names are underscore '_', period '.', plus '+', hyphen '-', or at sign '@'. The formal specification of CDL name syntax is provided in the classic format specification (see NetCDF Classic Format). Note that by using special characters in names, you may make your data not compliant with conventions that have more stringent requirements on valid names for netCDF components, for example the CF Conventions.

The names for the primitive data types are reserved words in CDL, so names of variables, dimensions, and attributes must not be primitive type names.

The optional data section of a CDL description is where netCDF variables may be initialized. The syntax of an initialization is simple:

variable = value_1, value_2, ... ;

The comma-delimited list of constants may be separated by spaces, tabs, and newlines. For multidimensional arrays, the last dimension varies fastest. Thus, row-order rather than column order is used for matrices. If fewer values are supplied than are needed to fill a variable, it is extended with the fill value. The types of constants need not match the type declared for a variable; coercions are done to convert integers to floating point, for example. All meaningful type conversions among numeric primitive types are supported.
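To make the ordering rule concrete, here is a minimal Python sketch that maps a flat value list onto array indices with the last dimension varying fastest, and pads a short list with a placeholder for the fill value. The helper layout_row_major is hypothetical and is not part of ncgen; rejecting an over-long list is this sketch's own choice.

```python
from itertools import product

def layout_row_major(shape, values, fill="_"):
    """Map a flat CDL value list onto indices, last dimension fastest.

    Missing trailing values are padded with `fill`, a placeholder standing
    in for the variable's fill value. Hypothetical helper for illustration.
    """
    total = 1
    for n in shape:
        total *= n
    if len(values) > total:
        # This sketch's own choice: refuse lists that are too long.
        raise ValueError("more values than the variable can hold")
    padded = list(values) + [fill] * (total - len(values))
    # itertools.product iterates the last axis fastest, i.e. row-major order.
    indices = product(*(range(n) for n in shape))
    return dict(zip(indices, padded))

cells = layout_row_major((2, 3), [1, 2, 3, 4])
for index, value in cells.items():
    print(index, value)
```

With a 2 by 3 variable and four values, the first row receives 1, 2, 3, and the second row receives 4 followed by two fill placeholders.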

A special notation for fill values is supported: the ‘_’ character designates a fill value for variables.

CDL Data Types

The CDL primitive data types for the classic model are:

  • char - Characters.
  • byte - Eight-bit integers.
  • short - 16-bit signed integers.
  • int - 32-bit signed integers.
  • long - (Deprecated, synonymous with int)
  • float - IEEE single-precision floating point (32 bits).
  • real - (Synonymous with float).
  • double - IEEE double-precision floating point (64 bits).

NetCDF-4 supports the additional primitive types:

  • ubyte - Unsigned eight-bit integers.
  • ushort - Unsigned 16-bit integers.
  • uint - Unsigned 32-bit integers.
  • int64 - 64-bit signed integers.
  • uint64 - Unsigned 64-bit integers.
  • string - Variable-length string of characters

Except for the added numeric data-types byte and ubyte, CDL supports the same numeric primitive data types as C. For backward compatibility, in declarations primitive type names may be specified in either upper or lower case.

The byte type differs from the char type in that it is intended for numeric data, and the zero byte has no special significance, as it may for character data. In the classic data model, byte data could be interpreted as either signed (-128 to 127) or unsigned (0 to 255). When reading byte data in a way that converts it into another numeric type, the default interpretation is signed. The netCDF-4 enhanced data model added an unsigned byte type.

The short type holds values between -32768 and 32767. The ushort type holds values between 0 and 65535. The int type can hold values between -2147483648 and 2147483647. The uint type holds values between 0 and 4294967295. The int64 type can hold values between -9223372036854775808 and 9223372036854775807. The uint64 type can hold values between 0 and 18446744073709551615.
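The limits of each integer type follow directly from its bit width. The Python fragment below is a small check of that arithmetic, assuming two's-complement signed types; the names int_range and CDL_INT_TYPES are hypothetical and exist only for this illustration.

```python
def int_range(bits, signed):
    """Return (min, max) for a two's-complement integer of the given width."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1

# CDL type name -> (bit width, signed?)
CDL_INT_TYPES = {
    "short": (16, True), "ushort": (16, False),
    "int": (32, True), "uint": (32, False),
    "int64": (64, True), "uint64": (64, False),
}

for name, (bits, signed) in CDL_INT_TYPES.items():
    lo, hi = int_range(bits, signed)
    print(f"{name:7s} {lo} .. {hi}")
```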

The float type can hold values between about -3.4e+38 and 3.4e+38, with external representation as 32-bit IEEE normalized single-precision floating-point numbers. The double type can hold values between about -1.7e+308 and 1.7e+308, with external representation as 64-bit IEEE standard normalized double-precision floating-point numbers. The string type holds variable-length strings.

A netCDF-4 string is a variable-length array of Unicode characters (http://unicode.org/). When reading or writing a string to a netCDF file or other external representation, the characters are UTF-8 encoded (http://en.wikipedia.org/wiki/UTF-8; note that ASCII is a subset of UTF-8). Libraries may use different internal representations; for example, the Java library uses UTF-16 encoding.

The netCDF char type contains uninterpreted characters, one character per byte. Typically these contain 7-bit ASCII characters, but the character encoding is application specific. For this reason, applications writing data using the enhanced data model are encouraged to use the netCDF-4 string data type in preference to the char data type. Applications writing string data using the char data type are encouraged to add the special variable attribute "_Encoding" with a value that the netCDF libraries recognize. Currently those valid values are "UTF-8" or "ASCII", case insensitive.

CDL Notation for Data Constants

This section describes the CDL notation for constants.

Attributes are initialized in the variables section of a CDL description by providing a list of constants that determines the attribute's length and type (if primitive and not explicitly declared). CDL defines a syntax for constant values that permits distinguishing among different netCDF primitive types. The syntax for CDL constants is similar to C syntax, with type suffixes appended to bytes, shorts, and floats to distinguish them from ints and doubles.

A byte constant is represented by an integer constant with a 'b' (or 'B') appended. In the old netCDF-2 API, byte constants could also be represented using single characters or standard C character escape sequences such as 'a' or '\n'. This is still supported for backward compatibility, but deprecated to make the distinction clear between the numeric byte type and the textual char type. Example byte constants include:

0b // a zero byte
-1b // -1 as an 8-bit byte
255b // also -1 as a signed 8-bit byte
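The reason 255b and -1b denote the same stored byte is that both share the bit pattern 0xFF, which reads as -1 under the default signed interpretation. A minimal Python sketch of that reinterpretation follows; the helper name as_signed_byte is hypothetical.

```python
import struct

def as_signed_byte(value):
    """Reinterpret the low 8 bits of value as a signed byte."""
    return struct.unpack("b", struct.pack("B", value & 0xFF))[0]

print(as_signed_byte(255))  # -1
print(as_signed_byte(0))    # 0
```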

Character constants are enclosed in double quotes. A character array may be represented as a string enclosed in double quotes. Multiple CDL strings are concatenated into a single array of characters, permitting long character arrays to appear on multiple lines. To support multiple variable-length textual values, a conventional delimiter such as ',' or blank may be used, but interpretation of any such convention for a delimiter must be implemented in software above the netCDF library layer. The usual escape conventions for C strings are honored. For example:

"a" // ASCII 'a'
"Two\nlines\n" // a 10-character string with two embedded newlines
"a bell:\007" // a character array containing an ASCII bell
"ab","cde" // the same as "abcde"

The form of a short constant is an integer constant with an 's' or 'S' appended. If a short constant begins with '0', it is interpreted as octal. When it begins with '0x', it is interpreted as a hexadecimal constant. For example:

2s // a short 2
0123s // octal
0x7ffs // hexadecimal

The form of an int constant is an ordinary integer constant. If an int constant begins with '0', it is interpreted as octal. When it begins with '0x', it is interpreted as a hexadecimal constant. Examples of valid int constants include:

-2
0123 // octal
0x7ff // hexadecimal
1234567890L // deprecated, uses old long suffix
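The octal and hexadecimal rules can be checked with a few lines of Python. This is a sketch under stated assumptions, not the ncgen parser: the helper parse_cdl_int is hypothetical and handles only plain int constants, without type suffixes such as 's' or 'L'.

```python
def parse_cdl_int(text):
    """Parse a plain CDL int constant: decimal, 0-prefixed octal, or 0x hex.

    Hypothetical helper; type suffixes such as 's' or 'L' are not handled.
    """
    sign = -1 if text.startswith("-") else 1
    digits = text.lstrip("+-")
    if digits.lower().startswith("0x"):
        return sign * int(digits, 16)
    if len(digits) > 1 and digits.startswith("0"):
        return sign * int(digits, 8)
    return sign * int(digits, 10)

print(parse_cdl_int("0123"))   # 83, since 0123 is octal
print(parse_cdl_int("0x7ff"))  # 2047
print(parse_cdl_int("-2"))     # -2
```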

The float type is appropriate for representing data with about seven significant digits of precision. The form of a float constant is the same as a C floating-point constant with an 'f' or 'F' appended. A decimal point is required in a CDL float to distinguish it from an integer. For example, the following are all acceptable float constants:

-2.0f
3.14159265358979f // will be truncated to less precision
1.f
.1f
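The loss of precision noted in the comment above can be observed by rounding a double to 32 bits. The Python sketch below uses the standard struct module for the conversion; the helper name to_float32 is hypothetical.

```python
import struct

def to_float32(value):
    """Round a Python float (a double) to the nearest 32-bit IEEE value."""
    return struct.unpack("f", struct.pack("f", value))[0]

stored = to_float32(3.14159265358979)
print(repr(stored))     # digits beyond about the seventh differ from the input
print(f"{stored:.7g}")  # 3.141593
```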

The double type is appropriate for representing floating-point data with about 16 significant digits of precision. The form of a double constant is the same as a C floating-point constant. An optional 'd' or 'D' may be appended. A decimal point is required in a CDL double to distinguish it from an integer. For example, the following are all acceptable double constants:

-2.0
3.141592653589793
1.0e-20
1.d

Unsigned integer constants can be created by placing the character 'U' or 'u' between the constant and any trailing size specifier. Thus one could say 10U, 100us, 100000ul, or 1000000ull, for example.

Constants for the variable-length string type, available as a primitive type in the netCDF-4 enhanced data model, are represented using double quotes, like character constants. This creates a potential ambiguity, since a multi-character string may also indicate a dimensioned character value. Disambiguation usually occurs by context, but care should be taken to specify the string type to ensure the proper choice. For example, these two CDL specifications of global attributes have different types:

:att1 = "abcd", "efg" ; // a char attribute of length 7
string :att2 = "abcd", "efg" ; // a string attribute of length 2
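The difference in length between the two attributes can be modeled in a few lines. This Python fragment is only an analogy for how the values are counted, not a use of any netCDF API; the variable names are hypothetical.

```python
values = ["abcd", "efg"]

# char attribute: the CDL strings are concatenated into one character array.
char_attr = "".join(values).encode("ascii")
# string attribute: each CDL string stays a separate value.
string_attr = list(values)

print(len(char_attr))    # 7
print(len(string_attr))  # 2
```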

Opaque constants are represented as sequences of hexadecimal digits preceded by 0X or 0x: 0xaa34ffff, for example. These constants can still be used as integer constants and will be either truncated or extended as necessary.

The ncgen man-page reference has more details about CDL representation of constants of user-defined types.

ncdump

Convert NetCDF file to text form (CDL)

ncdump synopsis

ncdump [-chistxw] [-v var1,...] [-b lang] [-f lang]
[-l len] [-n name] [-p n[,n]] [-g grp1,...] file
ncdump -k file

ncdump description

The ncdump utility generates a text representation of a specified netCDF file on standard output, optionally excluding some or all of the variable data in the output. The text representation is in a form called CDL (network Common Data form Language) that can be viewed, edited, or serve as input to ncgen, a companion program that can generate a binary netCDF file from a CDL file. Hence ncgen and ncdump can be used as inverses to transform the data representation between binary and text representations. See ncgen documentation for a description of CDL and netCDF representations.

ncdump may also be used to determine what kind of netCDF file is used (which variant of the netCDF file format) with the -k option.

If DAP support was enabled when ncdump was built, the file name may specify a DAP URL. This allows ncdump to access data sources from DAP servers, including data in other formats than netCDF. When used with DAP URLs, ncdump shows the translation from the DAP data model to the netCDF data model.

ncdump may also be used as a simple browser for netCDF data files, to display the dimension names and lengths; variable names, types, and shapes; attribute names and values; and optionally, the values of data for all variables or selected variables in a netCDF file. For netCDF-4 files, groups and user-defined types are also included in ncdump output.

ncdump uses '_' to represent data values that are equal to the '_FillValue' attribute for a variable, intended to represent data that has not yet been written. If a variable has no '_FillValue' attribute, the default fill value for the variable type is used unless the variable is of byte type.

ncdump defines a default display format used for each type of netCDF data, but this can be changed if a 'C_format' attribute is defined for a netCDF variable. In this case, ncdump will use the 'C_format' attribute to format each value. For example, if floating-point data for the netCDF variable 'Z' is known to be accurate to only three significant digits, it would be appropriate to use the variable attribute

Z:C_format = "%.3g"
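The effect of such a format string can be previewed outside ncdump. Python's % operator follows the C printf conventions for the %g conversion, so the sketch below should give a reasonable idea of the result; the helper format_values is hypothetical, and the surrounding layout of real ncdump output is not reproduced.

```python
def format_values(values, c_format="%.3g"):
    """Format each value with a printf-style format string."""
    return [c_format % v for v in values]

print(format_values([1234.5678, 0.0123456, 2.0]))
# ['1.23e+03', '0.0123', '2']
```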

ncdump options

-c
Show the values of coordinate variables (1D variables with the same names as dimensions) as well as the declarations of all dimensions, variables, attribute values, groups, and user-defined types. Data values of non-coordinate variables are not included in the output. This is usually the most suitable option to use for a brief look at the structure and contents of a netCDF file.
-h
Show only the header information in the output, that is, output only the declarations for the netCDF dimensions, variables, attributes, groups, and user-defined types of the input file, but no data values for any variables. The output is identical to using the '-c' option except that the values of coordinate variables are not included. (At most one of '-c' or '-h' options may be present.)
-v var1,...
The output will include data values for the specified variables, in addition to the declarations of all dimensions, variables, and attributes. One or more variables must be specified by name in the comma-delimited list following this option. The list must be a single argument to the command, hence cannot contain unescaped blanks or other white space characters. The named variables must be valid netCDF variables in the input-file. A variable within a group in a netCDF-4 file may be specified with an absolute path name, such as '/GroupA/GroupA2/var'. Use of a relative path name such as 'var' or 'grp/var' specifies all matching variable names in the file. The default, without this option and in the absence of the '-c' or '-h' options, is to include data values for all variables in the output.
-b [c|f]
A brief annotation in the form of a CDL comment (text beginning with the characters '//') will be included in the data section of the output for each 'row' of data, to help identify data values for multidimensional variables. If lang begins with 'C' or 'c', then C language conventions will be used (zero-based indices, last dimension varying fastest). If lang begins with 'F' or 'f', then FORTRAN language conventions will be used (one-based indices, first dimension varying fastest). In either case, the data will be presented in the same order; only the annotations will differ. This option may be useful for browsing through large volumes of multidimensional data.
-f [c|f]
Full annotations in the form of trailing CDL comments (text beginning with the characters '//') for every data value (except individual characters in character arrays) will be included in the data section. If lang begins with 'C' or 'c', then C language conventions will be used. If lang begins with 'F' or 'f', then FORTRAN language conventions will be used. In either case, the data will be presented in the same order; only the annotations will differ. This option may be useful for piping data into other filters, since each data value appears on a separate line, fully identified. (At most one of '-b' or '-f' options may be present.)
-l length
Changes the default maximum line length (80) used in formatting lists of non-character data values.
-n name
CDL requires a name for a netCDF file, for use by 'ncgen -b' in generating a default netCDF file name. By default, ncdump constructs this name from the last component of the file name of the input netCDF file by stripping off any extension it has. Use the '-n' option to specify a different name. Although the output file name used by 'ncgen -b' can be specified, it may be wise to have ncdump change the default name to avoid inadvertently overwriting a valuable netCDF file when using ncdump, editing the resulting CDL file, and using 'ncgen -b' to generate a new netCDF file from the edited CDL file.
-p float_digits[, double_digits ]
Specifies default precision (number of significant digits) to use in displaying floating-point or double precision data values for attributes and variables. If specified, this value overrides the value of the C_format attribute, if any, for a variable. Floating-point data will be displayed with float_digits significant digits. If double_digits is also specified, double-precision values will be displayed with that many significant digits. In the absence of any '-p' specifications, floating-point and double-precision data are displayed with 7 and 15 significant digits respectively. CDL files can be made smaller if less precision is required. If both floating-point and double precisions are specified, the two values must appear separated by a comma (no blanks) as a single argument to the command. (To represent every last bit of precision in a CDL file for all possible floating-point values would require '-p 9,17'.)
-k
Show kind of netCDF file, that is, which format variant the file uses. Other options are ignored if this option is specified. Output will be one of 'classic', '64-bit offset', 'netCDF-4', or 'netCDF-4 classic model'.
-s
Specifies that special virtual and hidden attributes should be output for the file format variant and for variable properties such as compression, chunking, and other properties specific to the format implementation that are primarily related to performance rather than the logical schema of the data. All the special virtual attributes begin with '_' followed by an upper-case letter. Currently they include the global attributes '_Format', '_NCProperties', '_IsNetcdf4', '_SuperblockVersion' and the variable attributes '_ChunkSizes', '_DeflateLevel', '_Endianness', '_Fletcher32', '_NoFill', '_Shuffle', and '_Storage'. The ncgen utility recognizes these attributes and supports them appropriately. For '_NCProperties', '_IsNetcdf4', and '_SuperblockVersion', the term 'appropriately' means that they are ignored.
-t
Controls display of time data, if stored in a variable that uses a udunits compliant time representation such as 'days since 1970-01-01' or 'seconds since 2009-03-15 12:01:17'. If this option is specified, time values are displayed as human-readable date-time strings rather than numerical values, interpreted in terms of a 'calendar' variable attribute, if specified. For numeric attributes of time variables, the human-readable time value is displayed after the actual value, in an associated CDL comment. Calendar attribute values interpreted with this option include the CF Conventions values 'gregorian' or 'standard', 'proleptic_gregorian', 'noleap' or '365_day', 'all_leap' or '366_day', '360_day', and 'julian'.
-i
Same as the '-t' option, except output time data as date-time strings with ISO-8601 standard 'T' separator, instead of a blank.
-g grp1,...
The output will include data values only for the specified groups. One or more groups must be specified by name in the comma-delimited list following this option. The list must be a single argument to the command. The named groups must be valid netCDF groups in the input-file. The default, without this option and in the absence of the '-c' or '-h' options, is to include data values for all groups in the output.
-w
For file names that request remote access using DAP URLs, access data with client-side caching of entire variables.
-x
Output XML (NcML) instead of CDL. The NcML does not include data values. The NcML output option currently only works for netCDF classic model data.
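The '-t' and '-i' options described above differ only in the separator between date and time. The following Python sketch shows the kind of conversion involved, under clearly limited assumptions: the helper decode_time is hypothetical, it handles only 'seconds' and 'days' units with Python's own (proleptic Gregorian) calendar, and the exact text ncdump prints may differ.

```python
from datetime import datetime, timedelta

def decode_time(offset, units):
    """Convert an offset in units like 'seconds since 2009-03-15 12:01:17'.

    Hypothetical helper: only 'seconds' and 'days' are handled, and the
    reference time must use one of the two layouts parsed below.
    """
    unit, _, reference = units.partition(" since ")
    if " " in reference:
        start = datetime.strptime(reference, "%Y-%m-%d %H:%M:%S")
    else:
        start = datetime.strptime(reference, "%Y-%m-%d")
    step = {"seconds": timedelta(seconds=1), "days": timedelta(days=1)}[unit]
    return start + offset * step

t = decode_time(3600, "seconds since 2009-03-15 12:01:17")
print(str(t))          # blank separator, in the style of -t
print(t.isoformat())   # 'T' separator, in the style of -i
```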

ncdump examples

Look at the structure of the data in the netCDF file foo.nc:

ncdump -c foo.nc

Produce an annotated CDL version of the structure and data in the netCDF file foo.nc, using C-style indexing for the annotations:

ncdump -b c foo.nc > foo.cdl

Output data for only the variables uwind and vwind from the netCDF file foo.nc, and show the floating-point data with only three significant digits of precision:

ncdump -v uwind,vwind -p 3 foo.nc
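The trade-off between compact output and exact round-tripping can be explored with ordinary significant-digit formatting. The Python sketch below uses the 'g' format as a stand-in for the digit counts passed to '-p'; the helper show is hypothetical, and this is not how ncdump is implemented.

```python
def show(value, digits):
    """Format value with the given number of significant digits."""
    return f"{value:.{digits}g}"

x = 0.1 + 0.2                                 # a double that is not exactly 0.3
print(show(x, 3))                             # 0.3
print(show(x, 15), float(show(x, 15)) == x)   # 0.3 False
print(show(x, 17), float(show(x, 17)) == x)   # 0.30000000000000004 True
```

Fifteen digits give shorter text but do not always recover the stored double, while seventeen digits do.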

Produce a fully-annotated (one data value per line) listing of the data for the variable omega, using FORTRAN conventions for indices, and changing the netCDF file name in the resulting CDL file to omega:

ncdump -v omega -f fortran -n omega foo.nc > Z.cdl

Examine the translated DDS for the DAP source from the specified URL:

ncdump -h http://test.opendap.org:8080/dods/dts/test.01

Without dumping all the data, show the special virtual attributes that indicate performance-related characteristics of a netCDF-4 file:

ncdump -h -s nc4file.nc

SEE ALSO

ncgen(1), netcdf(3)

NOTE ON STRING OUTPUT

For classic, 64-bit offset or netCDF-4 classic model data, ncdump generates line breaks after embedded newlines in displaying character data. This is not done for netCDF-4 files, because netCDF-4 supports arrays of real strings of varying length.

nccopy

Copy a netCDF file, optionally changing format, compression, or chunking in the output.

nccopy synopsis

nccopy [-k kind_name] [-kind_code] [-d n] [-s] [-c chunkspec] [-u] [-w]
[-[v|V] var1,...] [-[g|G] grp1,...] [-m bufsize] [-h chunk_cache]
[-e cache_elems] [-r] infile outfile

nccopy description

The nccopy utility copies an input netCDF file in any supported format variant to an output netCDF file, optionally converting the output to any compatible netCDF format variant, compressing the data, or rechunking the data. For example, if built with the netCDF-3 library, a netCDF classic file may be copied to a netCDF 64-bit offset file, permitting larger variables. If built with the netCDF-4 library, a netCDF classic file may be copied to a netCDF-4 file or to a netCDF-4 classic model file as well, permitting data compression, efficient schema changes, larger variable sizes, and use of other netCDF-4 features.

If no output format is specified, with either -k kind_name or -kind_code, then the output will use the same format as the input, unless the input is classic or 64-bit offset and either chunking or compression is specified, in which case the output will be netCDF-4 classic model format. Attempting some kinds of format conversion will result in an error, if the conversion is not possible. For example, an attempt to copy a netCDF-4 file that uses features of the enhanced model, such as groups or variable-length strings, to any of the other kinds of netCDF formats that use the classic model will result in an error.

nccopy also serves as an example of a generic netCDF-4 program, with its ability to read any valid netCDF file and handle nested groups, strings, and user-defined types, including arbitrarily nested compound types, variable-length types, and data of any valid netCDF-4 type.

If DAP support was enabled when nccopy was built, the file name may specify a DAP URL. This may be used to convert data on DAP servers to local netCDF files.

nccopy options

-k kind_name
Use format name to specify the kind of file to be created and, by inference, the data model (i.e. netcdf-3 (classic) or netcdf-4 (enhanced)). The possible arguments are:
'nc3' or 'classic' => netCDF classic format
'nc6' or '64-bit offset' => netCDF 64-bit format
'nc4' or 'netCDF-4' => netCDF-4 format (enhanced data model)
'nc7' or 'netCDF-4 classic model' => netCDF-4 classic model format
Note: The old format numbers '1', '2', '3', '4', equivalent to the format names 'nc3', 'nc6', 'nc4', or 'nc7' respectively, are also still accepted but deprecated, due to easy confusion between format numbers and format names.
-kind_code
Use format numeric code (instead of format name) to specify the kind of file to be created and, by inference, the data model (i.e. netcdf-3 (classic) versus netcdf-4 (enhanced)). The numeric codes are:
3 => netCDF classic format
6 => netCDF 64-bit format
4 => netCDF-4 format (enhanced data model)
7 => netCDF-4 classic model format
The numeric code "7" is used because "7=3+4", specifying the format that uses the netCDF-3 data model for compatibility with the netCDF-4 storage format for performance. Credit is due to NCO for use of these numeric codes instead of the old and confusing format numbers.
-d n
For netCDF-4 output, including netCDF-4 classic model, specify deflation level (level of compression) for variable data output. 0 corresponds to no compression and 9 to maximum compression, with higher levels of compression requiring marginally more time to compress or uncompress than lower levels. Compression achieved may also depend on output chunking parameters. If this option is specified for a classic format or 64-bit offset format input file, it is not necessary to also specify that the output should be netCDF-4 classic model, as that will be the default. If this option is not specified and the input file has compressed variables, the compression will still be preserved in the output, using the same chunking as in the input by default.
Note that nccopy requires all variables to be compressed using the same compression level, but the API has no such restriction. With a program you can customize compression for each variable independently.
-s
For netCDF-4 output, including netCDF-4 classic model, specify shuffling of variable data bytes before compression or after decompression. Shuffling refers to interlacing of bytes in a chunk so that the first bytes of all values are contiguous in storage, followed by all the second bytes, and so on, which often improves compression. This option is ignored unless a non-zero deflation level is specified. Using -d0 to specify no deflation on input data that has been compressed and shuffled turns off both compression and shuffling in the output.
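The byte rearrangement that shuffling performs can be illustrated on a small buffer. This Python sketch shows the idea only and is not the filter implementation used by the netCDF-4 storage layer; the helpers shuffle and unshuffle are hypothetical.

```python
import struct

def shuffle(data, width):
    """Group byte 0 of every value, then byte 1, and so on."""
    return b"".join(data[i::width] for i in range(width))

def unshuffle(data, width):
    """Invert shuffle() for a buffer holding len(data) // width values."""
    count = len(data) // width
    return bytes(data[(i % width) * count + i // width]
                 for i in range(len(data)))

raw = struct.pack("<4i", 1, 2, 3, 4)   # four little-endian 32-bit ints
mixed = shuffle(raw, 4)
print(mixed.hex())
```

For small integers like these, the shuffled buffer ends in a long run of zero bytes, which is the kind of regularity a compressor can exploit.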
-u
Convert any unlimited size dimensions in the input to fixed size dimensions in the output. This can speed up variable-at-a-time access, but slow down record-at-a-time access to multiple variables along an unlimited dimension.
-w
Keep output in memory (as a diskless netCDF file) until output is closed, at which time the output file is written to disk. This can greatly speed up operations such as converting unlimited dimension to fixed size (-u option), chunking, rechunking, or compressing the input. It requires that available memory is large enough to hold the output file. This option may provide a larger speedup than careful tuning of the -m, -h, or -e options, and it's certainly a lot simpler.
-c chunkspec
For netCDF-4 output, including netCDF-4 classic model, specify chunking (multidimensional tiling) for variable data in the output. This is useful to specify the units of disk access, compression, or other filters such as checksums. Changing the chunking in a netCDF file can also greatly speed up access, by choosing chunk shapes that are appropriate for the most common access patterns.
The chunkspec argument is a string of comma-separated associations, each specifying a dimension name, a '/' character, and optionally the corresponding chunk length for that dimension. No blanks should appear in the chunkspec string, except possibly escaped blanks that are part of a dimension name. A chunkspec names at least one dimension, and may omit dimensions which are not to be chunked or for which the default chunk length is desired. If a dimension name is followed by a '/' character but no subsequent chunk length, the actual dimension length is assumed. If copying a classic model file to a netCDF-4 output file and not naming all dimensions in the chunkspec, unnamed dimensions will also use the actual dimension length for the chunk length. An example of a chunkspec for variables that use 'm' and 'n' dimensions might be 'm/100,n/200' to specify 100 by 200 chunks.
The chunkspec '/' that omits all dimension names and corresponding chunk lengths specifies that no chunking is to occur in the output, so can be used to unchunk all the chunked variables. To see the chunking resulting from copying with a chunkspec, use the '-s' option of ncdump on the output file.
As an I/O optimization, nccopy has a threshold for the minimum size of non-record variables that get chunked, currently 8192 bytes. In the future, use of this threshold and its size may be settable in an option.
Note that nccopy requires variables that share a dimension to also share the chunk size associated with that dimension, but the programming interface has no such restriction. If you need to customize chunking for variables independently, you will need to use the library API in a custom utility program.
-v var1,...
The output will include data values for the specified variables, in addition to the declarations of all dimensions, variables, and attributes. One or more variables must be specified by name in the comma-delimited list following this option. The list must be a single argument to the command, hence cannot contain unescaped blanks or other white space characters. The named variables must be valid netCDF variables in the input-file. A variable within a group in a netCDF-4 file may be specified with an absolute path name, such as '/GroupA/GroupA2/var'. Use of a relative path name such as 'var' or 'grp/var' specifies all matching variable names in the file. The default, without this option, is to include data values for all variables in the output.
-V var1,...
The output will include only the specified variables, along with all dimensions and global or group attributes. One or more variables must be specified by name in the comma-delimited list following this option. The list must be a single argument to the command, hence cannot contain unescaped blanks or other white space characters. The named variables must be valid netCDF variables in the input-file. A variable within a group in a netCDF-4 file may be specified with an absolute path name, such as '/GroupA/GroupA2/var'. Use of a relative path name such as 'var' or 'grp/var' specifies all matching variable names in the file. The default, without this option, is to include all variables in the output.
-g grp1,...
The output will include data values only for the specified groups. One or more groups must be specified by name in the comma-delimited list following this option. The list must be a single argument to the command. The named groups must be valid netCDF groups in the input-file. The default, without this option, is to include data values for all groups in the output.
-G grp1,...
The output will include only the specified groups. One or more groups must be specified by name in the comma-delimited list following this option. The list must be a single argument to the command. The named groups must be valid netCDF groups in the input-file. The default, without this option, is to include all groups in the output.
-m bufsize
An integer or floating-point number that specifies the size, in bytes, of the copy buffer used to copy large variables. A suffix of K, M, G, or T multiplies the copy buffer size by one thousand, million, billion, or trillion, respectively. The default is 5 Mbytes, but will be increased if necessary to hold at least one chunk of netCDF-4 chunked variables in the input file. You may want to specify a value larger than the default for copying large files over high latency networks. Using the '-w' option may provide better performance, if the output fits in memory.
-h chunk_cache
For netCDF-4 output, including netCDF-4 classic model, an integer or floating-point number that specifies the size in bytes of chunk cache allocated for each chunked variable. This is not a property of the file, but merely a performance tuning parameter for avoiding compressing or decompressing the same data multiple times while copying and changing chunk shapes. A suffix of K, M, G, or T multiplies the chunk cache size by one thousand, million, billion, or trillion, respectively. The default is 4.194304 Mbytes (or whatever was specified for the configure-time constant CHUNK_CACHE_SIZE when the netCDF library was built). Ideally, the nccopy utility should accept only one memory buffer size and divide it optimally between a copy buffer and chunk cache, but no general algorithm for computing the optimum chunk cache size has been implemented yet. Using the '-w' option may provide better performance, if the output fits in memory.
-e cache_elems
For netCDF-4 output, including netCDF-4 classic model, specifies the number of chunks that the chunk cache can hold. A suffix of K, M, G, or T multiplies the number of chunks that can be held in the cache by one thousand, million, billion, or trillion, respectively. This is not a property of the file, but merely a performance tuning parameter for avoiding compressing or decompressing the same data multiple times while copying and changing chunk shapes. The default is 1009 (or whatever was specified for the configure-time constant CHUNK_CACHE_NELEMS when the netCDF library was built). Ideally, the nccopy utility should determine an optimum value for this parameter, but no general algorithm for computing the optimum number of chunk cache elements has been implemented yet.
-r
Read a netCDF classic or 64-bit offset input file into a diskless netCDF file in memory before copying. Requires that the input file be small enough to fit into memory. For nccopy, this doesn't seem to provide any significant speedup, so it may not be a useful option.
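
The chunkspec format accepted by the -c option can be illustrated with a short Python sketch. This is not how nccopy parses its argument: parse_chunkspec is a hypothetical helper, and it assumes dimension names contain no escaped blanks, commas, or '/' characters. A chunk length of None stands for "use the actual dimension length", and an empty result stands for the bare '/' chunkspec that requests no chunking.

```python
def parse_chunkspec(spec):
    """Parse a string such as 'm/100,n/200' into {dimension: chunk_length}.

    Illustrative only: escaped blanks in dimension names are not handled.
    """
    if spec == "/":
        return {}  # a bare '/' asks for no chunking in the output
    chunks = {}
    for item in spec.split(","):
        name, sep, length = item.partition("/")
        if not name or not sep:
            raise ValueError(f"bad chunkspec item: {item!r}")
        # A missing length means "use the actual dimension length".
        chunks[name] = int(length) if length else None
    return chunks


print(parse_chunkspec("m/100,n/200"))  # {'m': 100, 'n': 200}
print(parse_chunkspec("time/"))        # {'time': None}
print(parse_chunkspec("/"))            # {}
```

Dimensions that are not named in the chunkspec do not appear in the result; how they are chunked depends on the input and output formats, as described for the -c option.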

nccopy examples

Simple Copy

Make a copy of foo1.nc, a netCDF file of any type, to foo2.nc, a netCDF file of the same type:

nccopy foo1.nc foo2.nc

Note that the above copy will not be as fast as use of cp or other simple copy utility, because the file is copied using only the netCDF API. If the input file has extra bytes after the end of the netCDF data, those will not be copied, because they are not accessible through the netCDF interface. If the original file was generated in 'No fill' mode so that fill values are not stored for padding for data alignment, the output file may have different padding bytes.

Uncompress Data

Convert a netCDF-4 classic model file, compressed.nc, that uses compression, to a netCDF-3 file classic.nc:

nccopy -k classic compressed.nc classic.nc

Note that 'nc3' could be used instead of 'classic'.

Remote Access to Data Subset

Download the variable 'time_bnds' and its associated attributes from an OPeNDAP server and copy the result to a netCDF file named 'tb.nc':

nccopy 'http://test.opendap.org/opendap/data/nc/sst.mnmean.nc.gz?time_bnds' tb.nc

Note that URLs that name specific variables as command-line arguments should generally be quoted, to avoid the shell interpreting special characters such as '?'.

Compress Data

Compress all the variables in the input file foo.nc, a netCDF file of any type, to the output file bar.nc:

nccopy -d1 foo.nc bar.nc

If foo.nc was a classic or 64-bit offset netCDF file, bar.nc will be a netCDF-4 classic model netCDF file, because the classic and 64-bit offset format variants don't support compression. If foo.nc was a netCDF-4 file with some variables compressed using various deflation levels, the output will also be a netCDF-4 file of the same type, but all the variables, including any uncompressed variables in the input, will now use deflation level 1.

Rechunk Data for Faster Access

Assume the input data includes gridded variables that use time, lat, lon dimensions, with 1000 times by 1000 latitudes by 1000 longitudes, and that the time dimension varies most slowly. Also assume that users want quick access to data at all times for a small set of lat-lon points. Accessing data for 1000 times would typically require accessing 1000 disk blocks, which may be slow.

Reorganizing the data into chunks on disk that have all the times in each chunk for a few lat and lon coordinates would greatly speed up such access. To chunk the data in the input file slow.nc, a netCDF file of any type, to the output file fast.nc, you could use:

nccopy -c time/1000,lat/40,lon/40 slow.nc fast.nc

to specify data chunks of 1000 times, 40 latitudes, and 40 longitudes. If you had enough memory to contain the output file, you could speed up the rechunking operation significantly by creating the output in memory before writing it to disk on close:

nccopy -w -c time/1000,lat/40,lon/40 slow.nc fast.nc
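
The benefit of this rechunking can be estimated with some simple arithmetic. The Python sketch below counts how many chunks must be read to fetch all 1000 times at one lat-lon point, and computes the size of one chunk. It assumes 4-byte values (the example does not state the variable type) and, for comparison, a hypothetical layout with one time step per chunk. The helper chunks_touched is illustrative and not part of any netCDF tool.

```python
import math


def chunks_touched(request, chunk_shape):
    """Number of chunks overlapped by a request that starts at index 0 on each axis."""
    return math.prod(math.ceil(r / c) for r, c in zip(request, chunk_shape))


request = (1000, 1, 1)  # all times at a single lat-lon point

time_slices = (1, 1000, 1000)  # assumed layout: one time step per chunk
rechunked = (1000, 40, 40)     # layout from the nccopy -c example

print(chunks_touched(request, time_slices))  # 1000
print(chunks_touched(request, rechunked))    # 1

bytes_per_value = 4  # assumption: 4-byte values
print(math.prod(rechunked) * bytes_per_value)  # 6400000
```

Under the 4-byte assumption each chunk holds 6.4 million bytes, which is larger than the default 4.194304 Mbyte chunk cache described for the -h option, so a larger -h value may be worth trying for a copy like this one.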

SEE ALSO

ncdump(1), ncgen(1), netcdf(3)

ncgen

The ncgen tool generates a netCDF file or a C or FORTRAN program that creates a netCDF dataset. If no options are specified in invoking ncgen, the program merely checks the syntax of the CDL input, producing error messages for any violations of CDL syntax.

The ncgen tool is now capable of producing netcdf-4 files. It operates essentially identically to the original ncgen.

The CDL input to ncgen may include data model constructs from the netcdf-4 data model. In particular, it includes new primitive types such as unsigned integers and strings, opaque data, enumerations, and user-defined constructs using vlen and compound types. The ncgen man page should be consulted for more detailed information.

UNIX syntax for invoking ncgen:

ncgen [-b] [-o netcdf-file] [-c] [-f] [-k<kind>] [-l<language>] [-x] [input-file]

where:

-b
Create a (binary) netCDF file. If the '-o' option is absent, a default
file name will be constructed from the netCDF name (specified after
the netcdf keyword in the input) by appending the '.nc'
extension. Warning: if a file already exists with the specified name
it will be overwritten.
-o netcdf-file
Name for the netCDF file created. If this option is specified, it
implies the '-b' option. (This option is necessary because netCDF
files are direct-access files created with seek calls, and hence
cannot be written to standard output.)
-c
Generate C source code that will create a netCDF dataset matching the
netCDF specification. The C source code is written to standard
output. This is only useful for relatively small CDL files, since all
the data is included in variable initializations in the generated
program. The -c flag is deprecated and the -lc flag should be used
instead.
-f
Generate FORTRAN source code that will create a netCDF dataset
matching the netCDF specification. The FORTRAN source code is written
to standard output. This is only useful for relatively small CDL
files, since all the data is included in variable initializations in
the generated program. The -f flag is deprecated and the -lf77 flag
should be used instead.
-k
The -k flag specifies the kind of netCDF file to generate. The
arguments to the -k flag can be as follows.
        'classic', 'nc3' – Produce a netcdf classic file format file.
        '64-bit offset', 'nc6' – Produce a netcdf 64 bit classic file format file.
        '64-bit data' (CDF-5), 'nc5' – Produce a CDF-5 format file.
        'netCDF-4', 'nc4' – Produce a netcdf-4 format file.
        'netCDF-4 classic model', 'nc7' – Produce a netcdf-4 file format, but restricted to netcdf-3 classic CDL input.
Note that the -v flag is a deprecated alias for -k. The code 'nc7' is
used as a short form for the unwieldy 'netCDF-4 classic model' because
7=3+4, a mnemonic for the format that uses the netCDF-3 data model for
compatibility with the netCDF-4 storage format for performance. The
old version format numbers '1', '2', '3', '4', equivalent to the
format names 'nc3', 'nc6', 'nc4', or 'nc7' respectively, are also
still accepted but deprecated, due to easy confusion between format
numbers and format names. Various old format name aliases are also
accepted but deprecated, e.g. 'hdf5', 'enhanced-nc3', for 'netCDF-4'.
-l
The -l flag specifies that ncgen should output (to standard output)
the text of a program that, when compiled and executed, will produce
the corresponding binary .nc file. The arguments to the -l flag can be
as follows.
        c|C => C language output.
        f77|fortran77 => FORTRAN 77 language output; note that currently only the classic model is supported for fortran output.
-x
Use “no fill” mode, omitting the initialization of variable values
with fill values. This can make the creation of large files much
faster, but it will also eliminate the possibility of detecting the
inadvertent reading of values that haven't been written.
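
The format names, short codes, and deprecated numeric codes listed for the -k flag can be summarized as a lookup table. The Python sketch below only illustrates that mapping: canonical_kind is a hypothetical helper and does not reflect how ncgen itself processes the flag. Other deprecated aliases such as 'hdf5' are left out.

```python
# Long format names and their short codes, as listed for the -k flag.
SHORT_CODE = {
    "classic": "nc3",
    "64-bit offset": "nc6",
    "64-bit data": "nc5",
    "netCDF-4": "nc4",
    "netCDF-4 classic model": "nc7",
}

# Deprecated numeric codes and the short codes they stand for.
DEPRECATED_NUMBER = {"1": "nc3", "2": "nc6", "3": "nc4", "4": "nc7"}


def canonical_kind(kind):
    """Return the short code ('nc3', 'nc4', ...) for a -k argument."""
    if kind in SHORT_CODE.values():
        return kind
    if kind in SHORT_CODE:
        return SHORT_CODE[kind]
    if kind in DEPRECATED_NUMBER:
        return DEPRECATED_NUMBER[kind]
    raise ValueError(f"unrecognized kind: {kind!r}")


print(canonical_kind("netCDF-4 classic model"))  # nc7
print(canonical_kind("3"))                       # nc4
```

The second call shows why the numeric codes are easy to confuse with the format names: the old code '3' selects netCDF-4, not the classic netCDF-3 format.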

Examples

Check the syntax of the CDL file foo.cdl:

ncgen foo.cdl

From the CDL file foo.cdl, generate an equivalent binary netCDF file named bar.nc:

ncgen -o bar.nc foo.cdl
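
The command above names its output with -o. When -b is given without -o, the output name is instead derived from the name that follows the netcdf keyword, as described for the -b option. The Python sketch below illustrates that rule with a simplified regular expression. The helper default_output_name is hypothetical, and it does not handle comments or escaped characters the way ncgen's real parser would.

```python
import re


def default_output_name(cdl_text):
    """Return '<name>.nc', where <name> follows the leading 'netcdf' keyword."""
    match = re.search(r"^\s*netcdf\s+(\S+?)\s*\{", cdl_text, re.MULTILINE)
    if match is None:
        raise ValueError("no 'netcdf <name> {' header found")
    return match.group(1) + ".nc"


cdl = """netcdf foo { // example netCDF specification in CDL
dimensions:
lat = 10, lon = 5, time = unlimited;
}
"""
print(default_output_name(cdl))  # foo.nc
```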

From the CDL file foo.cdl, generate a C program containing netCDF function invocations that will create an equivalent binary netCDF dataset:

ncgen -l c foo.cdl > foo.c

ncgen3

The ncgen3 tool is the new name for the older, original ncgen utility.

The ncgen3 tool generates a netCDF file or a C or FORTRAN program that creates a netCDF dataset. If no options are specified in invoking ncgen3, the program merely checks the syntax of the CDL input, producing error messages for any violations of CDL syntax.

The ncgen3 utility can only generate classic-model netCDF-4 files or programs.

UNIX syntax for invoking ncgen3:

ncgen3 [-b] [-o netcdf-file] [-c] [-f] [-v2|-v3|-v5] [-x] [input-file]

where:

-b
Create a (binary) netCDF file. If the '-o' option is absent, a default
file name will be constructed from the netCDF name (specified after
the netcdf keyword in the input) by appending the '.nc'
extension. Warning: if a file already exists with the specified name
it will be overwritten.
-o netcdf-file
Name for the netCDF file created. If this option is specified, it
implies the '-b' option. (This option is necessary because netCDF
files are direct-access files created with seek calls, and hence
cannot be written to standard output.)
-c
Generate C source code that will create a netCDF dataset matching the
netCDF specification. The C source code is written to standard
output. This is only useful for relatively small CDL files, since all
the data is included in variable initializations in the generated
program.
-f
Generate FORTRAN source code that will create a netCDF dataset
matching the netCDF specification. The FORTRAN source code is written
to standard output. This is only useful for relatively small CDL
files, since all the data is included in variable initializations in
the generated program.
-v2
The generated netCDF file or program will use the version of the
format with 64-bit offsets, to allow for the creation of very large
files. These files are not as portable as classic format netCDF files,
because they require version 3.6.0 or later of the netCDF library.
-v3
The generated netCDF file will be in netCDF-4/HDF5 format. These files
are not as portable as classic format netCDF files, because they
require version 4.0 or later of the netCDF library.
-v5
The generated netCDF file or program will use the version of the
format with 64-bit integers, to allow for the creation of very large
variables. These files are not as portable as classic format netCDF files,
because they require version 4.4.0 or later of the netCDF library.
-x
Use “no fill” mode, omitting the initialization of variable values
with fill values. This can make the creation of large files much
faster, but it will also eliminate the possibility of detecting the
inadvertent reading of values that haven't been written.

Generated on Mon Nov 21 2016 11:24:52 for NetCDF. NetCDF is a Unidata library.