2.1 The NetCDF Data Model

A netCDF dataset contains dimensions, variables, and attributes, each of which has both a name and an ID number by which it is identified. These components can be used together to capture the meaning of data and relations among data fields in an array-oriented dataset. The netCDF library allows simultaneous access to multiple netCDF datasets, which are identified by dataset ID numbers in addition to ordinary file names.

2.1.1 Expanded Model in NetCDF-4 Files

Files created with the netCDF-4 format have access to an expanded data model, which includes named groups. Groups, like directories in a Unix file system, are hierarchically organized, to arbitrary depth. They can be used to organize large numbers of variables.

Each group acts as an entire netCDF dataset in the classic model. That is, each group may have attributes, dimensions, and variables, as well as other groups.

The default group is the root group, which allows the classic netCDF data model to fit neatly into the new model.

Dimensions are scoped such that they can be seen in all descendant groups. That is, dimensions can be shared between variables in different groups, if they are defined in a parent group.
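The scoping rule can be pictured as a lookup that starts in the current group and walks up through its ancestors. The following is only a minimal sketch of that idea in plain Python; the `Group` class and its `find_dimension` method are hypothetical names invented for illustration and are not part of the netCDF library.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Group:
    """Hypothetical stand-in for a netCDF-4 group (not the library's API)."""
    name: str
    parent: Optional["Group"] = None
    dimensions: Dict[str, int] = field(default_factory=dict)

    def find_dimension(self, dim_name: str) -> Optional[int]:
        """Look in this group first, then walk up through ancestor groups."""
        group: Optional[Group] = self
        while group is not None:
            if dim_name in group.dimensions:
                return group.dimensions[dim_name]
            group = group.parent
        return None  # the dimension is not visible from this group


# A three-level hierarchy: root group -> "model" -> "run1".
root = Group("/", dimensions={"lat": 5, "lon": 10})
model = Group("model", parent=root, dimensions={"level": 4})
run1 = Group("run1", parent=model)

lat_from_run1 = run1.find_dimension("lat")      # defined in the root group
level_from_run1 = run1.find_dimension("level")  # defined in the parent group
level_from_root = root.find_dimension("level")  # defined only in a child group
```

In this sketch a variable in `run1` can use `lat` and `level` because both are defined in ancestor groups, while the root group cannot see `level`, which is defined only further down the hierarchy.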

2.1.2 Naming Conventions

The names of dimensions, variables and attributes (and, in netCDF-4 files, groups) consist of arbitrary sequences of alphanumeric characters (as well as underscore '_', period '.' and hyphen '-'), beginning with a letter or underscore. (However, names commencing with underscore are reserved for system use.) Case is significant in netCDF names. A zero-length name is not allowed.
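As an illustration, the rule above can be written as a regular expression. This is a sketch of the rule as stated in this section only, assuming that "alphanumeric" means ASCII letters and digits; the function names `is_valid_name` and `is_reserved_name` are hypothetical, and the netCDF library's own checks may differ.

```python
import re

# First character: a letter or underscore.
# Remaining characters: letters, digits, underscore, period, or hyphen.
# Assumption: "alphanumeric" is read as ASCII letters and digits.
_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_valid_name(name: str) -> bool:
    """Return True if `name` matches the naming rule described above."""
    return _NAME_RE.fullmatch(name) is not None


def is_reserved_name(name: str) -> bool:
    """Names starting with an underscore are valid but reserved for system use."""
    return is_valid_name(name) and name.startswith("_")
```

Because case is significant, `temp` and `Temp` would be two distinct names; an ordinary string comparison already reflects that.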

2.1.3 Network Common Data Form Language (CDL)

We will use a small netCDF example to illustrate the concepts of the netCDF data model. The example includes dimensions, variables, and attributes. The notation used to describe this simple netCDF object is called CDL (network Common Data form Language), which provides a convenient way of describing netCDF datasets. The netCDF system includes utilities for producing human-oriented CDL text files from binary netCDF datasets and vice versa.

CDL has not yet been expanded to accommodate netCDF-4. We plan to add features like groups, compound types, and other new types in a future release.

     netcdf example_1 {  // example of CDL notation for a netCDF dataset
     
     dimensions:         // dimension names and lengths are declared first
             lat = 5, lon = 10, level = 4, time = unlimited;
     
     variables:          // variable types, names, shapes, attributes
             float   temp(time,level,lat,lon);
                         temp:long_name     = "temperature";
                         temp:units         = "celsius";
             float   rh(time,lat,lon);
                         rh:long_name = "relative humidity";
                         rh:valid_range = 0.0, 1.0;      // min and max
             int     lat(lat), lon(lon), level(level);
                         lat:units       = "degrees_north";
                         lon:units       = "degrees_east";
                         level:units     = "millibars";
             short   time(time);
                         time:units      = "hours since 1996-1-1";
             // global attributes
                         :source = "Fictional Model Output";
     
     data:                // optional data assignments
             level   = 1000, 850, 700, 500;
             lat     = 20, 30, 40, 50, 60;
             lon     = -160,-140,-118,-96,-84,-52,-45,-35,-25,-15;
             time    = 12;
             rh      =.5,.2,.4,.2,.3,.2,.4,.5,.6,.7,
                      .1,.3,.1,.1,.1,.1,.5,.7,.8,.8,
                      .1,.2,.2,.2,.2,.5,.7,.8,.9,.9,
                      .1,.2,.3,.3,.3,.3,.7,.8,.9,.9,
                       0,.1,.2,.4,.4,.4,.4,.7,.9,.9;
     }
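The shapes declared in example_1 determine how many values each variable holds. The sketch below recomputes those counts in plain Python; it assumes the unlimited dimension currently has length 1, since the data section assigns a single value to time. The dictionary and function names are illustrative only.

```python
from math import prod

# Dimension lengths from example_1. The unlimited dimension is assumed to
# hold one record, because the data section assigns a single time value.
dimensions = {"lat": 5, "lon": 10, "level": 4, "time": 1}

# Variable shapes as declared in the variables section of example_1.
shapes = {
    "temp": ("time", "level", "lat", "lon"),
    "rh": ("time", "lat", "lon"),
    "lat": ("lat",),
    "lon": ("lon",),
    "level": ("level",),
    "time": ("time",),
}


def value_count(var_name: str) -> int:
    """Number of values in a variable: the product of its dimension lengths."""
    return prod(dimensions[dim] for dim in shapes[var_name])


counts = {name: value_count(name) for name in shapes}
```

Under that assumption `rh` holds 1 x 5 x 10 = 50 values, which matches the five rows of ten values assigned to it in the data section.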

The CDL notation for a netCDF dataset can be generated automatically by using ncdump, a utility program described later (see ncdump). Another netCDF utility, ncgen, generates a netCDF dataset (or optionally C or FORTRAN source code containing calls needed to produce a netCDF dataset) from CDL input (see ncgen).

The CDL notation is simple and largely self-explanatory. It will be explained more fully as we describe the components of a netCDF dataset. For now, note that CDL statements are terminated by a semicolon. Spaces, tabs, and newlines can be used freely for readability. Comments in CDL follow the characters '//' on any line. A CDL description of a netCDF dataset takes the form

       netcdf name {
         dimensions: ...
         variables: ...
         data: ...
       }

where the name is used only as a default in constructing file names by the ncgen utility. The CDL description consists of three optional parts, introduced by the keywords dimensions, variables, and data. NetCDF dimension declarations appear after the dimensions keyword, netCDF variables and attributes are defined after the variables keyword, and variable data assignments appear after the data keyword.
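To make the layout concrete, the following sketch assembles CDL text from Python data and emits each of the three sections only when it is supplied. The `render_cdl` function and its parameters are hypothetical helpers written for this illustration; they are not part of any netCDF tool, and the output is not validated against ncgen here.

```python
from typing import Dict, List, Optional


def render_cdl(
    name: str,
    dimensions: Optional[Dict[str, str]] = None,
    variables: Optional[List[str]] = None,
    data: Optional[Dict[str, str]] = None,
) -> str:
    """Assemble CDL text; each of the three sections is emitted only if given."""
    lines = [f"netcdf {name} {{"]
    if dimensions:
        # All dimension declarations share one semicolon-terminated statement.
        decls = ", ".join(f"{dim} = {length}" for dim, length in dimensions.items())
        lines += ["dimensions:", f"    {decls};"]
    if variables:
        # Each entry is one variable or attribute declaration, without its semicolon.
        lines.append("variables:")
        lines += [f"    {decl};" for decl in variables]
    if data:
        lines.append("data:")
        lines += [f"    {var} = {values};" for var, values in data.items()]
    lines.append("}")
    return "\n".join(lines)


cdl_text = render_cdl(
    "tiny",
    dimensions={"lat": "2", "time": "unlimited"},
    variables=["int lat(lat)", 'lat:units = "degrees_north"'],
    data={"lat": "20, 30"},
)
empty_cdl = render_cdl("empty")  # no sections at all
```

Printing `cdl_text` gives a small description in the same form as example_1, with the keyword written in lowercase as it is there.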

The ncgen utility provides a command-line option that indicates the desired output format. Limitations are enforced for the selected format; that is, some CDL files may be expressible only in 64-bit offset or netCDF-4 format.

For example, trying to create a file with very large variables in classic format may result in an error because size limits are violated.
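As a rough sketch, a script that drives ncgen might build its command line as shown below. It assumes that the output file is named with the `-o` option and the format with the `-k` option; the spellings of the format names accepted by `-k` vary between ncgen versions, so they should be taken from the ncgen documentation for the installed release. The `ncgen_command` helper and the file names are hypothetical.

```python
from typing import List, Optional


def ncgen_command(cdl_path: str, out_path: str, kind: Optional[str] = None) -> List[str]:
    """Build an argument list for ncgen without running it.

    `kind` is passed through to the -k option unchanged; valid values
    depend on the installed ncgen version (assumption: "classic" is accepted).
    """
    cmd = ["ncgen"]
    if kind is not None:
        cmd += ["-k", kind]
    cmd += ["-o", out_path, cdl_path]
    return cmd


cmd = ncgen_command("example_1.cdl", "example_1.nc", kind="classic")
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`; if the CDL description exceeds the limits of the requested format, ncgen is expected to report an error rather than write the file.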