A netCDF standard for climate data

Jonathan Gregory (jmgregory@meto.gov.uk)
Mon, 15 Jun 1998 16:38:25 +0100 (BST)

Jonathan Gregory (Hadley Centre), Bob Drach (PCMDI) and Simon Tett (Hadley
Centre) have developed a set of netCDF conventions appropriate for climate
data, especially data from GCMs.  This standard is a revised version of our
proposal of last June, and has now been registered with Unidata under the name
"GDT". We've made a few changes, outlined below, following suggestions on this
newsgroup and from other individuals; we are grateful to those who have taken
the time to think about our proposals, and welcome further comments now.

The basic principles and structure of the conventions are unaltered from our
original proposal.  Although this is specifically a netCDF standard, we feel
that most of the ideas are of wider application. Our main purpose is to propose
a clear, adequate and flexible definition of the metadata needed for climate
data. The metadata objects could be contained in file formats other than
netCDF. Interconversion of the metadata between files of different formats will
be facilitated if they are based on similar ideas.

As before (in fact, more so than before), netCDF data which adheres to our
standard would be compatible with COARDS in almost all respects.  It is likely
that the PCMDI LATS software (used by AMIP and CMIP) will produce netCDF output
compliant with this standard in the not-too-distant future.

You can access our conventions document via the Unidata netCDF conventions
page http://www.unidata.ucar.edu/software/netcdf/conventions.html or directly
from PCMDI as http://www-pcmdi.llnl.gov/drach/GDT_convention.html or the UK
Met Office as http://www.met-office.gov.uk/sec5/CR_div/GDT_convention.html.

Jonathan Gregory


The main differences between the current conventions document (GDT 1.1) and our
proposal of last summer are these:

* Incorporation of a large number of CDL examples, as many people suggested.
These make the document much bulkier, but we hope they will clarify it.

* A new calendar-independent representation of time. This replaces our earlier
proposal of fixed-length months with one which should be easier to handle, as
it corresponds more directly to the normal "components" of time, viz. year,
month, day etc. For instance, the straightforward encoding of a date as a
number YYYYMMDD is one of the formats supported by the convention.

* A revised representation of climatological time, making use of the
above. This retains the same idea as before, that climatological time involves
"collapsing" some parts of the time variation (year, seasonal cycle or diurnal
cycle); thus, as before, there are multiple time axes. In the old proposal,
these had to be "added together" in some way, but now they are quite
independent and the interpretation should be fairly obvious.

* We keep our general notion of contracting or collapsing axes (e.g. the
longitude axis in a zonal mean). The method of contraction is now recorded on
the data variable instead of the coordinate variable. This is because in some
cases the order of contraction is important. New optional attributes are
proposed to record more information about what the coordinates were before the
contraction.

* Following the lengthy discussions on the netCDF newsgroup, we have
generalised associated and boundary coordinates so that they can be
multidimensional. This allows for alternative sets of coordinates to be
recorded for a data variable.

* In the original proposal, we suggested a quantity attribute to contain a
description of a variable (longitude, latitude, temperature, pressure etc.) to
be chosen from a standard list. We think that standardising the descriptions is
important for making data portable. However, the quantity attribute would often
contain similar information to the long_name, so in the new version we propose
instead that the long_name attribute itself should be chosen from a standard
list. To make this practical, we hope to set in place a partially automated way
of requesting that new long_names be defined. There is also provision for using
a non-standard long_name.

* We propose a new attribute of the data variable that identifies its
longitude, latitude, vertical and time dimensions. If present, this attribute
will be an easy and reliable way of finding these dimensions, which often have
special importance to applications.
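
As an illustration of the YYYYMMDD date encoding mentioned in the item on
calendar-independent time above, here is a minimal Python sketch. It is not
taken from the conventions document: the function names are hypothetical, and
it assumes non-negative years and the plain integer form YYYYMMDD only. It
works on year, month and day components rather than on a calendar date type,
so a date such as 30 February, which is valid in a 360-day model calendar, can
be encoded and decoded without reference to any particular calendar.

```python
from typing import Tuple


def encode_yyyymmdd(year: int, month: int, day: int) -> int:
    """Pack year, month and day components into a single number YYYYMMDD.

    Hypothetical helper: no calendar is consulted, so the components are
    not checked against real month lengths. Assumes year >= 0.
    """
    if year < 0:
        raise ValueError("this sketch assumes non-negative years")
    if not (0 <= month <= 99 and 0 <= day <= 99):
        raise ValueError("month and day must each fit in two digits")
    return year * 10000 + month * 100 + day


def decode_yyyymmdd(value: int) -> Tuple[int, int, int]:
    """Split a number YYYYMMDD back into (year, month, day) components."""
    if value < 0:
        raise ValueError("this sketch assumes non-negative years")
    year, rest = divmod(value, 10000)   # leading digits are the year
    month, day = divmod(rest, 100)      # next two digits month, last two day
    return year, month, day


if __name__ == "__main__":
    stamp = encode_yyyymmdd(1998, 6, 15)
    print(stamp)
    print(decode_yyyymmdd(stamp))
    # 30 February does not exist in the Gregorian calendar, but it is a
    # perfectly good date in a 360-day calendar with 30-day months.
    print(decode_yyyymmdd(19980230))
```

Because the components are never converted to an elapsed-time count, the
same value can be read by applications that assume different calendars; any
validation of month lengths is left to the application.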