NcML: The Key Components

Ben Domenico
Draft last modified: December 3, 2004

NcML is an XML dialect used to describe individual netCDF files or aggregations of such files. At present, it has three systematically defined elements and an additional element that can be used for supplementary metadata.

  Sub elements Contains Examples Additional Information
NcML Core  

XML representation of metadata in an existing netCDF file or class of files:

  • variables
  • dimensions
  • attributes

In essence, this covers everything in the netCDF file except the data values themselves.

 

global attributes:

  • title
  • description

dimensions:

  • time specified as unlimited
  • lat of specified length with attribute units "degrees_north"
  • lon of specified length with attribute units "degrees_east"

variables:

  • T with attribute long name "temperature"
  • RH with attribute long name "relative humidity"

XML example

For existing netCDF files, this information can be extracted automatically from the file itself.
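As an illustration of the example above, the following minimal Python sketch builds an XML description with the listed attributes, dimensions, and variables. The element and attribute names (netcdf, dimension, variable, attribute, isUnlimited, shape) follow the later published NcML schema and are an assumption here, since this draft does not spell out the syntax. The title and description values, the float type, and the lat and lon lengths are placeholders. The units are attached to coordinate variables named lat and lon, which is also an assumption about how the example would be encoded.

```python
import xml.etree.ElementTree as ET


def build_core_ncml(lat_len, lon_len):
    """Return an NcML-style description of the example dataset as an XML string."""
    root = ET.Element("netcdf")

    # Global attributes (placeholder values, not taken from a real file).
    ET.SubElement(root, "attribute", name="title", value="Example title")
    ET.SubElement(root, "attribute", name="description", value="Example description")

    # Dimensions: time is unlimited; lat and lon have fixed lengths.
    ET.SubElement(root, "dimension", name="time", length="0", isUnlimited="true")
    ET.SubElement(root, "dimension", name="lat", length=str(lat_len))
    ET.SubElement(root, "dimension", name="lon", length=str(lon_len))

    # Coordinate variables carry the units attributes (assumed encoding).
    for name, units in (("lat", "degrees_north"), ("lon", "degrees_east")):
        var = ET.SubElement(root, "variable", name=name, type="float", shape=name)
        ET.SubElement(var, "attribute", name="units", value=units)

    # Data variables with their long names.
    for name, long_name in (("T", "temperature"), ("RH", "relative humidity")):
        var = ET.SubElement(root, "variable", name=name, type="float",
                            shape="time lat lon")
        ET.SubElement(var, "attribute", name="long_name", value=long_name)

    return ET.tostring(root, encoding="unicode")


ncml_text = build_core_ncml(lat_len=3, lon_len=4)
print(ncml_text)
```

Note that no data values appear anywhere in the output; only the structure of the file is described.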
 
NcML Coordinate System   Detailed information about the general or georeferenced coordinate system being used.

Many netCDF files, especially those that conform to well-defined conventions, carry information about the coordinate system of the dataset. NcML-CS is a means of specifying that information in XML.

XML example for netCDF conforming to CF-1 conventions

 
 
NcML Dataset Subsetting Pointers to subsets of data within a netCDF file or dataset
  • One of many variables in a netCDF file (e.g., pressure)
  • Subset of index ranges for several variables (might correspond to geographic region)
  • One forecast time in a model output netCDF
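The idea of a subset pointer can be sketched in Python as a variable name plus one index range per dimension. This is only a data-structure illustration, not NcML syntax. The pointer layout, the half-open (start, stop) ranges, and the sample pressure values are all assumptions made for the example.

```python
def subset(values, ranges):
    """Apply one (start, stop) index range per dimension to nested lists."""
    if not ranges:
        return values
    (start, stop), rest = ranges[0], ranges[1:]
    return [subset(item, rest) for item in values[start:stop]]


# Hypothetical in-memory stand-in for a netCDF file: pressure[time][lat][lon].
dataset = {
    "pressure": [
        [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
        [[10, 11, 12], [13, 14, 15], [16, 17, 18]],
    ]
}

# Pointer: one variable, the first forecast time, and a small index window
# that might correspond to a geographic region.
pointer = {"variable": "pressure", "ranges": [(0, 1), (0, 2), (1, 3)]}

selected = subset(dataset[pointer["variable"]], pointer["ranges"])
print(selected)
```

The pointer itself holds no data, so it stays small no matter how large the underlying file is.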
 
Aggregation Pointers to a set of netCDF files that comprise a netCDF dataset which can be thought of as a virtual netCDF file.
  • Collection of netCDF files associated with a specific event
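A hedged sketch of how such an aggregation could be written out as XML from Python. The aggregation element, its joinExisting type, and the dimName and location attributes follow the later published NcML schema and are assumptions relative to this draft. The file names are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_aggregation_ncml(locations, dim_name="time"):
    """Return XML that points at several files to be viewed as one dataset."""
    root = ET.Element("netcdf")
    # Join the member files along a dimension they all share (assumed: time).
    agg = ET.SubElement(root, "aggregation", type="joinExisting", dimName=dim_name)
    for location in locations:
        # Each child is only a pointer to a file; no data is copied.
        ET.SubElement(agg, "netcdf", location=location)
    return ET.tostring(root, encoding="unicode")


# Hypothetical files associated with one event.
event_files = ["event_t0.nc", "event_t1.nc", "event_t2.nc"]
aggregation_text = build_aggregation_ncml(event_files)
print(aggregation_text)
```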
 
Combination Pointers to subsets of a collection of existing netCDF files that comprise a netCDF dataset.
  • Time series consisting of the values of one variable from a large number of files representing different times.
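The time-series case can be sketched in Python as follows. Each file is stood in for by a plain dictionary with a time stamp and a few variable values. That layout, the variable names, and the numbers are assumptions for illustration only.

```python
def time_series(files, variable):
    """Collect one variable from each file, ordered by the file's time stamp."""
    ordered = sorted(files, key=lambda f: f["time"])
    return [(f["time"], f["variables"][variable]) for f in ordered]


# Hypothetical stand-ins for netCDF files, each representing a different time.
files = [
    {"time": 6, "variables": {"T": 281.5, "RH": 60.0}},
    {"time": 0, "variables": {"T": 280.0, "RH": 65.0}},
    {"time": 12, "variables": {"T": 284.0, "RH": 50.0}},
]

series = time_series(files, "T")
print(series)
```

This combines both ideas: the aggregation spans many files, and the subset picks a single variable out of each.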
 
 
Enhanced Metadata   Can be any additional information someone thinks is valuable in describing the dataset

A simple example might be a brief description of the class of dataset (for example, WRF model output), a pointer to a document describing the class of data, or both.

In a more sophisticated setting, a data mining engine could scan a collection of netCDF files to determine the max and min values for all the variables and dimensions, for example. This information could be included in the NcML as enhanced metadata for each file or for a collection of files in a dataset.

A more sophisticated engine might use the coordinate system information to determine the dimension subsets that correspond to a geographical region and then determine the max and min variable values for that region.
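The regional min/max computation described above can be sketched in Python. The coordinate values, the temperature grid, and the function names are hypothetical, and the grid is a plain nested list rather than data read from a netCDF file.

```python
def region_indices(coords, low, high):
    """Return the indices whose coordinate value lies within [low, high]."""
    return [i for i, c in enumerate(coords) if low <= c <= high]


def region_min_max(grid, lats, lons, lat_range, lon_range):
    """Return (min, max) of grid[lat][lon] over a region, or None if it is empty."""
    rows = region_indices(lats, *lat_range)
    cols = region_indices(lons, *lon_range)
    values = [grid[r][c] for r in rows for c in cols]
    if not values:
        return None
    return min(values), max(values)


# Hypothetical coordinate variables and one time step of temperature.
lats = [30.0, 35.0, 40.0, 45.0]
lons = [-110.0, -105.0, -100.0]
T = [
    [290.0, 291.0, 292.0],
    [285.0, 286.0, 287.0],
    [280.0, 281.0, 282.0],
    [275.0, 276.0, 277.0],
]

result = region_min_max(T, lats, lons, (35.0, 40.0), (-105.0, -100.0))
print(result)
```

Passing ranges that cover every coordinate gives the whole-file min and max, which is the simpler case of scanning each variable in full.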