
Annotated Schema for NcML-2.2


An NcML document is an XML document (aka an instance document) whose contents are described and constrained by NcML Schema-2.2. NcML Schema-2.2 combines the earlier NcML core schema, which is an XML description of the netCDF-Java 2.2 / NetCDF-4 data model, with the earlier NcML dataset schema, which allows you to define, redefine, aggregate, and subset existing netCDF files.

An NcML document represents a generic netCDF dataset, i.e. a container for data conforming to the netCDF data model. For instance, it might represent an existing netCDF file, a netCDF file not yet written, a GRIB file read through the netCDF-Java library, a subset of a netCDF file, an aggregation of netCDF files, or a self-contained dataset (i.e. all the data is included in the NcML document and there is no separate netCDF file holding the data). An NcML document therefore should not necessarily be thought of as a physical netCDF file, but rather the "public interface" to a set of data conforming to the netCDF data model.

NcML Schema-2.2 is written in the W3C XML Schema language, and essentially represents the netCDF-Java 2.2 / NetCDF-4 data model, which schematically looks like this in UML:

 

[Figure: netCDF-Java 2.2 / NetCDF-4 UML diagram]

 

Annotated Schema

Aggregation specific elements are listed in red.

schema Element

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema targetNamespace="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
elementFormDefault="qualified">

netcdf Element

The netcdf element represents a generic netCDF dataset, i.e. a container for data conforming to the netCDF data model. For instance, a netcdf element might represent an existing netCDF file, a netCDF file not yet written, a GRIB file read through the netCDF-Java library, a subset of a netCDF file, an aggregation of netCDF files, or a self-contained dataset (i.e. all the data is included in the NcML document and there is no separate netCDF file holding the data). An NcML document therefore should not necessarily be thought of as a physical netCDF file, but rather the "public interface" or API to a set of data, which may or may not be implemented with a physical netCDF file.

The element netcdf is the root tag of the NcML instance document, and is said to define a NetCDF dataset.

<!-- XML encoding of Netcdf container object -->
<xsd:element name="netcdf">
<xsd:complexType>
<xsd:sequence>
(1) <xsd:choice minOccurs="0">
<xsd:element name="readMetadata"/>
<xsd:element name="explicit"/>
</xsd:choice>
(2) <xsd:choice minOccurs="0" maxOccurs="unbounded">
<xsd:element ref="group"/>
<xsd:element ref="dimension"/>
<xsd:element ref="variable"/>
<xsd:element ref="attribute"/>
<xsd:element ref="remove"/>
</xsd:choice>
(3) <xsd:element ref="aggregation" minOccurs="0"/>
</xsd:sequence>
 (4)<xsd:attribute name="location" type="xsd:anyURI"/>
 (5)<xsd:attribute name="id" type="xsd:string"/>
 (6)<xsd:attribute name="title" type="xsd:string"/>
 (7)<xsd:attribute name="enhance" type="xsd:boolean"/>
 (8)<xsd:attribute name="addRecords" type="xsd:boolean"/>
    <!-- for aggregations -->
 (9)<xsd:attribute name="ncoords" type="xsd:string"/>
(10)<xsd:attribute name="coordValue" type="xsd:string"/>
(11)<xsd:attribute name="fmrcDefinition" type="xsd:string"/>
  </xsd:complexType>
</xsd:element>
  1. A readMetadata (default) or an explicit element comes first. The readMetadata element indicates that all the metadata from the referenced dataset will be read in. The explicit element indicates that only the metadata explicitly declared in the NcML file will be used.
  2. The netcdf element may contain any number (including 0) of elements group, variable, dimension, or attribute that can appear in any order. If you use readMetadata, you can remove specific elements with the remove element.
  3. An aggregation element is used to logically join multiple netcdf datasets into a single dataset.
  4. The optional location attribute provides a reference to another netCDF dataset, called the referenced dataset. The location can be an absolute URL (e.g. http://server/myfile or file:/usr/local/data/mine.nc) or a URL relative to the NcML location (e.g. subdir/mydata.nc). The referenced dataset contains the variable data that is not explicitly specified in the NcML document itself. If the location is missing and the data is not defined in values elements, then an empty file is written, similar to the way CDL files are written by ncgen.
  5. The optional id attribute is meant to provide a way to uniquely identify (relative to the application context) the NetCDF dataset. It is important to understand that the id attribute refers to the NetCDF dataset defined by the XML instance document, NOT the referenced dataset if there is one.
  6. The optional title attribute provides a way to add a human readable title to the netCDF dataset.
  7. The optional enhance attribute indicates whether the referenced dataset is opened in enhanced mode or not (default not).
  8. The optional addRecords attribute is used only when the referenced dataset is a netCDF-3 file. If true (default false) then a structure variable containing the record variables is added.
  9. The optional ncoords attribute is used for joinExisting aggregation datasets to indicate the number of coordinates that come from the dataset.
  10. The coordValue attribute is used for aggregation datasets to assign one or more coordinate values to the dataset. If there are multiple coordinates for a single file, blanks and/or commas are used to delimit them, so you cannot use those characters in your coordinate values.
  11. The fmrcDefinition attribute is used for aggregation datasets to pass in an fmrcDefinition file to the Grib IOSP layer.
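
As an illustration of the points above, here is a minimal sketch of an NcML instance document that wraps an existing file. The location is the sample URL used in item 4; the id, title, and the "history" attribute being removed are hypothetical and assume the referenced file actually contains such an attribute.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: id, title and attribute names are illustrative assumptions -->
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
        location="file:/usr/local/data/mine.nc"
        id="mine-modified"
        title="Example wrapper around an existing file">
  <!-- readMetadata is the default, so this element could be omitted -->
  <readMetadata/>
  <!-- add a global attribute to the dataset -->
  <attribute name="Conventions" value="COARDS"/>
  <!-- drop a global attribute that was read from the referenced dataset -->
  <remove name="history" type="attribute"/>
</netcdf>
```

Replacing readMetadata with explicit would instead expose only the metadata declared in this document.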

group Element

A group element represents a netCDF group, a container for variable, dimension, attribute, or other group elements.

  <xsd:element name="group">
<xsd:complexType>
(1) <xsd:choice minOccurs="0" maxOccurs="unbounded">
<xsd:element ref="dimension"/>
<xsd:element ref="variable"/>
<xsd:element ref="attribute"/>
<xsd:element ref="group"/>
<xsd:element ref="remove"/>
</xsd:choice>
(2) <xsd:attribute name="name" type="xsd:string" use="required"/>
(3) <xsd:attribute name="orgName" type="xsd:string"/>
</xsd:complexType>
</xsd:element>
  1. The group element may contain any number (including 0) of elements group, variable, dimension, or attribute that can appear in any order. You can also mix in remove elements to remove elements.
  2. The mandatory name attribute must be unique among groups within its containing group or netcdf element.
  3. The optional attribute orgName is used when renaming a group.
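
A small sketch of a group element follows. It assumes the referenced dataset contains a group named "obs" holding a variable named "scratch"; both names, and the new name "observations", are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: group, variable and attribute names are assumptions -->
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
        location="file:/usr/local/data/mine.nc">
  <!-- rename the existing group "obs" to "observations" -->
  <group name="observations" orgName="obs">
    <!-- add an attribute to the group -->
    <attribute name="title" value="Surface observations"/>
    <!-- remove a variable that lives inside this group -->
    <remove name="scratch" type="variable"/>
  </group>
</netcdf>
```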

dimension Element

The dimension element represents a netCDF dimension, i.e. a named index of specified length.

  <!-- XML encoding of Dimension object -->
  <xsd:element name="dimension">
<xsd:complexType>
(1) <xsd:attribute name="name" type="xsd:token" use="required"/>
(2) <xsd:attribute name="length" type="xsd:nonNegativeInteger"/>
(3) <xsd:attribute name="isUnlimited" type="xsd:boolean" default="false"/>
(4) <xsd:attribute name="isVariableLength" type="xsd:boolean" default="false"/>
(5) <xsd:attribute name="isShared" type="xsd:boolean" default="true"/>
(6) <xsd:attribute name="orgName" type="xsd:string"/>
</xsd:complexType>
</xsd:element>
  1. The mandatory name attribute must be unique among dimensions within its containing group or netcdf element.
  2. The mandatory attribute length expresses the cardinality (number of points) associated with the dimension. Its value can be any non-negative integer including 0 (since the unlimited dimension in a netCDF file may have length 0, corresponding to 0 records).
  3. The attribute isUnlimited is true only if the dimension can grow (a.k.a. the record dimension in NetCDF-3 files), and false when the length is fixed at file creation.
  4. The attribute isVariableLength is used for variable length data types, where the length is not part of the metadata.
  5. The attribute isShared is true for shared dimensions, and false when the dimension is private to the variable.
  6. The optional attribute orgName is used when renaming a dimension.
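
For example, the following sketch declares three dimensions in a dataset with no referenced file. The dimension names and lengths are illustrative assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: dimension names and lengths are assumptions -->
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <!-- an unlimited (record) dimension that currently holds 0 records -->
  <dimension name="time" length="0" isUnlimited="true"/>
  <!-- fixed-length, shared dimensions (isShared defaults to true) -->
  <dimension name="lat" length="3"/>
  <dimension name="lon" length="4"/>
</netcdf>
```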

variable Element

A variable element represents a netCDF variable, i.e. a scalar or multidimensional array of specified type indexed by 0 or more dimensions.

  <xsd:element name="variable">
<xsd:complexType>
<xsd:sequence>
(1) <xsd:element ref="attribute" minOccurs="0" maxOccurs="unbounded"/>
(2) <xsd:element ref="values" minOccurs="0"/>
(3) <xsd:element ref="variable" minOccurs="0" maxOccurs="unbounded"/>
(4) <xsd:element ref="logicalView" minOccurs="0" />
(5) <xsd:element ref="remove" minOccurs="0" maxOccurs="unbounded" />
</xsd:sequence>
(6) <xsd:attribute name="name" type="xsd:token" use="required" />
(7) <xsd:attribute name="type" type="DataType" use="required" />
(8) <xsd:attribute name="shape" type="xsd:token" />
(9) <xsd:attribute name="orgName" type="xsd:string"/>
</xsd:complexType>
</xsd:element>
  1. A variable element may contain 0 or more attribute elements.
  2. The optional values element is used to specify the data values of the variable. The values must be listed compatibly with the size and shape of the variable (slowest varying dimension first). If not specified, the data values are taken from the variable of the same name in the referenced dataset.
  3. A variable of data type structure may have nested variable elements within it.
  4. The logicalView element is NOT IMPLEMENTED YET.
  5. You can remove attributes from the underlying variable.
  6. The mandatory name attribute must be unique among variables within its containing group, variable, or netcdf element.
  7. The mandatory type attribute is one of the enumerated DataTypes.
  8. The shape attribute lists the names of the dimensions the variable depends on. For a scalar variable, the list will be empty. The dimension names must be ordered with the slowest varying dimension first (same as in the CDL description). For backwards compatibility, scalar variables may omit this attribute, although this is deprecated.
  9. The optional attribute orgName is used when renaming a variable.
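
The sketch below shows how these pieces might fit together in a self-contained dataset. The variable names, units, and data values are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: names, units and data values are assumptions -->
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <dimension name="lat" length="3"/>
  <dimension name="lon" length="4"/>

  <!-- a one-dimensional coordinate variable with three values -->
  <variable name="lat" type="float" shape="lat">
    <attribute name="units" value="degrees_north"/>
    <values>41.0 40.0 39.0</values>
  </variable>

  <!-- shape lists the slowest varying dimension first, so the
       3 x 4 = 12 values are given one row of lon values per lat -->
  <variable name="temperature" type="double" shape="lat lon">
    <attribute name="units" value="K"/>
    <values>
      280.1 280.4 280.9 281.3
      281.0 281.2 281.7 282.0
      282.4 282.6 283.0 283.5
    </values>
  </variable>
</netcdf>
```

Within a variable element the children follow the schema's sequence: attribute elements first, then values, then any nested variable elements, then remove elements.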

values Element

A values element specifies the data values of a variable, either by listing them for example: <values>-109.0 -107.0 -115.0 93.923230</values> or by specifying a start and increment, for example: <values start="-109.5" increment="2.0" />. For a multi-dimensional variable, the values must be listed compatibly with the size and shape of the variable (slowest varying dimension first).

  <xsd:element name="values">
<xsd:complexType mixed="true">
(1) <xsd:attribute name="start" type="xsd:float"/>
<xsd:attribute name="increment" type="xsd:float"/>
<xsd:attribute name="npts" type="xsd:int"/>
(2) <xsd:attribute name="separator" type="xsd:string" />
</xsd:complexType>
</xsd:element>
  1. The values can be specified with start and increment attributes, if they are numeric and evenly spaced. You can enter these as integers or floating point numbers, and they will be converted to the data type of the variable. The number of points will be taken from the shape of the variable. (For backwards compatibility, an npts attribute is allowed, although this is deprecated and ignored).
  2. By default, the list of values is separated by whitespace but a different token can be specified using the separator attribute. This is useful if you are entering String values, e.g. <values separator="*">My dog*has*fleas</values> defines three Strings.
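
The two forms can be sketched side by side as follows. The dimension and variable names are hypothetical; the start, increment, and separator values are the ones used in the text above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: dimension and variable names are assumptions -->
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <dimension name="lon" length="4"/>
  <dimension name="phrase" length="3"/>

  <!-- evenly spaced values; the count (4) comes from the shape,
       giving -109.5, -107.5, -105.5, -103.5 -->
  <variable name="lon" type="float" shape="lon">
    <values start="-109.5" increment="2.0"/>
  </variable>

  <!-- three Strings split on "*" rather than on whitespace -->
  <variable name="words" type="String" shape="phrase">
    <values separator="*">My dog*has*fleas</values>
  </variable>
</netcdf>
```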

attribute Element

The attribute element represents a netCDF attribute, i.e. a name-value pair of specified type. Its value may be specified in the value attribute, e.g. <attribute name="Conventions" value="COARDS"/>, or in the element content, e.g. <attribute name="valid_range" type="float">-99.0 110</attribute>. It is recommended to use the second form when there are multiple values, but this is only a matter of style.

  <xsd:element name="attribute">
<xsd:complexType mixed="true">
(1) <xsd:attribute name="name" type="xsd:token" use="required"/>
(2) <xsd:attribute name="type" type="DataType" default="String"/>
(3) <xsd:attribute name="value" type="xsd:string" />
(4) <xsd:attribute name="separator" type="xsd:string" />
(5) <xsd:attribute name="orgName" type="xsd:string"/>
</xsd:complexType>
</xsd:element>
  1. The mandatory name attribute must be unique among attributes within its containing group, variable, or netcdf element.
  2. The type attribute is one of the enumerated DataTypes. It defaults to a String type.
  3. The value attribute contains the actual data of the attribute element. In the most common case of single-valued attributes, a single number or string will be listed (as in value="3.0"), while in the less frequent case of multi-valued attributes, all the numbers will be listed and separated by a blank or optionally some other character (as in value="3.0 4.0 5.0"). Values can also be specified in the element content.
  4. By default, if the attribute has type String, the entire value is taken as the attribute value, and if it has type other than String, then the list of values is separated by whitespace. A different token separator can be specified using the separator attribute.
  5. The optional attribute orgName is used when renaming an attribute.
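
A few attribute forms are sketched below. The variable, its dimensions, and the attribute names other than Conventions and valid_range are hypothetical, and the rename assumes the referenced file has an attribute called "longname" on this variable.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: variable, dimension and most attribute names are assumptions -->
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
        location="file:/usr/local/data/mine.nc">
  <!-- type defaults to String, so the whole value is one string -->
  <attribute name="Conventions" value="COARDS"/>

  <variable name="temperature" type="float" shape="lat lon">
    <!-- multiple numeric values given as element content -->
    <attribute name="valid_range" type="float">-99.0 110</attribute>
    <!-- multiple numeric values split on a custom separator -->
    <attribute name="flag_values" type="int" value="1,2,3" separator=","/>
    <!-- rename an existing attribute, keeping its value -->
    <attribute name="long_name" orgName="longname"/>
  </variable>
</netcdf>
```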

DataType Element

The DataType element is an enumerated list of the data types allowed for NcML Variable and Attribute objects.

  <xsd:simpleType name="DataType">
<xsd:restriction base="xsd:token">
<xsd:enumeration value="char"/>
<xsd:enumeration value="byte"/>
<xsd:enumeration value="short"/>
<xsd:enumeration value="int"/>
<xsd:enumeration value="long"/>
<xsd:enumeration value="float"/>
<xsd:enumeration value="double"/>
<xsd:enumeration value="String"/>
<xsd:enumeration value="string"/>
<xsd:enumeration value="Structure"/>
</xsd:restriction>
</xsd:simpleType>
  1. Enumeration, Opaque not yet implemented.
  2. Unsigned types not yet implemented.

remove Element

The remove element is used to remove attribute, dimension, variable or group objects that are in the referenced dataset. Place the remove element in the container of the object to be removed.

  <xsd:element name="remove">
<xsd:complexType>
(1) <xsd:attribute name="name" type="xsd:string" use="required"/>
(2) <xsd:attribute name="type" type="ObjectType" use="required"/>
</xsd:complexType>
</xsd:element>
 <xsd:simpleType name="ObjectType">
   <xsd:restriction base="xsd:string">
     <xsd:enumeration value="attribute"/>
     <xsd:enumeration value="dimension"/>
     <xsd:enumeration value="variable"/>
     <xsd:enumeration value="group"/>
   </xsd:restriction>
 </xsd:simpleType>
  1. The name of the object to remove.
  2. The type of the object to remove: attribute, dimension, variable or group.
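
The placement rule can be sketched as follows: each remove element sits in the container of the object it removes. All object names here are hypothetical and assume the referenced file contains them.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the removed objects are assumed to exist in the referenced file -->
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
        location="file:/usr/local/data/mine.nc">
  <!-- global attribute and top-level variable: remove at the netcdf level -->
  <remove name="history" type="attribute"/>
  <remove name="debug_flags" type="variable"/>

  <!-- attribute of a variable: remove inside that variable element -->
  <variable name="temperature" type="float" shape="lat lon">
    <remove name="missing_value" type="attribute"/>
  </variable>
</netcdf>
```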

aggregation Element

The aggregation element allows multiple datasets to be combined into a single logical dataset. There can only be one aggregation element in a netcdf element.

<xsd:element name="aggregation">
  <xsd:complexType>
    <xsd:sequence>
(1)   <xsd:element name="variableAgg" minOccurs="0" maxOccurs="unbounded">
        <xsd:complexType>
          <xsd:attribute name="name" type="xsd:string" use="required"/>
        </xsd:complexType>
      </xsd:element>
(2)   <xsd:element ref="netcdf" minOccurs="0" maxOccurs="unbounded"/>
(3)   <xsd:element name="scan" minOccurs="0" maxOccurs="unbounded">
        <xsd:complexType>
(4)       <xsd:attribute name="location" type="xsd:string" use="required"/>
(5)       <xsd:attribute name="suffix" type="xsd:string" />
(6)       <xsd:attribute name="regExp" type="xsd:string" />
(7)       <xsd:attribute name="subdirs" type="xsd:boolean" default="true"/>
(8)       <xsd:attribute name="olderThan" type="xsd:string" />
(9)       <xsd:attribute name="dateFormatMark" type="xsd:string" />
(10)      <xsd:attribute name="enhance" type="xsd:boolean"/>
        </xsd:complexType>
      </xsd:element>
(11)  <xsd:element name="scanFmrc" minOccurs="0" maxOccurs="unbounded">
        <xsd:complexType>
(4)       <xsd:attribute name="location" type="xsd:string" use="required"/>
(5)       <xsd:attribute name="suffix" type="xsd:string" />
(6)       <xsd:attribute name="regExp" type="xsd:string" />
(7)       <xsd:attribute name="subdirs" type="xsd:boolean" default="true"/>
(8)       <xsd:attribute name="olderThan" type="xsd:string" />
(12)      <xsd:attribute name="runDateMatcher" type="xsd:string" />
          <xsd:attribute name="forecastDateMatcher" type="xsd:string" />
          <xsd:attribute name="forecastOffsetMatcher" type="xsd:string" />
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
(13) <xsd:attribute name="type" type="AggregationType" use="required"/>
(14) <xsd:attribute name="dimName" type="xsd:token" />
(15) <xsd:attribute name="recheckEvery" type="xsd:string" />
  </xsd:complexType>
</xsd:element>
  1. For joinNew aggregation types, the variables that will be aggregated must be explicitly listed in a variableAgg element.
  2. The nested netcdf datasets can be explicitly listed.
  3. Or they can be implicitly specified by naming a directory in a scan element.
  4. The directory pathname.
  5. If you specify a suffix, only files with that ending will be included.
  6. If you specify a regExp, only files whose full pathnames match the regular expression will be included.
  7. You can optionally specify if the scan should descend into subdirectories (default true).
  8. If present, only files whose last modified date is older than this amount of time from the present will be included. This is a way to exclude files that are still being written. This must be a udunit time such as "5 min" or "1 hour".
  9. A dateFormatMark is used on joinNew types to create date coordinate values out of the filename. It consists of a section of text, a '#' marking character, then a java.text.SimpleDateFormat string. The number of characters before the # is skipped in the filename, then the next part of the filename must match the SimpleDateFormat string. You can ignore trailing text. For example:
            Filename: SUPER-NATIONAL_1km_SFC-T_20051206_2300.gini 
    DateFormatMark: SUPER-NATIONAL_1km_SFC-T_#yyyyMMdd_HHmm
    
  10. You can optionally specify that the files should be opened in enhanced mode (default false).
  11. A specialized scanFmrc element can be used for a forecastModelRunSingleCollection aggregation, where forecast model run data is stored in multiple files, with one forecast time per file.
  12. For scanFmrc, the run date and the forecast date are extracted from the file pathname using a runDateMatcher and either a forecastDateMatcher or a forecastOffsetMatcher attribute. All of these require matching a specific string in the file's pathname and then matching a date or hour offset immediately before or after the match. The match is specified by placing it between '#' marking characters. The runDateMatcher and forecastDateMatcher have a java.text.SimpleDateFormat string before or after the match, while a forecastOffsetMatcher counts the number of 'H' characters and extracts an hour offset from the run date. For example:
         
                 Filename:  gfs_3_20060706_0300_006.grb 
           runDateMatcher: #gfs_3_#yyyyMMdd_HH
    forecastOffsetMatcher:                     HHH#.grb#

    will extract the run date 2006-07-06T03:00:00Z, and the forecast offset "6 hours".

  13. You must specify an aggregation type.
  14. For all types except union, you must specify the dimension name to join.
  15. When you are using scan elements on a set of files that may change, and you are using caching, set recheckEvery to a valid udunit time value, like "10 min", "1 hour", "30 days", etc. Whenever the dataset is reacquired from the cache, the directories will be rescanned if recheckEvery amount of time has elapsed since the last time it was scanned. If you do not specify a recheckEvery attribute, the collection will be assumed to be non-changing.
 <!-- type of aggregation -->
 <xsd:simpleType name="AggregationType">
  <xsd:restriction base="xsd:string">
   <xsd:enumeration value="forecastModelRunCollection"/>
   <xsd:enumeration value="forecastModelRunSingleCollection"/>
   <xsd:enumeration value="joinExisting"/>
   <xsd:enumeration value="joinNew"/>
   <xsd:enumeration value="union"/>
  </xsd:restriction>
 </xsd:simpleType>

The allowable aggregation types.
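
To tie the aggregation attributes together, here is a sketch of a joinNew aggregation driven by a scan element. The dateFormatMark, olderThan, and recheckEvery values are taken from the examples in the notes above; the directory path, the suffix filter, the variable name "T", and the new dimension name "time" are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the directory, variable name and dimension name are assumptions -->
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <!-- joinNew creates a new outer dimension, named by dimName;
       rescan the directory if 10 minutes have passed since the last scan -->
  <aggregation type="joinNew" dimName="time" recheckEvery="10 min">
    <!-- joinNew requires the aggregated variables to be listed explicitly -->
    <variableAgg name="T"/>
    <!-- include only .gini files in this directory (no subdirectories)
         that were last modified more than 5 minutes ago, and build the
         date coordinate from the filename text that follows the '#' position -->
    <scan location="/data/gini/"
          suffix=".gini"
          subdirs="false"
          olderThan="5 min"
          dateFormatMark="SUPER-NATIONAL_1km_SFC-T_#yyyyMMdd_HHmm"/>
  </aggregation>
</netcdf>
```

The child elements appear in the order the schema's sequence requires: variableAgg first, then any explicit netcdf elements, then scan, then scanFmrc.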

Notes

 

The java.text.SimpleDateFormat

The following is taken from the java.text.SimpleDateFormat javadoc, which has the full details. The following pattern letters are defined (all other characters from 'A' to 'Z' and from 'a' to 'z' are reserved):

Letter  Date or Time Component  Presentation        Examples
G       Era designator          Text                AD
y       Year                    Year                1996; 96
M       Month in year           Month               July; Jul; 07
w       Week in year            Number              27
W       Week in month           Number              2
D       Day in year             Number              189
d       Day in month            Number              10
F       Day of week in month    Number              2
E       Day in week             Text                Tuesday; Tue
a       Am/pm marker            Text                PM
H       Hour in day (0-23)      Number              0
k       Hour in day (1-24)      Number              24
K       Hour in am/pm (0-11)    Number              0
h       Hour in am/pm (1-12)    Number              12
m       Minute in hour          Number              30
s       Second in minute        Number              55
S       Millisecond             Number              978
z       Time zone               General time zone   Pacific Standard Time; PST; GMT-08:00
Z       Time zone               RFC 822 time zone   -0800

Examples

The following examples show how date and time patterns are interpreted in the U.S. locale. The given date and time are 2001-07-04 12:08:56 local time in the U.S. Pacific Time time zone.
Date and Time Pattern               Result
"yyyy.MM.dd G 'at' HH:mm:ss z"      2001.07.04 AD at 12:08:56 PDT
"EEE, MMM d, ''yy"                  Wed, Jul 4, '01
"h:mm a"                            12:08 PM
"hh 'o''clock' a, zzzz"             12 o'clock PM, Pacific Daylight Time
"K:mm a, z"                         0:08 PM, PDT
"yyyyy.MMMMM.dd GGG hh:mm aaa"      02001.July.04 AD 12:08 PM
"EEE, d MMM yyyy HH:mm:ss Z"        Wed, 4 Jul 2001 12:08:56 -0700
"yyMMddHHmmssZ"                     010704120856-0700

This document is maintained by John Caron and was last updated on Nov 08, 2006

 

 
 
 