Dataset Inventory Catalog Specification Version 1.0
A THREDDS catalog is a way to describe an inventory of available datasets. These catalogs provide a simple hierarchical structure for organizing a collection of datasets, a means of accessing each dataset, a human-understandable name for each dataset, and a structure on which further descriptive information can be placed.
This document specifies the semantics of a THREDDS catalog, as well as its representation as an XML document.
<xsd:element name="catalog">
<xsd:complexType>
<xsd:sequence>
<xsd:element ref="service" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element ref="property" minOccurs="0" maxOccurs="unbounded" />
<xsd:choice minOccurs="1" maxOccurs="unbounded">
<xsd:element ref="dataset"/>
<xsd:element ref="catalogRef"/>
</xsd:choice>
</xsd:sequence>
<xsd:attribute name="name" type="xsd:string" />
<xsd:attribute name="expires" type="dateType"/>
<xsd:attribute name="version" type="xsd:token" default="1.0" />
</xsd:complexType>
</xsd:element>
The catalog element is the top-level element. It may contain zero or more service elements, followed by zero or more property elements, followed by one or more dataset or catalogRef elements. The optional name will be displayed to a user. The version attribute is deprecated and, if used, should be set to "1.0". The expires attribute tells clients how long this catalog remains accurate, so that they can cache the catalog information.
Example of simplest useful catalog:
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" >
<service name="aggServer" serviceType="DODS" base="http://acd.ucar.edu/dodsC/" />
<dataset name="SAGE III Ozone Loss" serviceName="aggServer" urlPath="sage.nc"/>
</catalog>
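The optional catalog attributes and the required element order can be sketched as follows. This is an illustration only: the catalog name, the expiry date, and the "maintainer" property are assumptions, not values from a real server.

```xml
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         name="Example ozone catalog"
         expires="2005-01-01T00:00:00Z">
  <!-- service elements come first -->
  <service name="aggServer" serviceType="DODS" base="http://acd.ucar.edu/dodsC/" />
  <!-- then property elements (hypothetical name/value pair) -->
  <property name="maintainer" value="data-team@example.org" />
  <!-- then one or more dataset or catalogRef elements -->
  <dataset name="SAGE III Ozone Loss" serviceName="aggServer" urlPath="sage.nc" />
</catalog>
```

A client that caches this catalog would be expected to re-read it once the expires date has passed.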
<xsd:element name="service">
<xsd:complexType>
<xsd:sequence>
<xsd:element ref="property" minOccurs="0" maxOccurs="unbounded" />
<xsd:element ref="service" minOccurs="0" maxOccurs="unbounded" />
</xsd:sequence>
<xsd:attribute name="name" type="xsd:string" use="required" />
<xsd:attribute name="base" type="xsd:string" use="required" />
<xsd:attribute name="serviceType" type="serviceTypes" use="required" />
<xsd:attribute name="desc" type="xsd:string"/>
<xsd:attribute name="suffix" type="xsd:string" />
</xsd:complexType>
</xsd:element>
A service element represents a data service. It must have a name unique for all service elements within the catalog (note that catalogs referenced by a catalogRef contain their own namespaces). It must have a base attribute and may have an optional suffix attribute which are used to construct the dataset URL (see constructing URLS). The base may be an absolute URL or relative to the catalog URL. It must have a serviceType attribute whose value is one of the serviceTypes values. The optional desc attribute allows you to give a human-readable description of the service.
Each access element must refer to a service element. Since typically there will be only a few service elements in a catalog but many dataset or access elements, a service element factors out the common properties of the data service for efficient representation within the catalog.
A service element may contain zero or more property elements.
Only a service element with serviceType="Compound" may have nested service elements. Use Compound services when you systematically offer more than one way to access a dataset (e.g. DODS and FTP), and the access URLs are the same except for the service base. Nested service elements may also be used directly by dataset or access elements, and so must have unique names.
When a catalog is written to describe the datasets from a particular data service, a useful idiom is to name the data service "this". However, a service called "this" has no special semantics.
Example:
<service name="mcidasServer" serviceType="ADDE" base="http://motherlode.ucar.edu:8080/adde/" />
Example with service base URL relative to catalog URL (see constructing URLS for how the resolved URL is created):
<service name="this" serviceType="DODS" base="dods/" />
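A Compound service, as described above, might look like the following sketch. The service names and base URLs are hypothetical, and the empty base on the outer service is an assumption made to satisfy the required attribute; each nested service supplies its own base.

```xml
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <!-- hypothetical compound service offering two access methods -->
  <service name="all" serviceType="Compound" base="">
    <service name="dodsServer" serviceType="DODS" base="http://example.org/dodsC/" />
    <service name="ftpServer" serviceType="FTP" base="ftp://example.org/data/" />
  </service>
  <!-- the same urlPath is reachable through both nested services -->
  <dataset name="SAGE III Ozone Loss" serviceName="all" urlPath="sage.nc" />
  <!-- a nested service can also be referenced directly by its unique name -->
  <dataset name="SAGE III Ozone Loss (FTP only)" serviceName="ftpServer" urlPath="sage.nc" />
</catalog>
```

Because nested services can be referenced directly, their names share the same uniqueness requirement as top-level services.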
<xsd:element name="dataset">
<xsd:complexType>
<xsd:sequence>
<xsd:element ref="service" minOccurs="0" maxOccurs="unbounded"/> <!-- deprecated -->
<xsd:group ref="threddsMetadataGroup" />
<xsd:element ref="access" minOccurs="0" maxOccurs="unbounded"/>
<xsd:choice minOccurs="0" maxOccurs="unbounded">
<xsd:element ref="dataset"/>
<xsd:element ref="catalogRef"/>
</xsd:choice>
</xsd:sequence>
<xsd:attribute name="name" type="xsd:string" use="required"/>
<xsd:attribute name="alias" type="xsd:token"/>
<xsd:attribute name="authority" type="xsd:string"/>
<xsd:attribute name="collectionType" type="collectionTypes"/>
<xsd:attribute name="dataType" type="dataTypes"/>
<xsd:attribute name="harvest" type="xsd:boolean"/>
<xsd:attribute name="ID" type="xsd:token"/>
<xsd:attribute name="serviceName" type="xsd:string" />
<xsd:attribute name="urlPath" type="xsd:token" />
</xsd:complexType>
</xsd:element>
A dataset element represents a named, logical set of data at a level of granularity appropriate for presentation to a user. A dataset is direct if it contains at least one access path, otherwise it is just a container for nested datasets, called a collection dataset. The name of the dataset element should be a human readable name that will be displayed to users. Multiple access methods specify different services for accessing the same dataset.
A dataset element contains any number of elements from the threddsMetadataGroup in any order. These are followed by 0 or more access elements, followed by 0 or more nested dataset or catalogRef elements. The data represented by a nested dataset or catalogRef element should be a subset, a specialization or in some other sense "contained" within the data represented by its parent dataset element. The use of service elements nested inside a dataset is deprecated; you should declare service elements at the top level of the catalog.
A dataset must have a name attribute, and may have other attributes. It's very important that the dataset name be descriptive but succinct, since it is what users see when they are presented with lists of datasets to select from. We strongly recommend that each dataset be given a unique ID. The collectionType attribute is used to indicate a coherent collection dataset, which has only one level of nested datasets. The dataType attribute tells clients how to present the data to the user. If the harvest attribute is true, then this dataset is available to be placed into digital libraries or other discovery services. Note that the harvest attribute should be carefully placed to get the right level of granularity for digital library entries, and is typically placed on collection datasets.
If you want the same dataset to appear in multiple places in the same catalog, use an alias attribute. Define the dataset in one place (with all appropriate metadata), then wherever else it should appear, make a dataset with an alias attribute whose value is the ID of the defined dataset. (Note that an alias may not refer to a dataset in another catalog referred to by a catalogRef element.) In this case, any other properties of the alias dataset are ignored, and the dataset to which the alias refers is used in its place.
A dataset may have a naming authority specified within itself or in a parent dataset. If a dataset has an ID and an authority attribute, then the combination of the two should be globally unique for all time. If the same dataset is specified in multiple catalogs, then its authority and ID should be identical if possible.
The serviceName and urlPath attributes on the dataset element are used for the common case that a dataset has a single access. The serviceName refers to the unique name of a service element. The urlPath is appended to the service's base to get the dataset URL (see constructing URLs). Logically, the use of these two attributes creates an access element for this dataset. When you have more than one way to access a dataset, either define them explicitly using more than one nested access element, or use a compound service.
Examples:
<dataset name="DC8 flight 1999-11-19" serviceName="agg" urlPath="SOLVE_DC8_19991119.nc"/>
<dataset ID="SOLVE_DC8_19991119" name="DC8 flight 1999-11-19, 1 min merge">
<metadata xlink:href="http://dataportal.ucar.edu/metadata/tracep_dc8_1min_05"/>
<access serviceName="disk" urlPath="SOLVE_DC8_19991119.nc"/>
</dataset>
An example using an alias; in this case the dataset referred to logically replaces the alias dataset.
<dataset name="Station Data">
<dataset name="Metar data" urlPath="cgi-bin/MetarServer.pl?format=qc" dataType="Station" />
<dataset name="Level 3 Radar data" urlPath="cgi-bin/RadarServer.pl?format=qc" dataType="Station"/>
<dataset name="Alias to SOLVE dataset" alias="SOLVE_DC8_19991119"/>
</dataset>
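When a dataset is reachable in more than one way, the serviceName/urlPath shorthand can be replaced by explicit access elements. The following is a minimal sketch, assuming that services named dodsServer and ftpServer are declared at the top of the catalog; the authority value is hypothetical.

```xml
<dataset name="DC8 flight 1999-11-19, 1 min merge"
         ID="SOLVE_DC8_19991119" authority="example.org">
  <!-- one access element per service; each refers to its enclosing dataset -->
  <access serviceName="dodsServer" urlPath="SOLVE_DC8_19991119.nc" />
  <access serviceName="ftpServer" urlPath="SOLVE_DC8_19991119.nc" dataFormat="NetCDF" />
</dataset>
```

Here the authority and ID together are intended to identify the dataset globally, so another catalog listing the same dataset would ideally repeat both values.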
<xsd:element name="access">
<xsd:complexType>
<xsd:sequence>
<xsd:element ref="dataSize" minOccurs="0"/>
</xsd:sequence>
<xsd:attribute name="urlPath" type="xsd:token" use="required"/>
<xsd:attribute name="serviceName" type="xsd:string"/>
<xsd:attribute name="dataFormat" type="dataFormatTypes"/>
</xsd:complexType>
</xsd:element>
An access element specifies how a dataset can be accessed through a data service. It may contain an optional dataSize element to specify how large the dataset would be if it were to be copied to the client. It always refers to the dataset that it is immediately contained within.
The serviceName refers to the unique name of a service element. The urlPath is appended to the service's base to get the dataset URL (see constructing URLs). The dataFormat is important when the serviceType is a bulk transport like FTP or HTTP, as it specifies the format of the transferred file. It is not needed for client/server protocols like DODS or ADDE.
Example:
<access serviceName="ftpServer" urlPath="SOLVE_DC8_19991119.nc" dataFormat="NetCDF" />
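An access element that also reports how large the transferred file would be could be written as in this sketch. The size value is invented for illustration.

```xml
<access serviceName="ftpServer" urlPath="SOLVE_DC8_19991119.nc" dataFormat="NetCDF">
  <!-- hypothetical size of the file if copied to the client -->
  <dataSize units="Mbytes">12.5</dataSize>
</access>
```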
<xsd:element name="catalogRef">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="documentation" type="documentationType" minOccurs="0" maxOccurs="unbounded" />
</xsd:sequence>
<xsd:attributeGroup ref="XLink" />
</xsd:complexType>
</xsd:element>
A catalogRef element refers to another catalog that becomes a dataset inside this catalog. This is used to separately maintain catalogs and to break up large catalogs. The referenced catalog should not be read until the user explicitly requests it, so that very large dataset collections can be represented with catalogRef elements without large delays in presenting them to the user. The referenced catalog is not textually substituted into the containing catalog, but remains a self-contained object. The referenced catalog must be a valid THREDDS catalog, but it does not have to match versions with the containing catalog.
The value of xlink:href is the URL of the referenced catalog. It may be absolute or relative to the catalog URL. The value of xlink:title is displayed as the name of the dataset that the user can click on to follow the XLink. You can add documentation elements to a catalogRef element.
Example:
<catalogRef xlink:title="NCEP Model Data" xlink:href="http://motherlode.ucar.edu:8080/catgen/uniModels.xml"/>
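A catalogRef may also use a URL relative to the containing catalog and carry nested documentation. In the sketch below, the relative path and the documentation text are assumptions; the XLink namespace is declared on the element only so that the fragment stands alone.

```xml
<catalogRef xmlns:xlink="http://www.w3.org/1999/xlink"
            xlink:title="NCEP Model Data"
            xlink:href="models/uniModels.xml">
  <!-- shown to the user before the referenced catalog is read -->
  <documentation>Separately maintained catalog of model output.</documentation>
</catalogRef>
```

If the containing catalog were served from http://example.org/thredds/catalog.xml, the relative href would be expected to resolve to http://example.org/thredds/models/uniModels.xml.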
<xsd:attributeGroup name="XLink">
<xsd:attribute ref="xlink:href" />
<xsd:attribute ref="xlink:title" />
<xsd:attribute ref="xlink:show"/>
<xsd:attribute ref="xlink:type" />
</xsd:attributeGroup>
These are attributes from the XLink specification that are used to point to another web resource. The xlink:href attribute is used for the URL of the resource itself. The xlink:title attribute is what should be displayed to the user. These are the only two attributes currently used in the THREDDS software. You can also add the xlink:type or xlink:show attributes.
Example:
<documentation xlink:href="http://cloud1.arc.nasa.gov/solve/" xlink:title="SOLVE home page"/>
These are catalog elements that are used in Digital Library entries and discovery centers, and for annotation and documentation of datasets.
<xsd:group name="threddsMetadataGroup">
<xsd:choice minOccurs="0" maxOccurs="unbounded">
<xsd:element name="documentation" type="documentationType"/>
<xsd:element ref="metadata" />
<xsd:element ref="property" />
<xsd:element ref="contributor"/>
<xsd:element name="creator" type="sourceType"/>
<xsd:element name="date" type="dateTypeFormatted"/>
<xsd:element name="keyword" type="controlledVocabulary" />
<xsd:element name="project" type="controlledVocabulary" />
<xsd:element name="publisher" type="sourceType"/>
<xsd:element ref="geospatialCoverage"/>
<xsd:element name="timeCoverage" type="timeCoverageType"/>
<xsd:element ref="variables"/>
<xsd:element name="dataType" type="dataTypes"/>
<xsd:element name="dataFormat" type="dataFormatTypes"/>
<xsd:element name="serviceName" type="xsd:string" />
<xsd:element name="authority" type="xsd:string" />
<xsd:element ref="dataSize"/>
</xsd:choice>
</xsd:group>
The elements in the threddsMetadataGroup may be used as nested elements of either a dataset or a metadata element. There may be any number of them in any order, but if more than one geospatialCoverage, timeCoverage, dataType, dataFormat, serviceName, or authority element is present, the extra ones will be ignored.
A documentation element contains (or points to) human-readable content that should be displayed to an end-user when making selections from the catalog. A metadata element is a container for machine-readable information structured in XML. A property element is an arbitrary name/value pair.
The next group of elements is used primarily in Digital Libraries. A contributor element is typically a person's name with a role attribute, documenting some person's contribution to the dataset. A creator element (aka originator) indicates who created the dataset. A date element is used to document various dates associated with the dataset, using one of the date type enumerations. A keyword element is used for library searches, while a project element specifies what scientific project the dataset belongs to. Both have type controlledVocabulary, which allows an optional vocabulary attribute to specify that you are using words from a restricted list, for example DIF. A publisher element indicates who is responsible for serving the dataset. Both the creator and publisher elements use the sourceType definition.
If your intention is to enable THREDDS to write entries into a Digital Library, you need to be aware of how elements are mapped to Digital Libraries. For example, you will probably need to add documentation elements, such as one with type summary, which will be the description of the dataset in the DL entry. Another documentation element you may need has type rights, which specifies what restrictions there are on the dataset usage.
The next group of elements are used in search services. The geospatialCoverage element specifies a lat/lon bounding box for the data. The timeCoverage element specifies the range of dates that the dataset covers. The variables element specifies the names of variables contained in the datasets, and ways to map the names to standard vocabularies.
The last group of elements is an alternative way to specify the dataType, dataFormat, serviceName, authority, and dataSize of a dataset. Also, by specifying these within a metadata element with its inherited attribute set to "true", you can specify these values in a collection dataset and have them apply to all nested datasets.
Examples:
<documentation type="summary"> The SAGE III Ozone Loss and Validation Experiment (SOLVE)
was a measurement campaign designed to examine the processes controlling ozone levels
at mid- to high latitudes. Measurements were made in the Arctic high-latitude
region in winter using the NASA DC-8 and ER-2 aircraft,
as well as balloon platforms and ground-based instruments. </documentation>
<documentation type="rights"> Users of these data files are expected to follow the NASA
ESPO Archive guidelines for use of the SOLVE data, including consulting with the PIs
of the individual measurements for interpretation and credit.
</documentation>
<keyword>Ocean Biomass</keyword>
<project vocabulary="DIF">NASA Earth Science Project Office, Ames Research Center</project>
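Setting serviceName, dataType, and authority once on a collection dataset, through a metadata element with inherited="true", could look like the following sketch. The service name, authority value, dataset IDs, and the second nested dataset are hypothetical; the service is assumed to be declared at the top of the catalog.

```xml
<dataset name="Station Data">
  <metadata inherited="true">
    <!-- these values logically become part of every nested dataset -->
    <serviceName>stationServer</serviceName>
    <dataType>Station</dataType>
    <authority>example.org</authority>
  </metadata>
  <!-- nested datasets only need a name, an ID, and a urlPath -->
  <dataset name="Metar data" ID="metar" urlPath="cgi-bin/MetarServer.pl?format=qc" />
  <dataset name="Synoptic data" ID="synoptic" urlPath="cgi-bin/SynopticServer.pl?format=qc" />
</dataset>
```

Each nested dataset then behaves as if it carried serviceName="stationServer" itself, so its urlPath can be resolved against that service's base.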
<xsd:complexType name="documentationType" mixed="true">
<xsd:sequence>
<xsd:any namespace="http://www.w3.org/1999/xhtml" minOccurs="0" maxOccurs="unbounded"
processContents="strict"/>
</xsd:sequence>
<xsd:attribute name="type" type="documentationEnumTypes"/>
<xsd:attributeGroup ref="XLink" />
</xsd:complexType>
<xsd:simpleType name="documentationEnumTypes">
<xsd:union memberTypes="xsd:token">
<xsd:simpleType>
<xsd:restriction base="xsd:token">
<xsd:enumeration value="funding"/>
<xsd:enumeration value="history"/>
<xsd:enumeration value="processing_level"/>
<xsd:enumeration value="rights"/>
<xsd:enumeration value="summary"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:union>
</xsd:simpleType>
An element of type documentationType may contain arbitrary plain text content, or XHTML. We call this kind of content "human readable" information. It has an optional type attribute, such as funding, history, summary, etc.
An element of type documentationType may also contain an XLink to an HTML or plain text web page. This allows you to point to external web references, and also allows you to factor out common documentation which can be referenced from multiple places. Note that it should not link to an XML page (unless it's XHTML); use the metadata element instead.
Examples:
<documentation xlink:href="http://espoarchive.nasa.gov/archive/index.html"
xlink:title="Earth Science Project Office Archives"/>
<documentation>Used in doubled CO2 scenario</documentation>
<xsd:element name="metadata">
<xsd:complexType>
<xsd:choice>
<xsd:group ref="threddsMetadataGroup" />
<xsd:any namespace="##other" minOccurs="0" maxOccurs="unbounded" processContents="strict"/>
</xsd:choice>
<xsd:attribute name="inherited" type="xsd:boolean" default="false" />
<xsd:attribute name="metadataType" type="metadataTypeEnums" />
<xsd:attributeGroup ref="XLink" />
</xsd:complexType>
</xsd:element>
A metadata element contains or refers to structured information (in XML) about datasets, which is used by client programs to display, describe, or search for the dataset. We call this kind of content "machine readable" information.
A metadata element contains any number of elements from the threddsMetadataGroup in any order, OR it contains any other well-formed XML elements, as long as they are in a namespace other than the THREDDS namespace. It may also contain an XLink to another XML document, whose top-level element should be a valid metadata element (see example below). Note that it should not link to an HTML page; use the documentation element instead.
The inherited attribute indicates whether the metadata is inherited by nested datasets. If true, the metadata element becomes logically part of each nested dataset.
The metadataType attribute may have any value, but the "well known" values are listed in the metadataType enumeration. To use metadata elements from the threddsMetadataGroup, do not include the metadataType attribute (or set it to "THREDDS"). To use your own elements, give it a metadataType, and add a namespace declaration (see example).
Examples:
<!-- contains Thredds metadata -->
<metadata inherited="true">
<contributor role="data manager">John Smith</contributor>
<keyword>Atmospheric Science</keyword>
<keyword>Aircraft Measurements</keyword>
<keyword>Upper Tropospheric Chemistry</keyword>
</metadata>
<!-- link to external file containing Thredds metadata -->
<metadata xlink:href="http://dataportal.ucar.edu/metadata/solveMetadata.xml" xlink:title="Solve metadata" />
If you use an XLink, it should point to a document whose top element is a metadata element, which declares the THREDDS namespace:
<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
<contributor role="Investigator">Mashor Mashnor</contributor>
<abstract>
This project aims to determine the physiological adaptations of algae to the
extreme conditions of Antarctica. </abstract>
<publisher>
<name vocabulary="DIF">AU/AADC</name>
<long_name vocabulary="DIF">Australian Antarctic Data Centre, Australia</long_name>
<contact url="http://www.aad.gov.au/default.asp?casid=3786" email="metadata@aad.gov.au"/>
</publisher>
</metadata>
When using elements from another namespace, all the subelements should be in the same namespace, which should be declared in the metadata element:
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>Goto considered harmful</dc:title>
<dc:description>The unbridled use of the go to statement has an immediate consequence that it becomes terribly hard to find a meaningful set of coordinates in which to describe the process progress. </dc:description>
<dc:author>Edsger W. Dijkstra</dc:author>
</metadata>
If you use an XLink to point to elements from another namespace, add a metadataType attribute:
<metadata xlink:href="http://www.unidata.ucar.edu/metadata/ncep/dif.xml" xlink:title="NCEP DIF metadata" metadataType="DublinCore"/>
which should point to a document whose top element is a metadata element, which declares a different namespace (note you also still need to declare the THREDDS namespace):
<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>Goto considered harmful</dc:title>
<dc:description>The unbridled use of the go to statement has an immediate consequence that it becomes terribly hard to find a meaningful set of coordinates in which to describe the process progress. </dc:description>
<dc:author>Edsger W. Dijkstra</dc:author>
</metadata>
This equivalent declaration makes the other namespace the default namespace:
<?xml version="1.0" encoding="UTF-8"?>
<cat:metadata xmlns:cat="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" xmlns="http://purl.org/dc/elements/1.1/">
<title>Goto considered harmful</title>
<description>The unbridled use of the go to statement has an immediate consequence that it becomes terribly hard to find a meaningful set of coordinates in which to describe the process progress. </description>
<author>Edsger W. Dijkstra</author>
</cat:metadata>
<xsd:element name="property">
<xsd:complexType>
<xsd:attribute name="name" type="xsd:string"/>
<xsd:attribute name="value" type="xsd:string"/>
</xsd:complexType>
</xsd:element>
Property elements are arbitrary name/value pairs to associate with a catalog, dataset or service element. Properties on datasets are added as global attributes to the THREDDS datamodel objects. More specialized semantics will be defined in the future.
Example:
<property name="Conventions" value="WRF" />
<xsd:complexType name="sourceType">
<xsd:sequence>
<xsd:element name="name" type="controlledVocabulary"/>
<xsd:element name="contact">
<xsd:complexType>
<xsd:attribute name="url" type="xsd:anyURI" use="required"/>
<xsd:attribute name="email" type="xsd:string" use="required"/>
</xsd:complexType> </xsd:element>
</xsd:sequence>
</xsd:complexType>
This is used by the creator and publisher elements to specify who is responsible for the dataset. It must have a name and a contact element. The name element has an optional vocabulary attribute if it comes from a controlled vocabulary. The contact element has attributes to specify a web URL and an email address.
Example:
<publisher>
<name vocabulary="DIF">UCAR/NCAR/CDP > Community Data Portal, National Center for Atmospheric Research, University Corporation for Atmospheric Research</name>
<contact url="http://dataportal.ucar.edu" email="cdp@ucar.edu"/>
</publisher>
<xsd:element name="contributor">
<xsd:complexType>
<xsd:simpleContent>
<xsd:extension base="xsd:string">
<xsd:attribute name="role" type="xsd:string" use="required"/>
</xsd:extension>
</xsd:simpleContent>
</xsd:complexType>
</xsd:element>
A contributor is simply a person's name with a role attribute (declared as required in the schema above) that specifies the role that the person plays with regard to this dataset.
Example:
<contributor role="PI">Jane Doe</contributor>
<xsd:element name="geospatialCoverage">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="northsouth" type="spatialRange" minOccurs="0" />
<xsd:element name="eastwest" type="spatialRange" minOccurs="0" />
<xsd:element name="updown" type="spatialRange" minOccurs="0" />
<xsd:element name="name" type="controlledVocabulary" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="zpositive" type="upOrDown" default="up"/>
</xsd:complexType>
</xsd:element>
<xsd:complexType name="spatialRange">
<xsd:sequence>
<xsd:element name="start" type="xsd:double" />
<xsd:element name="size" type="xsd:double" />
<xsd:element name="resolution" type="xsd:double" minOccurs="0" />
<xsd:element name="units" type="xsd:string" minOccurs="0" />
</xsd:sequence>
</xsd:complexType>
<xsd:simpleType name="upOrDown">
<xsd:restriction base="xsd:token">
<xsd:enumeration value="up"/>
<xsd:enumeration value="down"/>
</xsd:restriction>
</xsd:simpleType>
A geospatialCoverage element specifies a lat/lon bounding box, and an altitude range that the data covers.
The northsouth and eastwest elements should both be set to specify a lat/lon bounding box. The default units are degrees_north and degrees_east, respectively. The updown element specifies the altitude range, with default units in meters. A zpositive value of up means that z increases up, like units of height, while a value of down means that z increases downward, like units of pressure or depth. The spatialRange elements indicate that the range goes from start to start + size. Use the resolution element to indicate the data resolution.
You can optionally add any number of names to describe the covered region. An important special case is global coverage, where you should use the name global (see example below):
Example:
<geospatialCoverage zpositive="down">
<northsouth>
<start>10</start>
<size>80</size>
<resolution>2</resolution>
<units>degrees_north</units>
</northsouth>
<eastwest>
<start>-130</start>
<size>260</size>
<resolution>2</resolution>
<units>degrees_east</units>
</eastwest>
<updown>
<start>0</start>
<size>22</size>
<resolution>0.5</resolution>
<units>km</units>
</updown>
</geospatialCoverage>
<geospatialCoverage>
<name vocabulary="Thredds">global</name>
</geospatialCoverage>
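The "start to start + size" rule can be made concrete with a small sketch. The region and its name are hypothetical; units are omitted, so the defaults (degrees_north and degrees_east) are assumed to apply.

```xml
<geospatialCoverage>
  <northsouth>
    <start>-60</start>  <!-- southern edge -->
    <size>40</size>     <!-- northern edge = -60 + 40 = -20 -->
  </northsouth>
  <eastwest>
    <start>100</start>  <!-- western edge -->
    <size>60</size>     <!-- eastern edge = 100 + 60 = 160 -->
  </eastwest>
  <name>Hypothetical southern ocean region</name>
</geospatialCoverage>
```

Read this way, the first example above covers 10 to 90 degrees north, -130 to 130 degrees east, and 0 to 22 km.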
<xsd:complexType name="timeCoverageType">
<xsd:sequence>
<xsd:choice>
<xsd:sequence>
<xsd:element name="start" type="dateTypeFormatted" />
<xsd:element name="end" type="dateTypeFormatted" />
</xsd:sequence>
<xsd:sequence>
<xsd:element name="start" type="dateTypeFormatted" />
<xsd:element name="duration" type="duration"/>
</xsd:sequence>
<xsd:sequence>
<xsd:element name="end" type="dateTypeFormatted" />
<xsd:element name="duration" type="duration"/>
</xsd:sequence>
</xsd:choice>
<xsd:element name="resolution" type="duration" minOccurs="0" />
</xsd:sequence>
</xsd:complexType>
A timeCoverage element specifies a date range using start and end date elements, start date and duration elements, or end date and duration elements. The optional resolution element should be used to indicate the data resolution for time series data.
Example:
<timeCoverage>
<start>1999-11-16T12:00:00</start>
<end>present</end>
</timeCoverage>
<timeCoverage>
<start>1999-11-16T12:00:00</start>
<duration>P3M</duration> <!-- 3 months -->
</timeCoverage>
<timeCoverage> <!-- 10 days before the present up to the present -->
<end>present</end>
<duration>10 days</duration>
<resolution>15 minutes</resolution>
</timeCoverage>
<xsd:simpleType name="dateType">
<xsd:union memberTypes="xsd:date xsd:dateTime udunitDate">
<xsd:simpleType>
<xsd:restriction base="xsd:token">
<xsd:enumeration value="present"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:union>
</xsd:simpleType>
<xsd:simpleType name="udunitDate">
<xsd:restriction base="xsd:string">
<xsd:annotation>
<xsd:documentation>Must conform to complete udunits date string, eg "20 days since 1991-01-01"</xsd:documentation>
</xsd:annotation>
</xsd:restriction>
</xsd:simpleType>
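A dateType follows the W3C profile of ISO 8601 for date/time formats. Note that it is a simple type, so that it can be used as the type of an attribute. As the union in the schema above shows, it can be one of the following:
1. an xsd:date, with the form "CCYY-MM-DD"
2. an xsd:dateTime, with the form "CCYY-MM-DDThh:mm:ss", optionally followed by a time zone
3. a valid udunits date string
4. the string "present"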
Examples:
<start>1999-11-16</start>
<start>1999-11-16T12:00:00</start> <!-- implied UTC -->
<start>1999-11-16T12:00:00Z</start> <!-- explicit UTC -->
<start>1999-11-16T12:00:00-05:00</start> <!-- EST time zone specified -->
<start>20 days since 1991-01-01</start>
<start>present</start>
<xsd:complexType name="dateTypeFormatted">
<xsd:simpleContent>
<xsd:extension base="dateType">
<xsd:attribute name="format" type="xsd:string" /> <!-- from java.text.SimpleDateFormat -->
<xsd:attribute name="type" type="dateEnumTypes" />
</xsd:extension>
</xsd:simpleContent>
</xsd:complexType>
<xsd:simpleType name="dateEnumTypes">
<xsd:union memberTypes="xsd:token">
<xsd:simpleType>
<xsd:restriction base="xsd:token">
<xsd:enumeration value="created"/>
<xsd:enumeration value="modified"/>
<xsd:enumeration value="valid"/>
<xsd:enumeration value="issued"/>
<xsd:enumeration value="available"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:union>
</xsd:simpleType>
A dateTypeFormatted extends dateType by allowing an optional, user-defined format attribute and an optional type attribute. The format string follows the specification in java.text.SimpleDateFormat. The values of the type attribute are taken from the Dublin Core date types.
Example:
<start format="yyyy DDD" type="created">1999 189</start> <!-- year, day of year -->
Example Format String              Example Text
"yyyy.MM.dd G 'at' HH:mm:ss z"     2001.07.04 AD at 12:08:56 PDT
"EEE, MMM d, ''yy"                 Wed, Jul 4, '01
"K:mm a, z"                        0:08 PM, PDT
"yyyyy.MMMMM.dd GGG hh:mm aaa"     02001.July.04 AD 12:08 PM
"EEE, d MMM yyyy HH:mm:ss Z"       Wed, 4 Jul 2001 12:08:56 -0700
"yyMMddHHmmssZ"                    010704120856-0700
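Because the date element in the threddsMetadataGroup has type dateTypeFormatted, the type and format attributes can be combined on a dataset as in this sketch. The dates themselves are invented for illustration.

```xml
<dataset name="DC8 flight 1999-11-19" ID="SOLVE_DC8_19991119">
  <!-- plain xsd:date, no format attribute needed -->
  <date type="created">1999-11-19</date>
  <!-- xsd:dateTime with explicit UTC -->
  <date type="modified">2001-07-04T12:08:56Z</date>
  <!-- user-defined format: year and day of year (day 185 of 2001 is July 4) -->
  <date type="issued" format="yyyy DDD">2001 185</date>
</dataset>
```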
<xsd:simpleType name="duration">
<xsd:union memberTypes="xsd:duration udunitDuration" />
</xsd:simpleType>
<xsd:simpleType name="udunitDuration">
<xsd:restriction base="xsd:string">
<xsd:annotation>
<xsd:documentation>Must conform to udunits time duration, eg "20.1 hours" </xsd:documentation>
</xsd:annotation>
</xsd:restriction>
</xsd:simpleType>
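A duration type can be one of the following, as given by the union in the schema above:
1. an xsd:duration, with the form "PnYnMnDTnHnMnS"
2. a valid udunits time duration string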
Example:
<duration>P5Y2M10DT15H</duration>
<duration>5 days</duration>
<xsd:element name="dataSize">
<xsd:complexType>
<xsd:simpleContent>
<xsd:extension base="xsd:string">
<xsd:attribute name="units" type="xsd:string" use="required"/>
</xsd:extension>
</xsd:simpleContent>
</xsd:complexType>
</xsd:element>
A dataSize element is just a number with a units attribute, which should be "bytes", "Kbytes", "Mbytes", "Gbytes" or "Tbytes".
Example:
<dataSize units="Kbytes">123</dataSize>
<xsd:complexType name="controlledVocabulary">
<xsd:simpleContent>
<xsd:extension base="xsd:string">
<xsd:attribute name="vocabulary" type="xsd:string" />
</xsd:extension>
</xsd:simpleContent>
</xsd:complexType>
A controlledVocabulary simply adds an optional vocabulary attribute to the string-valued element, indicating that the value comes from a restricted list.
Example:
<name vocabulary="DIF">UCAR/NCAR/CDP</name>
<xsd:element name="variables">
<xsd:complexType>
<xsd:choice>
<xsd:element ref="variable" minOccurs="0" maxOccurs="unbounded" />
<xsd:element ref="variableMap" minOccurs="0" />
</xsd:choice>
<xsd:attribute name="vocabulary" type="variableNameVocabulary" use="required"/>
<xsd:attributeGroup ref="XLink" />
</xsd:complexType>
</xsd:element>
<xsd:element name="variable">
<xsd:complexType>
<xsd:attribute name="name" type="xsd:string" use="required"/>
<xsd:attribute name="vocabulary_name" type="xsd:string" use="required"/>
<xsd:attribute name="units" type="xsd:string" />
</xsd:complexType>
</xsd:element>
<xsd:element name="variableMap">
<xsd:complexType>
<xsd:attributeGroup ref="XLink" use="required"/>
</xsd:complexType>
</xsd:element>
A variables element contains a list of variables OR a variableMap element that refers to another document that contains a list of variables. This list specifies the variables (aka fields or parameters) that are available in the dataset, and associates them with a standard vocabulary of names, through the vocabulary attribute. The optional XLink is a reference to an online resource describing the standard vocabulary. The possible format(s) of this online resource is yet to be determined.
Each variable element contains the name of the variable in the dataset, and its name in the standard vocabulary, as well as an optional units attribute. A variableMap element contains an XLink to variable elements, so that you can factor these out and refer to them from multiple places.
The main purpose of the variables element is to describe a dataset for a search service or digital library. For example, GCMD requires a list of dataset "Parameter Valids" from their controlled vocabulary. A client might want to show those "standard variable names" to a user, since the names may be more meaningful than the actual variable names.
Examples:
<variables vocabulary="CF-1.0">
<variable name="wv" vocabulary_name="Wind Speed" units="m/s"/>
<variable name="wdir" vocabulary_name="Wind Direction" units="degrees"/>
<variable name="o3c" vocabulary_name="Ozone Concentration" units="g/g"/>
</variables>
<variables vocabulary="GRIB-NCEP" xlink:href="http://www.unidata.ucar.edu//GRIB-NCEPtable2.xml">
<variableMap xlink:href="../standardQ/Eta.xml" />
</variables>
A variableMap should point to an XML document with a top-level variables element with the THREDDS namespace declared:
<?xml version="1.0" encoding="UTF-8"?>
<variables xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" >
<variable name="wv" vocabulary_name="Wind Speed" units="m/s"/>
<variable name="wdir" vocabulary_name="Wind Direction" units="degrees"/>
<variable name="o3c" vocabulary_name="Ozone Concentration" units="g/g"/> ...
</variables>
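To illustrate how a client could read such a document, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name read_variables and the returned dictionary layout are assumptions made for this example, not part of THREDDS; the XML declaration is omitted from the sample string for brevity.

```python
import xml.etree.ElementTree as ET

THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"

# Sample document modeled on the example above.
VARIABLES_DOC = """<variables xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <variable name="wv" vocabulary_name="Wind Speed" units="m/s"/>
  <variable name="wdir" vocabulary_name="Wind Direction" units="degrees"/>
  <variable name="o3c" vocabulary_name="Ozone Concentration" units="g/g"/>
</variables>"""


def read_variables(xml_text):
    """Map each dataset variable name to (vocabulary_name, units)."""
    root = ET.fromstring(xml_text)
    # The top-level element must be <variables> in the THREDDS namespace.
    if root.tag != f"{{{THREDDS_NS}}}variables":
        raise ValueError("expected a top-level variables element in the THREDDS namespace")
    mapping = {}
    for var in root.findall(f"{{{THREDDS_NS}}}variable"):
        # units is optional, so it may be None.
        mapping[var.get("name")] = (var.get("vocabulary_name"), var.get("units"))
    return mapping


variables = read_variables(VARIABLES_DOC)
```

A client could use the resulting mapping to show the standard vocabulary names to a user in place of the short variable names stored in the dataset.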
The remaining definitions are all enumerations of "well-known" values. Note that for all of these, any token is a legal value. However, standard software is likely to understand only the values that are explicitly listed. We encourage you to use these well-known values if possible, and to submit new values to the THREDDS mailgroup for inclusion in future versions of this schema.
<xsd:simpleType name="collectionTypes">
<xsd:union memberTypes="xsd:token">
<xsd:simpleType>
<xsd:restriction base="xsd:token">
<xsd:enumeration value="TimeSeries"/>
<xsd:enumeration value="Stations"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:union>
</xsd:simpleType>
These are the types of coherent dataset collections, used in a dataset element. This will be elaborated in future versions.
<!-- DataFormat Types -->
<xsd:simpleType name="dataFormatTypes">
<xsd:union memberTypes="xsd:token mimeType">
<xsd:simpleType>
<xsd:restriction base="xsd:token">
<xsd:enumeration value="BUFR"/>
<xsd:enumeration value="ESML"/>
<xsd:enumeration value="Gempak"/>
<xsd:enumeration value="GRIB-1"/>
<xsd:enumeration value="GRIB-2"/>
<xsd:enumeration value="HDF4"/>
<xsd:enumeration value="HDF5"/>
<xsd:enumeration value="NcML"/>
<xsd:enumeration value="NetCDF"/>
<xsd:enumeration value="image/gif"/>
<xsd:enumeration value="image/jpeg"/>
<xsd:enumeration value="image/tiff"/>
<xsd:enumeration value="text/plain"/>
<xsd:enumeration value="text/tab-separated-values"/>
<xsd:enumeration value="text/xml"/>
<xsd:enumeration value="video/mpeg"/>
<xsd:enumeration value="video/quicktime"/>
<xsd:enumeration value="video/realtime"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:union>
</xsd:simpleType>
<xsd:simpleType name="mimeType">
<xsd:restriction base="xsd:token">
<xsd:annotation>
<xsd:documentation>any valid mime type (see http://www.iana.org/assignments/media-types/) </xsd:documentation>
</xsd:annotation>
</xsd:restriction>
</xsd:simpleType>
These describe the data formats, used in an access attribute or dataset element, when the service is a bulk transport (like FTP) and the client has to know how to read the downloaded dataset file. ESML stands for Earth System Markup Language, a standard way to describe binary data files. The NetCDF type is for NetCDF files, and NcML is a way to annotate and extend NetCDF files with the NetCDF Markup Language. HDF4 is for HDF4 and HDF-EOS files, while HDF5 is for HDF5 formatted files. GRIB-1, GRIB-2 and BUFR are WMO's GRIB version 1, version 2 and BUFR data formats, respectively. Gempak refers to GEMPAK formatted files.
In addition to the file formats explicitly listed, you can use a MIME type. We have also listed the ones that seem likely to be relevant.
You can also use your own scientific file format; send it to us and we will add it to this list (check first whether it is already a MIME type).
Examples:
<dataFormat>NcML</dataFormat>
<dataFormat>image/gif</dataFormat>
<dataFormat>image/jpeg</dataFormat>
<dataFormat>image/png</dataFormat>
<dataFormat>video/mpeg</dataFormat>
<dataFormat>video/quicktime</dataFormat>
<xsd:simpleType name="dataTypes">
<xsd:union memberTypes="xsd:token">
<xsd:simpleType>
<xsd:restriction base="xsd:token">
<xsd:enumeration value="Grid"/>
<xsd:enumeration value="Image"/>
<xsd:enumeration value="Station"/>
<xsd:enumeration value="Swath"/>
<xsd:enumeration value="Trajectory"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:union>
</xsd:simpleType>
These are the data types of the datasets, which are used by clients to know how to display the data. This will be elaborated in future versions.
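Because each of these enumerations is a union with xsd:token, any token validates, but only the listed values are "well-known". A minimal sketch of how a client might tell the two apart for dataTypes follows. The function name is hypothetical, and "Radial" in the usage line is only an illustrative unlisted value.

```python
# Well-known values copied from the dataTypes enumeration above.
WELL_KNOWN_DATA_TYPES = {"Grid", "Image", "Station", "Swath", "Trajectory"}


def describe_data_type(value):
    """Return (token, is_well_known) for a dataType value.

    Any non-empty token is legal, so an unlisted value is not an error;
    it only means standard software may not know how to display the data.
    """
    # xsd:token collapses runs of whitespace and trims both ends.
    token = " ".join(value.split())
    if not token:
        raise ValueError("a dataType must not be empty")
    return token, token in WELL_KNOWN_DATA_TYPES


known = describe_data_type("Grid")
unlisted = describe_data_type("  Radial ")  # "Radial" is an illustrative value
```

The same pattern applies to the other enumerations in this section by substituting the corresponding set of listed values.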
<xsd:simpleType name="dateEnumTypes">
<xsd:union memberTypes="xsd:token">
<xsd:simpleType>
<xsd:restriction base="xsd:token">
<xsd:enumeration value="created"/>
<xsd:enumeration value="modified"/>
<xsd:enumeration value="valid"/>
<xsd:enumeration value="issued"/>
<xsd:enumeration value="available"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:union>
</xsd:simpleType>
These are the well-known types for a date element, taken from the Dublin Core metadata set.
<xsd:simpleType name="documentationEnumTypes">
<xsd:union memberTypes="xsd:token">
<xsd:simpleType>
<xsd:restriction base="xsd:token">
<xsd:enumeration value="funding"/>
<xsd:enumeration value="history"/>
<xsd:enumeration value="processing_level"/>
<xsd:enumeration value="rights"/>
<xsd:enumeration value="summary"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:union>
</xsd:simpleType>
These are the well-known types for the documentation element.
<xsd:simpleType name="metadataTypeEnum">
<xsd:union memberTypes="xsd:token">
<xsd:simpleType>
<xsd:restriction base="xsd:token">
<xsd:enumeration value="THREDDS"/>
<xsd:enumeration value="ADN"/>
<xsd:enumeration value="Aggregation"/>
<xsd:enumeration value="CatalogGenConfig"/>
<xsd:enumeration value="DublinCore"/>
<xsd:enumeration value="DIF"/>
<xsd:enumeration value="FGDC"/>
<xsd:enumeration value="LAS"/>
<xsd:enumeration value="NetCDF"/>
<xsd:enumeration value="ESG"/>
<xsd:enumeration value="Other"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:union>
</xsd:simpleType>
These are the well-known types for the metadata element.
<xsd:simpleType name="serviceTypes">
<xsd:union memberTypes="xsd:token">
<xsd:simpleType>
<xsd:restriction base="xsd:token">
<!-- client/server -->
<xsd:enumeration value="ADDE"/>
<xsd:enumeration value="DODS"/> <!-- same as OpenDAP -->
<xsd:enumeration value="OpenDAP"/>
<xsd:enumeration value="OpenDAP-G"/>
<!-- bulk transport -->
<xsd:enumeration value="HTTPServer"/>
<xsd:enumeration value="FTP"/>
<xsd:enumeration value="GridFTP"/>
<xsd:enumeration value="File"/>
<!-- web services -->
<xsd:enumeration value="LAS"/>
<xsd:enumeration value="WMS"/>
<xsd:enumeration value="WFS"/>
<xsd:enumeration value="WCS"/>
<xsd:enumeration value="WSDL"/>
<!--offline -->
<xsd:enumeration value="WebForm"/>
<!-- THREDDS -->
<xsd:enumeration value="Catalog"/>
<xsd:enumeration value="QueryCapability"/>
<xsd:enumeration value="Resolver"/>
<xsd:enumeration value="Compound"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:union>
</xsd:simpleType>
These are the known service types, used in a service element, that indicate how to access a dataset. A serviceType is similar to, but not generally the same as, the scheme of a URI, like http:, ftp:, file:, etc. In general, the combination of the serviceType and the dataFormat is intended to be sufficient for a client to access and read the dataset. Additional information can be encoded in service properties.
The OpenDAP and ADDE service types correspond to the OpenDAP and ADDE data access protocols. These are client/server protocols that specify both the access (or transport) protocol and the data model, so no separate dataFormat attribute is needed. DODS is a synonym for OpenDAP; OpenDAP-G corresponds to OpenDAP over GridFTP.
The next set of service types are all bulk transfer protocols, and you need to also specify the dataFormat for datasets that use these. FTP is the well-known File Transfer Protocol, and GridFTP is a variant of that used by the Globus Data Grid. The File service is for local files, used for local catalogs or in situations like DODS Aggregation Server configuration. A File dataset is not readable by remote clients. HTTPServer should be used when your file is being served by an HTTP (Web) Server. This is used for bulk transfer just like FTP, and also can be used by the Java-NetCDF library to access NetCDF files remotely (in that case just make sure that the dataset has dataFormatType NetCDF or NcML).
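The distinction drawn in the last two paragraphs can be sketched as a small lookup. This is an illustration only: the function name is hypothetical, and placing OpenDAP-G with the client/server protocols follows the grouping comments in the schema above rather than an explicit statement in the text.

```python
# Groupings follow the comments in the serviceTypes schema above.
CLIENT_SERVER = {"ADDE", "DODS", "OpenDAP", "OpenDAP-G"}
BULK_TRANSPORT = {"HTTPServer", "FTP", "GridFTP", "File"}


def needs_data_format(service_type):
    """Say whether a dataFormat should accompany this service type.

    Returns True for bulk transport services, False for client/server
    protocols that define their own data model, and None for service
    types that these two paragraphs do not cover.
    """
    if service_type in BULK_TRANSPORT:
        return True
    if service_type in CLIENT_SERVER:
        return False
    return None


ftp_needs_format = needs_data_format("FTP")
opendap_needs_format = needs_data_format("OpenDAP")
```

A catalog checker could use such a lookup to warn when a dataset served over FTP or HTTPServer has no dataFormat.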
The LAS service type is for connection to Live Access Servers. WMS, WFS and WCS are for the Web Map, Feature, and Coverage Servers, respectively, from the OpenGIS Consortium. These are still experimental servers, at least for THREDDS. WSDL corresponds to a server using the Web Services Description Language to specify its data services. We do not yet have an example of that within THREDDS.
The WebForm service indicates that the dataset URL will take you to an HTML page where you can presumably order the data in some way, to be delivered later. It's still a good idea to specify the dataset dataFormatType.
The last set of service types are THREDDS-defined types. The Catalog, QueryCapability, and Resolver types all return XML documents over HTTP. These are generally handled internally by THREDDS widgets. A Compound service just indicates that the service is composed of other services.
<xsd:simpleType name="variableNameVocabulary">
<xsd:union memberTypes="xsd:string">
<xsd:simpleType>
<xsd:restriction base="xsd:token">
<xsd:enumeration value="CF"/>
<xsd:enumeration value="DIF"/>
<xsd:enumeration value="GRIB-1"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:union>
</xsd:simpleType>
These are the known vocabularies for standard variable names, used in the variables element. CF refers to the Climate and Forecast (CF) metadata conventions for netCDF, which include a list of standard variable names. DIF is the Directory Interchange Format from NASA's Global Change Master Directory, which has a controlled variable classification scheme. The World Meteorological Organization's GRIB (version 1) data file format defines a set of standard parameters.
You can also use another vocabulary name; send it to us and we will add it to this list.
The standard way to construct a dataset URL is to concatenate the service base with the access urlPath. If the service also has a suffix, that is appended:
URL = service.base + access.urlPath + service.suffix
A common mistake is to forget the trailing slash at the end of the service base URL.
Clients have access to each of these elements and may make use of the URL in protocol-specific ways. For example, the OpenDAP (DODS) protocol appends .dds, .das, .dods, etc. to the URL to make the actual calls to the OpenDAP server.
When a service base is a relative URL, it is resolved against the catalog URL. For example if the catalog URL is http://motherlode.ucar.edu:8080/thredds/dodsC/catalog.xml, and a service base is airtemp/, then the resolved base is http://motherlode.ucar.edu:8080/thredds/dodsC/airtemp/. Note that if the service base is /airtemp/, the resolved URL is http://motherlode.ucar.edu:8080/airtemp/. The java.net.URI class in JDK 1.4+ will resolve relative URLs.
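The construction and resolution rules above can be sketched in a few lines. This example uses Python's urllib.parse.urljoin in place of java.net.URI; the function name dataset_url is hypothetical, and the file name sage.nc is borrowed from the catalog example earlier in this document purely for illustration.

```python
from urllib.parse import urljoin

CATALOG_URL = "http://motherlode.ucar.edu:8080/thredds/dodsC/catalog.xml"


def dataset_url(catalog_url, service_base, url_path, service_suffix=""):
    """Build URL = service.base + access.urlPath + service.suffix.

    A relative service base is first resolved against the catalog URL;
    an absolute base is returned unchanged by urljoin.
    """
    resolved_base = urljoin(catalog_url, service_base)
    return resolved_base + url_path + service_suffix


# A relative base resolves against the directory of the catalog URL.
relative = urljoin(CATALOG_URL, "airtemp/")
# A base with a leading slash resolves against the server root.
rooted = urljoin(CATALOG_URL, "/airtemp/")
# An absolute base ignores the catalog URL entirely.
absolute = dataset_url(CATALOG_URL, "http://acd.ucar.edu/dodsC/", "sage.nc")
# An OpenDAP client would then append a protocol-specific extension.
dds_request = absolute + ".dds"
```

Because the pieces are joined by plain concatenation, a service base without its trailing slash would run the base and urlPath together, which is the common mistake noted above.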
THREDDS Dataset Inventory Catalogs organize and describe collections of data. A catalog can be thought of as a logical directory of data resources available via the Internet. A dataset may be a direct dataset (describes how to directly access data over the Internet), a collection dataset (contains other datasets) or a dynamic dataset (content is generated by a call to a server).
A direct dataset has an access URL and a service type like FTP, DODS, ADDE, etc. that allows a THREDDS-enabled application to directly access its data, using the specified service's protocol. It is represented simply by a <dataset> element.
A collection dataset is represented by a <dataset> with nested <dataset> elements. We distinguish two types:
Both direct datasets and coherent collections are datasets that an application might want to act on, e.g. visualize, so we'll call them application datasets.
A dynamic dataset has an access URL and a service type Catalog, Resolver, or QueryCapability. Its contents are typically generated dynamically by making a call to a server, and describe datasets that are constantly changing, and/or are too large to list exhaustively.
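As a rough sketch of the three kinds of dataset described above, the following classifies a dataset from a few facts about it. The function and parameter names are assumptions made for this example and do not come from the THREDDS library; the order in which the rules are checked is also a choice made here, since a collection may itself have an access URL.

```python
# Service types that mark a dataset as dynamic, per the paragraph above.
DYNAMIC_SERVICE_TYPES = {"Catalog", "Resolver", "QueryCapability"}


def classify_dataset(service_type=None, has_access_url=False, nested_count=0):
    """Return "dynamic", "collection", "direct", or "unknown"."""
    # Dynamic: an access URL whose service returns generated content.
    if has_access_url and service_type in DYNAMIC_SERVICE_TYPES:
        return "dynamic"
    # Collection: a dataset element with nested dataset elements.
    if nested_count > 0:
        return "collection"
    # Direct: an access URL with a data access service such as FTP or DODS.
    if has_access_url:
        return "direct"
    return "unknown"


direct_kind = classify_dataset(service_type="DODS", has_access_url=True)
dynamic_kind = classify_dataset(service_type="Resolver", has_access_url=True)
collection_kind = classify_dataset(nested_count=3)
```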
A query dataset looks a lot like a catalogRef, since you dereference a URL and get a catalog back. However, a catalogRef is cacheable, but a query dataset is inherently dynamic, so is not cacheable.
It's important to distinguish a THREDDS dataset from its access URL. A dataset can have multiple ways of being accessed, and so have multiple access URLs. But even in the simple case that a dataset has one access URL, the dataset potentially contains metadata that is not stored with the data pointed to by its access URL. In order to use the full power of THREDDS, you must work with the full dataset object, not just with its access URL.
A THREDDS dataset is an abstract object, containing various properties and other objects, as described in this document, along with their semantics. One implementation of THREDDS datasets can be found in the THREDDS catalog Java library. This document also describes one representation of THREDDS datasets using XML, and the catalog library does serialization between its dataset objects and their XML encoding.
To make a dataset into a Web resource, the dataset needs a URL that refers to it, distinct from its access URL(s). One way to do this is to use XPath to reference the dataset XML element inside the catalog. However, because of inheritance and other complexities, it is not trivial to extract its complete XML representation separated from the other datasets in a catalog. The THREDDS dataset subsetting service allows you to obtain a catalog that contains just a selected dataset (and nested objects), semantically equivalent to the original dataset. It currently requires that the selected dataset have an ID, but it can be generalized to handle any XPath expression. The syntax of that service is:
http://<host>/thredds/subset?catalog=<catalog>&dataset=<ID>
Example:
http://motherlode.ucar.edu:8080/thredds/subset?dataset=Station
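A request URL following the syntax above could be assembled as in this minimal sketch. The function name subset_url and the catalog value catalog.xml are hypothetical; the exact form the service expects for the catalog parameter is not stated here, and percent-encoding the parameter values with urlencode is an assumption.

```python
from urllib.parse import urlencode


def subset_url(host, catalog, dataset_id):
    """Build http://<host>/thredds/subset?catalog=<catalog>&dataset=<ID>."""
    # urlencode percent-encodes reserved characters in the values.
    query = urlencode({"catalog": catalog, "dataset": dataset_id})
    return f"http://{host}/thredds/subset?{query}"


# "catalog.xml" is an illustrative catalog value, not taken from this document.
url = subset_url("motherlode.ucar.edu:8080", "catalog.xml", "Station")
```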
July 07, 2004. Clarify the relation of dateType to ISO 8601.
June 1, 2005. Clean up some of the anchors, as well as the index.