TDS catalog element Specification, Version 1.0.1
The THREDDS Data Server (TDS) uses specialized catalogs as configuration documents. Several elements have been added to the InvCatalog schema to allow for this server-side usage.
This document specifies the semantics and XML representation of the server-side specializations allowed in THREDDS catalogs.
Change History:
<xsd:element name="datasetScan" substitutionGroup="dataset">
<xsd:complexType>
<xsd:complexContent>
<xsd:extension base="DatasetType">
<xsd:sequence>
<xsd:element ref="filter" minOccurs="0" />
<xsd:element ref="addID" minOccurs="0" />
<xsd:element ref="namer" minOccurs="0" />
<xsd:element ref="sort" minOccurs="0" />
<xsd:element ref="addLatest" minOccurs="0" />
<xsd:element ref="addProxies" minOccurs="0" />
<xsd:element name="addDatasetSize" minOccurs="0" />
<xsd:element ref="addTimeCoverage" minOccurs="0" />
</xsd:sequence>
<xsd:attribute name="path" type="xsd:string" use="required"/>
<xsd:attribute name="location" type="xsd:string"/>
<xsd:attribute name="dirLocation" type="xsd:string"/> <!-- deprecated : use location attribute -->
<xsd:attribute name="filter" type="xsd:string"/> <!-- deprecated : use filter element -->
<xsd:attribute name="addDatasetSize" type="xsd:boolean"/> <!-- deprecated : use enhance/addDatasetSize element -->
<xsd:attribute name="addLatest" type="xsd:boolean"/> <!-- deprecated : use addLatest element -->
<xsd:attribute name="addId" type="xsd:boolean"/> <!-- deprecated : use addID element -->
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>
</xsd:element>
The datasetScan element allows for the generation of nested THREDDS catalogs by scanning the dataset collection location named in the location attribute. The path attribute is used to map dataset and catalog requests to a given datasetScan.
A datasetScan element is in the dataset substitutionGroup, so it can be used wherever a dataset element can be used. It is an extension of DatasetType, so any of dataset's nested elements and attributes can be used in it. This allows you to add enhanced metadata to a datasetScan. However, you should not add nested datasets, as these will be ignored.
By default, each generated catalog will include all datasets at the requested level of the given dataset collection location. Each collection (directory) dataset will be included as a catalogRef element and each atomic (file) dataset will be included as a dataset element. The name of the resulting dataset or catalogRef will be the name of the corresponding dataset. No metadata will be added other than that contained in the datasetScan element, which will be added as appropriate at the different levels of the given dataset collection location depending on whether it is inherited metadata or not.
The datasetScan-specific nested elements (filter, addID, namer, sort, addLatest, addProxies, addDatasetSize, and addTimeCoverage) can be used to modify the default behavior or add metadata.
This very simple example:
<datasetScan name="GRIB2 Data" path="grib2" location="C:/data/grib2/" >
<dataFormat>GRIB-2</dataFormat>
</datasetScan >
might result in the following catalog:
<catalog ...>
<service name="myserv" ... />
<dataset name="GRIB2 Data">
<metadata inherited="true"><serviceName>myserv</serviceName></metadata>
<dataset name="data1.wmo" urlPath="data1.wmo" />
<dataset name="data2.wmo" urlPath="data2.wmo" />
<dataset name="readme.txt" urlPath="readme.txt" />
<catalogRef xlink:title="test" xlink:href="test" name="" />
</dataset>
</catalog>
<xsd:element name="filter">
<xsd:complexType>
<xsd:choice>
<xsd:sequence minOccurs="0" maxOccurs="unbounded">
<xsd:element name="include" type="FilterSelectorType" minOccurs="0"/>
<xsd:element name="exclude" type="FilterSelectorType" minOccurs="0"/>
</xsd:sequence>
</xsd:choice>
</xsd:complexType>
</xsd:element>
<xsd:complexType name="FilterSelectorType">
<xsd:attribute name="regExp" type="xsd:string"/>
<xsd:attribute name="wildcard" type="xsd:string"/>
<xsd:attribute name="atomic" type="xsd:boolean"/>
<xsd:attribute name="collection" type="xsd:boolean"/>
</xsd:complexType>
The filter element allows users to specify which datasets are to be included in the generated catalogs. A filter element can contain any number of include and exclude elements. Each include or exclude element may contain either a wildcard or a regExp attribute. If the given wildcard pattern or regular expression matches a dataset name, that dataset is included or excluded as specified. By default, includes and excludes apply only to atomic datasets (regular files). You can specify that they apply to atomic and/or collection datasets (directories) by using the atomic and collection attributes.
Expanding on the above example:
<datasetScan name="GRIB2 Data" path="grib2" location="C:/data/grib2/" >
<dataFormat>GRIB-2</dataFormat>
<filter>
<include wildcard="*.wmo" />
</filter>
</datasetScan >
results in:
<catalog ...>
<service name="myserv" ... />
<dataset name="GRIB2 Data">
<metadata inherited="true"><serviceName>myserv</serviceName></metadata>
<dataset name="data1.wmo" urlPath="data1.wmo" />
<dataset name="data2.wmo" urlPath="data2.wmo" />
</dataset>
</catalog>
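As a further sketch, include and exclude elements can be combined, and the atomic and collection attributes can direct a selector at directories instead of files. The fragment below is illustrative only: the "test" directory name is borrowed from the first example catalog, and how the TDS treats a dataset matched by both an include and an exclude is not covered here.

```xml
<filter>
  <!-- Include atomic datasets (regular files) whose names end in .wmo -->
  <include wildcard="*.wmo" />
  <!-- Exclude the collection dataset (directory) named "test";
       atomic="false" keeps this selector from applying to files -->
  <exclude regExp="^test$" atomic="false" collection="true" />
</filter>
```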
More examples are available in the TDS datasetScan documentation.
<xsd:element name="addID" />
The addID element specifies that a datasetScan should add an ID attribute to each dataset element included in a resulting catalog.
The TDS adds ID attributes by default even if no addID element is given in the datasetScan. The IDs are constructed by concatenating the relative path of the generated dataset to either the datasetScan ID (if it exists) or the datasetScan path.
So the example results from the filter section above would more accurately be:
<catalog ...>
<service name="myserv" ... />
<dataset name="GRIB2 Data" ID="grib2">
<metadata inherited="true"><serviceName>myserv</serviceName></metadata>
<dataset name="data1.wmo" ID="grib2/data1.wmo" urlPath="data1.wmo" />
<dataset name="data2.wmo" ID="grib2/data2.wmo" urlPath="data2.wmo" />
</dataset>
</catalog>
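To illustrate the other case, the sketch below assumes the datasetScan carries its own ID attribute (the value "GRIB2" is hypothetical). Following the rule described above, the generated IDs would presumably be built from that ID rather than from the path; the "/" separator is assumed from the previous example.

```xml
<!-- Configuration: the datasetScan has an explicit (hypothetical) ID -->
<datasetScan name="GRIB2 Data" ID="GRIB2" path="grib2" location="C:/data/grib2/">
  <dataFormat>GRIB-2</dataFormat>
  <filter>
    <include wildcard="*.wmo" />
  </filter>
</datasetScan>

<!-- Expected shape of the generated datasets, assuming each ID is
     prefixed with the datasetScan ID instead of the datasetScan path -->
<dataset name="GRIB2 Data" ID="GRIB2">
  <dataset name="data1.wmo" ID="GRIB2/data1.wmo" urlPath="data1.wmo" />
  <dataset name="data2.wmo" ID="GRIB2/data2.wmo" urlPath="data2.wmo" />
</dataset>
```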
<xsd:element name="namer">
<xsd:complexType>
<xsd:choice maxOccurs="unbounded">
<xsd:element name="regExpOnName" type="NamerSelectorType"/>
<xsd:element name="regExpOnPath" type="NamerSelectorType"/>
</xsd:choice>
</xsd:complexType>
</xsd:element>
<xsd:complexType name="NamerSelectorType">
<xsd:attribute name="regExp" type="xsd:string"/>
<xsd:attribute name="replaceString" type="xsd:string"/>
</xsd:complexType>
The namer element specifies one or more methods for renaming resulting dataset and catalogRef elements. Currently, two methods for renaming are available. Both methods use regular expression matching and capturing group replacement to determine the new name. The first method, specified by the regExpOnName element, does regular expression matching on the dataset name. The second method, specified by the regExpOnPath element, does regular expression matching on the entire dataset path. In either method, the regExp attribute contains the regular expression used in matching on the name or path and the replaceString attribute contains the replacement string on which capturing group replacement is performed.
A capturing group is a part of a regular expression enclosed in parentheses. When a regular expression with a capturing group is applied to a string, the substring that matches the capturing group is saved for later use. The captured strings can then be substituted into another string in place of capturing group references, "$n", where "n" is an integer indicating a particular capturing group. (The capturing groups are numbered according to the order in which they appear in the match string.) For example, the regular expression "Hi (.*), how are (.*)\?" when applied to the string "Hi Fred, how are you?" would capture the strings "Fred" and "you". Following that with a capturing group replacement in the string "$2 are $1." would result in the string "you are Fred."
Here's an example namer:
<namer>
<regExpOnName regExp="([0-9]{4})([0-9]{2})([0-9]{2})_([0-9]{2})([0-9]{2})"
replaceString="NCEP GFS 191km Alaska $1-$2-$3 $4:$5:00 GMT"/>
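</namer>
The regular expression has five capturing groups: year, month, day, hour, and minute. Applied to a dataset named "GFS_Alaska_191km_20051011_0000.grib1", it would result in a dataset element like this: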
<dataset name="NCEP GFS 191km Alaska 2005-10-11 00:00:00 GMT"
urlPath="models/NCEP/GFS/Alaska_191km/GFS_Alaska_191km_20051011_0000.grib1"/>
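The second renaming method can be sketched in the same way. The fragment below is a hypothetical regExpOnPath namer; it assumes that the path being matched resembles the urlPath shown above and that, as in the regExpOnName example, the pattern only needs to match part of the string.

```xml
<namer>
  <!-- Group 1: grid spacing taken from the directory name (e.g. "191")
       Group 2: date as yyyyMMdd, Group 3: time as HHmm -->
  <regExpOnPath regExp="Alaska_([0-9]+)km/GFS_.*_([0-9]{8})_([0-9]{4})"
                replaceString="GFS Alaska $1km run $2 $3"/>
</namer>
```

Under those assumptions, the path "models/NCEP/GFS/Alaska_191km/GFS_Alaska_191km_20051011_0000.grib1" would yield the name "GFS Alaska 191km run 20051011 0000". Matching on the path is useful when part of the desired name appears only in a parent directory.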
<xsd:element name="sort">
<xsd:complexType>
<xsd:choice>
<xsd:element name="lexigraphicByName">
<xsd:complexType>
<xsd:attribute name="increasing" type="xsd:boolean"/>
</xsd:complexType>
</xsd:element>
</xsd:choice>
</xsd:complexType>
</xsd:element>
Without a sort element, datasets at each collection level are listed in their "natural" order. The sort element specifies how to order those datasets. Currently, a sort element can only contain one lexigraphicByName element, which indicates that datasets should be ordered lexicographically according to the dataset name. The increasing attribute in the lexigraphicByName element indicates whether the datasets should be in increasing or decreasing order.
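A minimal sketch of a sort element, placed inside a datasetScan, that lists datasets in decreasing name order:

```xml
<sort>
  <!-- increasing="false" requests decreasing lexicographic order by dataset name -->
  <lexigraphicByName increasing="false" />
</sort>
```

With the two files from the earlier examples, this ordering would place data2.wmo before data1.wmo. When dataset names contain timestamps, decreasing order puts the most recent datasets first.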
<xsd:element name="addLatest">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="simpleLatest" minOccurs="0">
<xsd:complexType>
<xsd:attribute name="name" type="xsd:string"/>
<xsd:attribute name="top" type="xsd:boolean"/>
<xsd:attribute name="serviceName" type="xsd:string"/>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
The addLatest element is deprecated in favor of the addProxies element.
<xsd:element name="addProxies">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="simpleLatest" minOccurs="0">
<xsd:complexType>
<xsd:attribute name="name" type="xsd:string"/>
<xsd:attribute name="top" type="xsd:boolean"/>
<xsd:attribute name="serviceName" type="xsd:string"/>
</xsd:complexType>
</xsd:element>
<xsd:element name="latestComplete" minOccurs="0">
<xsd:complexType>
<xsd:attribute name="name" type="xsd:string"/>
<xsd:attribute name="top" type="xsd:boolean"/>
<xsd:attribute name="serviceName" type="xsd:string"/>
<xsd:attribute name="lastModifiedLimit" type="xsd:float"/>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
The addProxies element provides a place for describing proxy datasets to be added to each collection dataset under a datasetScan.
Currently, two types of proxy datasets are supported. They are both intended to proxy the "latest" dataset in the scanned collection. The first type of proxy dataset, specified by the simpleLatest element, adds a dataset that proxies the existing dataset whose name is lexicographically greatest. This dataset will be the "latest" if the dataset name contains a timestamp. The simpleLatest element may contain a name attribute, which is used as the name of the proxy dataset; a serviceName attribute, which references the service element to be used by the resulting proxy dataset; and a top attribute, which indicates whether the proxy dataset should appear at the top or bottom of the list of datasets in this collection. The default behavior in the TDS if any of these attributes are missing is to name the dataset "latest.xml", reference the "latest" service, and place the dataset at the top of the collection.
The second type of proxy dataset, specified by the latestComplete element, is the same as the simpleLatest except that it will exclude any dataset that was last modified within the number of minutes specified by the lastModifiedLimit attribute. It may contain all the attributes allowed in the simpleLatest element plus the lastModifiedLimit attribute.
An example is available in the TDS datasetScan documentation.
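As a rough illustration, an addProxies element requesting both proxy types might look like the fragment below. The attribute values are hypothetical, and a service named "latest" is assumed to be defined elsewhere in the catalog.

```xml
<addProxies>
  <!-- No attributes given: the TDS defaults described above apply -->
  <simpleLatest />
  <!-- Skip datasets modified within the last 60 minutes, on the assumption
       that they may still be in the process of being written -->
  <latestComplete name="latestComplete.xml" top="true"
                  serviceName="latest" lastModifiedLimit="60" />
</addProxies>
```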
<xsd:element name="addTimeCoverage">
<xsd:complexType>
<xsd:attribute name="datasetNameMatchPattern" type="xsd:string"/>
<xsd:attribute name="startTimeSubstitutionPattern" type="xsd:string"/>
<xsd:attribute name="duration" type="xsd:string"/>
</xsd:complexType>
</xsd:element>
The addTimeCoverage element indicates that a THREDDS timeCoverage element should be added to each atomic dataset cataloged by the containing datasetScan element and describes how to determine the time coverage for each dataset in the collection.
Currently, the addTimeCoverage element can only describe one method for determining the time coverage of a dataset. The datasetNameMatchPattern attribute is used in a regular expression match on the dataset name. If the match succeeds, a capturing group replacement is performed on the startTimeSubstitutionPattern attribute and the result is the start time string (see the namer element description, above, for more on regular expressions and capturing groups). The time coverage duration is given by the duration attribute.
Example:
<datasetScan name="My Data" path="myData" location="c:/my/data/">
<serviceName>myserver</serviceName>
<addTimeCoverage datasetNameMatchPattern="([0-9]{4})([0-9]{2})([0-9]{2})([0-9]{2})_gfs_211.nc$"
startTimeSubstitutionPattern="$1-$2-$3T$4:00:00"
duration="60 hours" />
</datasetScan>
for the dataset named "2005071812_gfs_211.nc", results in the following timeCoverage element:
<timeCoverage>
<start>2005-07-18T12:00:00</start>
<duration>60 hours</duration>
</timeCoverage>
<xsd:element name="addDatasetSize" />
The addDatasetSize element indicates that file size metadata in the form of a dataSize element should be added to all atomic datasets.
An example is available in the TDS datasetScan documentation.
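As a brief sketch, the element is simply placed inside a datasetScan; it has no attributes or content. The datasetScan values below are reused from the earlier examples.

```xml
<datasetScan name="GRIB2 Data" path="grib2" location="C:/data/grib2/">
  <dataFormat>GRIB-2</dataFormat>
  <!-- Request a dataSize element on every atomic (file) dataset -->
  <addDatasetSize />
</datasetScan>
```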
<xsd:element name="datasetRoot">
<xsd:complexType>
<xsd:attribute name="path" type="xsd:string" use="required"/>
<xsd:attribute name="location" type="xsd:string" use="required"/>
</xsd:complexType>
</xsd:element>
The datasetRoot element, similar to the datasetScan element, maps request URLs to dataset collection locations. The difference is that a datasetRoot does not perform any scans or generate any catalogs. It simply allows users to specify individual datasets from the datasetRoot location.
Example:
<datasetRoot path="dsR1" location="C:/data/mydata/" />
...
<dataset name="dataset 1" urlPath="data1.nc" />