Comments to Ethan Davis or THREDDS mail list
A CatalogGen configuration document is an XML document that describes how to produce a THREDDS catalog by scanning or crawling one or more dataset collections. Each CatalogGen configuration document is a skeleton THREDDS catalog containing one or more metadata elements of type "CatalogGenConfig". Each "CatalogGenConfig" metadata element will be replaced by dataset elements representing the datasets that make up the collection described by that metadata element.
<!ELEMENT catalogGenConfig ( datasetSource )>
<!ATTLIST catalogGenConfig
type (%CatalogGenConfigType;) #REQUIRED
>
<!ENTITY % CatalogGenConfigType "Catalog | Aggregation">
The catalogGenConfig element is the top-level element in each "CatalogGenConfig" metadata element. The only value currently supported for the type attribute is "Catalog", so the type attribute must be set to "Catalog". Each catalogGenConfig element must contain exactly one datasetSource element. For example:
<catGen:catalogGenConfig type="Catalog">
<catGen:datasetSource name="model data source" type="Local"
structure="Flat"
accessPoint="./test/thredds/cataloggen/testData/model">
...
</catGen:datasetSource>
</catGen:catalogGenConfig>
NOTE: A second value of "Aggregation" is defined for the type attribute but is not currently supported. This is a placeholder for when/if the Catalog Generator is expanded to produce configuration files for the DODS Aggregation Server.
<!ELEMENT datasetSource ( resultService, datasetFilter*, datasetNamer*)>
<!ATTLIST datasetSource
name CDATA #REQUIRED
type (%DatasetSourceType;) #REQUIRED
structure (%DatasetSourceStructure;) #REQUIRED
accessPoint CDATA #REQUIRED
>
<!ENTITY % DatasetSourceType "Local | DodsDir | DodsFileServer | GrADSDataServer">
<!ENTITY % DatasetSourceStructure "Flat | DirTree">
The datasetSource element describes the source of a dataset collection, how to crawl the collection, and how to create a THREDDS catalog for the collection's datasets. The name of the dataset source is given by the name attribute. The type attribute describes the kind of dataset source. The supported values are "Local", for a data collection on local disk, and "DodsDir", for a data collection on a remote OPeNDAP/DODS server. The value of the structure attribute indicates whether any hierarchical directory structure of the dataset source should be duplicated in the resulting catalog ("DirTree") or flattened ("Flat"). The value of the accessPoint attribute is the directory path or URL of the location of the desired datasets. Each datasetSource element must contain exactly one resultService element, which may be followed by zero or more datasetFilter elements and then zero or more datasetNamer elements.
NOTE: The two values "DodsFileServer" and "GrADSDataServer" are defined as types but are not currently supported by the catalog generation software.
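As a rough illustration of the other supported combination, a "DodsDir" source that keeps its directory hierarchy might look like the sketch below. The host name, paths, and element names are hypothetical, and using the server's base URL as the accessPointHeader value is an assumption about how remote sources are configured, not something stated above.

```xml
<!-- Sketch only: example.org, the paths, and all names are hypothetical. -->
<catGen:datasetSource name="remote model data source" type="DodsDir"
                      structure="DirTree"
                      accessPoint="http://example.org/cgi-bin/nph-dods/model/">
  <!-- Assumption: accessPointHeader is the URL prefix to strip from each dataset path. -->
  <catGen:resultService name="remoteService" serviceType="DODS"
                        base="http://example.org/cgi-bin/nph-dods/"
                        accessPointHeader="http://example.org/cgi-bin/nph-dods/" />
</catGen:datasetSource>
```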
<!ELEMENT resultService EMPTY>
<!ATTLIST resultService
name CDATA #REQUIRED
serviceType (%ServiceType;) #REQUIRED
base CDATA #REQUIRED
suffix CDATA #IMPLIED
accessPointHeader CDATA #REQUIRED
>
A resultService element provides the details of the service that serves the datasets from the dataset source. All the dataset elements in the resulting catalog that were added from the dataset source will reference the service described by this resultService element. The name, serviceType, base, and suffix attributes are the attributes of the THREDDS catalog service element (see the THREDDS Inventory Catalog specification). All these attributes are required except for the suffix attribute. The value of the accessPointHeader attribute is used to remove the local part of a dataset's path that is not seen by the service. For example, say you have a DODS server serving the data file "/home/htdocs/data/myFile.nc" and "/home/htdocs" is your web server's DocRoot. You could write:
<catGen:datasetSource name="my data source" type="Local" structure="Flat"
accessPoint="/home/htdocs/data">
<catGen:resultService name="myService" serviceType="DODS"
base="http://localhost/cgi-bin/nph-dods/"
accessPointHeader="/home/htdocs" />
</catGen:datasetSource>
The data file would be found at "/home/htdocs/data/myFile.nc", and the accessPointHeader value would be removed from the start of the path, resulting in the following dataset element:
<dataset name="" serviceName="myService" urlPath="data/myFile.nc" />
<!ELEMENT datasetFilter EMPTY>
<!ATTLIST datasetFilter
name CDATA #REQUIRED
type (%DatasetFilterType;) #REQUIRED
matchPattern CDATA #IMPLIED
matchPatternTarget CDATA #IMPLIED
applyToCollectionDataset (%TrueFalse;) "false"
applyToAtomicDataset (%TrueFalse;) "true"
invertMatchMeaning (%TrueFalse;) "false"
>
<!ENTITY % DatasetFilterType "RegExp">
A datasetFilter element specifies a scheme for filtering datasets. The datasetFilter elements are applied to each resource to determine whether it will be added to the dataset collection. If none of the datasetFilter elements accepts a given resource, it is not added to the collection. This applies to collection (directory) level resources as well. For instance, if no filters apply to collection datasets, the crawl of the datasetSource will not go beyond the top level.
The name attribute gives the name of the filter. The value of the type attribute must be "RegExp", which indicates that a regular expression is matched against the resource. The match pattern is given by the value of the matchPattern attribute, and the target of the match is given by the matchPatternTarget attribute. (Currently, the target indicates which attribute of the dataset element the match is run against: either "name" or "urlPath". In the future, it will also be able to indicate a part of the accessible dataset, e.g., an attribute in a netCDF file.) Whether a filter is applied to atomic datasets, collection datasets, or both is determined by the applyToAtomicDataset and applyToCollectionDataset attributes. The default is to apply only to atomic (leaf-node) datasets.
The invertMatchMeaning attribute reverses the meaning of a filter. Normally, a dataset that matches a filter is accepted as part of the datasetSource collection. However, when the invertMatchMeaning attribute is set to "true", a dataset that matches the filter is not accepted. This attribute should be used with some care; unless the match pattern is well designed, setting this attribute to "true" can filter out a large number of datasets.
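A pair of filters along the following lines might be used to keep only netCDF files while still letting the crawl descend into subdirectories. This is a sketch: the filter names and patterns are hypothetical, and it assumes that the regular expression may match anywhere in the target string rather than needing to match all of it.

```xml
<!-- Sketch only: names and patterns are hypothetical. -->
<!-- Uses the defaults, so it applies to atomic datasets only:
     accept those whose urlPath ends in ".nc". -->
<catGen:datasetFilter name="netCDF files" type="RegExp"
                      matchPattern="\.nc$"
                      matchPatternTarget="urlPath" />
<!-- Applies to collection datasets only: accept every directory
     so that crawling continues below the top level. -->
<catGen:datasetFilter name="all directories" type="RegExp"
                      matchPattern=".*"
                      matchPatternTarget="name"
                      applyToCollectionDataset="true"
                      applyToAtomicDataset="false" />
```

Both elements would be placed inside a datasetSource element, after its resultService element.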
<!ELEMENT datasetNamer EMPTY>
<!ATTLIST datasetNamer
name CDATA #REQUIRED
addLevel (%TrueFalse;) #REQUIRED
type (%DatasetNamerType;) #REQUIRED
matchPattern CDATA #IMPLIED
substitutePattern CDATA #IMPLIED
attribContainer CDATA #IMPLIED
attribName CDATA #IMPLIED
>
<!ENTITY % DatasetNamerType "RegExp | DodsAttrib">
<!ENTITY % TrueFalse "true | false">
A datasetNamer element specifies a scheme for naming datasets. The datasetNamer elements, in document order, are applied to each dataset until one can be used to name the dataset. If none of the datasetNamer
elements can name a dataset, that dataset is removed from the dataset
collection. (NOTE: This means that the dataset namers are also dataset
filters.)
The name attribute provides the name of the datasetNamer element. When the addLevel attribute is "true", all dataset elements named by the datasetNamer are enclosed in a containing dataset element. The name of the containing dataset element is the name of the datasetNamer element. When the addLevel attribute is set to "false", the dataset elements are added directly to the parent dataset without a new containing dataset element. The value of the type attribute can be either "RegExp" or "DodsAttrib". A "RegExp" type means that a regular expression (the value of the matchPattern attribute) is used to determine if the datasetNamer will be used to name a given dataset. If the regular expression matches the urlPath of the dataset, values found in the match are substituted in the substitution pattern string (the value of the substitutePattern
attribute) and the resulting string is used to name the dataset. A type
of "DodsAttrib" means that the dataset to be named is checked for a
variable (or OPeNDAP/DODS attribute container) with the name given in
the attribContainer attribute and then that variable is checked for a variable attribute with the name given by the attribName attribute. If the variable attribute exists, its value is used to name the resulting dataset element.
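The two namer types could be combined roughly as in the sketch below. The namer names, the file-name pattern, and the container and attribute names ("NC_GLOBAL", "title") are hypothetical, and the "$1"-style references to matched groups in substitutePattern are an assumption about the substitution syntax.

```xml
<!-- Sketch only: names, patterns, and attribute names are hypothetical. -->
<!-- Tried first: name datasets from the date in their urlPath and group
     them under a containing dataset called "Model runs by date". -->
<catGen:datasetNamer name="Model runs by date" type="RegExp" addLevel="true"
                     matchPattern="([0-9]{4})([0-9]{2})([0-9]{2})_model\.nc$"
                     substitutePattern="Model run $1-$2-$3" />
<!-- Tried next for datasets the first namer could not name:
     use the value of a global attribute as the dataset name. -->
<catGen:datasetNamer name="Named by title attribute" type="DodsAttrib" addLevel="false"
                     attribContainer="NC_GLOBAL" attribName="title" />
```

Because namers are applied in document order, the more specific namer is listed first. Any dataset that neither namer can name would be dropped from the collection.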