Dataset Inventory Catalog Primer

Here's an example of a very simple catalog:
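<?xml version="1.0" ?>
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" >
  <service name="aggServer" serviceType="DODS" base="http://acd.ucar.edu/dodsC/" />
  <dataset name="SAGE III Ozone Loss" serviceName="aggServer" urlPath="sage.nc"/>
</catalog>
with this line-by-line explanation:
1. The first line declares that this is an XML document.
2. The catalog element is the root element. It declares the THREDDS catalog namespace with the xmlns attribute.
3. The service element declares a service named aggServer, of type DODS, with the base URL http://acd.ucar.edu/dodsC/.
4. The dataset element declares a dataset named SAGE III Ozone Loss that is accessed through the aggServer service at the URL path sage.nc.
5. The last line closes the catalog element.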
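To make the relationship between a service and a dataset concrete, here is a minimal Python sketch of how a client might turn the catalog above into an access URL. It assumes the URL is formed by appending the dataset's urlPath to the base of the service it names. The function name access_urls is hypothetical and not part of any THREDDS library.

```python
import xml.etree.ElementTree as ET

NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"

CATALOG = """<?xml version="1.0" ?>
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <service name="aggServer" serviceType="DODS" base="http://acd.ucar.edu/dodsC/" />
  <dataset name="SAGE III Ozone Loss" serviceName="aggServer" urlPath="sage.nc"/>
</catalog>"""


def access_urls(catalog_xml):
    """Map dataset name -> access URL (assumed to be service base + urlPath)."""
    root = ET.fromstring(catalog_xml)
    # Index every declared service by name so datasets can look up its base URL.
    bases = {s.get("name"): s.get("base") for s in root.iter(f"{{{NS}}}service")}
    urls = {}
    for ds in root.iter(f"{{{NS}}}dataset"):
        service, path = ds.get("serviceName"), ds.get("urlPath")
        # Only datasets that name a known service and have a urlPath are resolvable.
        if service in bases and path is not None:
            urls[ds.get("name")] = bases[service] + path
    return urls


print(access_urls(CATALOG))
```

This sketch reads attributes only, so it will not see a service name supplied through an inherited metadata element.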
Usually you have many datasets to declare in each catalog, which you do using nested datasets:
<?xml version="1.0" ?>
<catalog name="Example" xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" >
  <service name="aggServer" base="http://acd.ucar.edu/dodsC/" serviceType="DODS" />
  <dataset name="SAGE III Ozone Loss Experiment" >
    <dataset name="January Averages" serviceName="aggServer" urlPath="sage/avg/jan.nc"/>
    <dataset name="February Averages" serviceName="aggServer" urlPath="sage/avg/feb.nc"/>
    <dataset name="March Averages" serviceName="aggServer" urlPath="sage/avg/mar.nc"/>
  </dataset>
</catalog>
You can nest datasets as deeply as you want, for example:
<?xml version="1.0" ?>
<catalog name="Example" xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" >
  <service name="aggServer" base="http://acd.ucar.edu/dodsC/" serviceType="DODS" />
  <dataset name="SAGE III Ozone Loss Experiment" >
    <dataset name="Monthly Averages" >
      <dataset name="January Averages" serviceName="aggServer" urlPath="sage/avg/jan.nc"/>
      <dataset name="February Averages" serviceName="aggServer" urlPath="sage/avg/feb.nc"/>
      <dataset name="March Averages" serviceName="aggServer" urlPath="sage/avg/mar.nc"/>
    </dataset>
    <dataset name="Daily Flight Data" >
      <dataset name="January">
        <dataset name="Jan 1, 2001" serviceName="aggServer" urlPath="sage/daily/20010101.nc"/>
        <dataset name="Jan 2, 2001" serviceName="aggServer" urlPath="sage/daily/20010201.nc"/>
      </dataset>
    </dataset>
  </dataset>
</catalog>
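Nested datasets form a tree, so a client can walk them recursively. The following Python sketch shows one way to do that on a shortened version of the catalog above. The outline function is a hypothetical helper, not a THREDDS API. Collection datasets have no urlPath, so it is reported as None for them.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0}"

CATALOG = """<catalog name="Example" xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <service name="aggServer" base="http://acd.ucar.edu/dodsC/" serviceType="DODS" />
  <dataset name="SAGE III Ozone Loss Experiment">
    <dataset name="Monthly Averages">
      <dataset name="January Averages" serviceName="aggServer" urlPath="sage/avg/jan.nc"/>
    </dataset>
    <dataset name="Daily Flight Data">
      <dataset name="January">
        <dataset name="Jan 1, 2001" serviceName="aggServer" urlPath="sage/daily/20010101.nc"/>
      </dataset>
    </dataset>
  </dataset>
</catalog>"""


def outline(element, depth=0):
    """Yield (depth, name, urlPath) for every dataset below element, depth first."""
    for ds in element.findall(NS + "dataset"):  # direct children only
        yield depth, ds.get("name"), ds.get("urlPath")
        yield from outline(ds, depth + 1)


# Print the hierarchy as an indented outline.
for depth, name, url_path in outline(ET.fromstring(CATALOG)):
    print("  " * depth + name + (f" -> {url_path}" if url_path else ""))
```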
A lot of other information can optionally be added to help applications and digital libraries know how to "do the right thing" with a dataset. The collectionType attribute is used on collection datasets. The dataType is a simple classification (e.g. Image, Grid, Point data). The dataFormatType describes the format the data is stored in (e.g. NetCDF, HDF5), which matters when the data is served by a file transfer protocol like FTP. The combination of the naming authority and the ID attribute should form a globally unique identifier for a dataset.
<dataset name="SAGE III Ozone Loss Experiment" collectionType="TimeSeries">
  <dataset name="January Averages" serviceName="aggServer" urlPath="sage/avg/jan.nc" authority="unidata.ucar.edu" ID="sage-20938483">
    <dataType>Trajectory</dataType>
    <dataFormatType>NetCDF</dataFormatType>
  </dataset>
</dataset>
The harvest attribute indicates that the dataset is at the right level of granularity to be exported to search systems like Digital Libraries. Elements such as summary, rights, and publisher are needed to create valid entries for these services. For more details, see Exporting THREDDS Datasets to Digital Libraries. The Catalog Specification is the complete reference.
<dataset name="SAGE III Ozone Loss Experiment" harvest="true">
  <contributor role="data manager">John Smith</contributor>
  <keyword>Atmospheric Chemistry</keyword>
  <publisher>
    <name vocabulary="DIF">Community Data Portal, National Center for Atmospheric Research, University Corporation for Atmospheric Research</name>
    <contact url="http://dataportal.ucar.edu" email="cdp@ucar.edu"/>
  </publisher>
</dataset>
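Rather than declare the same information on each dataset, you can use the metadata element to factor out common information. Metadata marked inherit="true" applies to all nested datasets, and a nested dataset can still override an inherited value, as the Global Averages dataset does with its dataType: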
<dataset name="SAGE III Ozone Loss Experiment" >
  <metadata inherit="true">
    <serviceName>aggServer</serviceName>
    <dataType>Trajectory</dataType>
    <dataFormatType>NetCDF</dataFormatType>
    <authority>unidata.ucar.edu</authority>
  </metadata>
  <dataset name="January Averages" urlPath="sage/avg/jan.nc" ID="sage-23487382"/>
  <dataset name="February Averages" urlPath="sage/avg/feb.nc" ID="sage-63656446"/>
  <dataset name="Global Averages" urlPath="sage/global.nc" ID="sage-7869700g" dataType="Grid"/>
</dataset>
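As a rough illustration of how inherited metadata could be resolved, the Python sketch below computes the effective properties of each dataset in the example above. It is a simplification under stated assumptions: only the four fields listed in INHERITABLE are handled (an illustrative subset), values in a metadata element with inherit="true" are passed down to all descendants, and an attribute set directly on a dataset overrides the inherited value for that dataset only. The function name effective_properties is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0}"
# Illustrative subset of fields; a real catalog library handles many more.
INHERITABLE = ("serviceName", "dataType", "dataFormatType", "authority")

CATALOG = """<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <dataset name="SAGE III Ozone Loss Experiment">
    <metadata inherit="true">
      <serviceName>aggServer</serviceName>
      <dataType>Trajectory</dataType>
      <dataFormatType>NetCDF</dataFormatType>
      <authority>unidata.ucar.edu</authority>
    </metadata>
    <dataset name="January Averages" urlPath="sage/avg/jan.nc" ID="sage-23487382"/>
    <dataset name="Global Averages" urlPath="sage/global.nc" ID="sage-7869700g" dataType="Grid"/>
  </dataset>
</catalog>"""


def effective_properties(element, inherited=None):
    """Return {dataset name: properties} with inherited metadata applied."""
    inherited = inherited or {}
    result = {}
    for ds in element.findall(NS + "dataset"):
        # Start from what the ancestors passed down, then add this
        # dataset's own inheritable metadata.
        passed_down = dict(inherited)
        for md in ds.findall(NS + "metadata"):
            if md.get("inherit") == "true":
                for key in INHERITABLE:
                    value = md.findtext(NS + key)
                    if value is not None:
                        passed_down[key] = value
        # Attributes on the dataset itself win, but are not passed to children.
        own = dict(passed_down)
        own.update({k: ds.get(k) for k in INHERITABLE if ds.get(k) is not None})
        result[ds.get("name")] = own
        result.update(effective_properties(ds, passed_down))
    return result


props = effective_properties(ET.fromstring(CATALOG))
print(props["January Averages"]["dataType"])  # inherited value
print(props["Global Averages"]["dataType"])   # overridden by the attribute
```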
If you use elements from other namespaces, you must declare those namespaces in the catalog element. Currently THREDDS libraries recognize two other namespaces, Dublin Core and XLink, which are declared like this:
<catalog name="MyName"
xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:xlink="http://www.w3.org/1999/xlink" >
It's not obvious, but namespaces are not web addresses; they are just strings that must be copied exactly as you see them here.
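A small Python sketch can show why the exact string matters. Python's standard ElementTree parser never fetches the namespace URL; it simply prefixes each tag with the namespace string, so a one-character difference produces a different element name. The helper is_thredds_catalog is hypothetical and only meant to illustrate the point.

```python
import xml.etree.ElementTree as ET

THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"


def is_thredds_catalog(xml_text):
    """True if the root element is 'catalog' in exactly the THREDDS namespace."""
    root = ET.fromstring(xml_text)
    # ElementTree stores qualified names as "{namespace}localname".
    return root.tag == "{" + THREDDS_NS + "}catalog"


good = f'<catalog name="MyName" xmlns="{THREDDS_NS}"/>'
# Same declaration with a stray trailing slash: a different string,
# and therefore a different namespace.
bad = f'<catalog name="MyName" xmlns="{THREDDS_NS}/"/>'

print(is_thredds_catalog(good), is_thredds_catalog(bad))
```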
As catalogs get more complicated, you should check that you haven't made any errors. There are three parts to checking:
You can use any THREDDS validation service, such as this one, to check all three of these.
You can check well-formedness using an XML tool like XMLSpy; in order to check validity in those tools you will need to declare the catalog schema location like this:
<catalog name="MyName"
  xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0 http://www.unidata.ucar.edu/schemas/thredds/InvCatalog.1.0.xsd">
  ...
</catalog>
The THREDDS validation service, as well as the catalog library, knows where the schemas are located, so you only need to add the xmlns:xsi and xsi:schemaLocation attributes if you want to do your own validation.
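If you don't have an XML tool at hand, well-formedness alone can be checked with a few lines of Python. This is only a sketch of the first kind of check: the standard library parser used here does not validate against the catalog schema, so a catalog that passes can still be invalid. The function name well_formedness_error is hypothetical.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return None if xml_text is well-formed XML, else the parser's message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


ok = '<catalog><dataset name="a"/></catalog>'
# The closing tag does not match the opening tag, so this is not well-formed.
broken = '<publisher><name vocabulary="DIF">Community Data Portal</long_name></publisher>'

print(well_formedness_error(ok))
print(well_formedness_error(broken))
```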
It can be useful to break up large catalogs into pieces so that each piece can be maintained separately. One way to do this is to build each piece as a separate and logically complete catalog, then create a master catalog using catalog references:
<catalog name="master" xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" xmlns:xlink="http://www.w3.org/1999/xlink" >
  <dataset name="List of THREDDS catalogs">
    <catalogRef xlink:title="IRI/LDEO Climate Data Library" xlink:href="http://iridl.ldeo.columbia.edu/SOURCES/thredds.xml"/>
    <catalogRef xlink:title="NCAR Data Portal" xlink:href="http://dataportal.ucar.edu/metadata/ucar.thredds"/>
    <catalogRef xlink:title="NOAA-CIRES Climate Diagnostics Center" xlink:href="http://www.cdc.noaa.gov/THREDDS/catalog.xml"/>
    <catalogRef xlink:title="Unidata THREDDS-IDD Server" xlink:href="http://motherlode.ucar.edu:8080/thredds/catalog.xml"/>
    <catalogRef xlink:title="University of Alabama Huntsville POND server" xlink:href="http://pond.itsc.uah.edu/catalog/thredds/pond_cat.xml"/>
  </dataset>
</catalog>
In this example we have several catalogRef elements, each with a link to an external catalog, using the xlink:href attribute. The catalogRef should be thought of as a dataset, whose contents are the contents of the external catalog. The xlink:title is used as the name of the dataset. Notice that we must declare the xlink namespace in the catalog element.
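The Python sketch below lists the catalog references in a shortened master catalog. It shows that the title and href are attributes in the XLink namespace, so they have to be looked up with that namespace rather than as plain attribute names. The catalog_refs function is a hypothetical helper; it only reads the links and does not fetch the external catalogs.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0}"
XLINK = "{http://www.w3.org/1999/xlink}"

MASTER = """<catalog name="master"
    xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <dataset name="List of THREDDS catalogs">
    <catalogRef xlink:title="NCAR Data Portal"
                xlink:href="http://dataportal.ucar.edu/metadata/ucar.thredds"/>
    <catalogRef xlink:title="Unidata THREDDS-IDD Server"
                xlink:href="http://motherlode.ucar.edu:8080/thredds/catalog.xml"/>
  </dataset>
</catalog>"""


def catalog_refs(xml_text):
    """Return (title, href) for every catalogRef, in document order."""
    root = ET.fromstring(xml_text)
    return [
        (ref.get(XLINK + "title"), ref.get(XLINK + "href"))
        for ref in root.iter(NS + "catalogRef")
    ]


for title, href in catalog_refs(MASTER):
    print(f"{title}: {href}")
```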
CVS date: $Date: 2003/12/24 00:00:04 $