THREDDS Catalog Primer

THREDDS servers in general, and the TDS in particular, communicate with clients by sending them a THREDDS Catalog (aka Inventory Dataset Catalog) that describes what datasets the server has and how they can be accessed. A catalog is an XML document that follows the THREDDS Catalog schema.
This primer will describe the client view of the catalog. If you are maintaining a TDS server, you will also need to add other information to the catalog, which is used only by the server and not normally seen by the client.
Here's an example of a simple catalog:
1) <?xml version="1.0" ?>
2) <catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" >
3)   <service name="dodsServer" serviceType="OpenDAP" base="/thredds/dodsC/" />
4)   <dataset name="SAGE III Ozone Loss for Oct 31 2006" serviceName="dodsServer" urlPath="sage/110312006.nc"/>
5) </catalog>
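with this line-by-line explanation:
1) The XML declaration, which starts every XML document.
2) The root catalog element, which declares the THREDDS catalog namespace as the default namespace.
3) A service named dodsServer, of type OpenDAP, whose base URL is /thredds/dodsC/.
4) A dataset that refers to the dodsServer service by name. Its access URL is the service base followed by the urlPath: /thredds/dodsC/sage/110312006.nc.
5) The closing tag of the catalog element.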
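As a rough illustration of how a client might turn the service and dataset declarations into an access URL, the following Python sketch joins the service base with each dataset's urlPath and resolves the result against the catalog's own URL. The host example.org and the function name access_urls are assumptions made for this example; they are not part of any THREDDS library.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# The THREDDS catalog namespace, copied exactly from the catalog element.
NS = {"t": "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"}

CATALOG_XML = """<?xml version="1.0" ?>
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" >
  <service name="dodsServer" serviceType="OpenDAP" base="/thredds/dodsC/" />
  <dataset name="SAGE III Ozone Loss for Oct 31 2006" serviceName="dodsServer" urlPath="sage/110312006.nc"/>
</catalog>"""


def access_urls(catalog_xml, catalog_url):
    """Return (dataset name, access URL) pairs for datasets that name a service.

    Hypothetical helper: it only handles serviceName/urlPath given as
    attributes directly on the dataset element.
    """
    root = ET.fromstring(catalog_xml)
    # Map each service name to its base URL.
    bases = {s.get("name"): s.get("base") for s in root.findall("t:service", NS)}
    urls = []
    for ds in root.iter("{%s}dataset" % NS["t"]):
        service, url_path = ds.get("serviceName"), ds.get("urlPath")
        if service in bases and url_path:
            # A relative base is resolved against the URL the catalog came from.
            urls.append((ds.get("name"), urljoin(catalog_url, bases[service] + url_path)))
    return urls


# example.org is a placeholder host, not a real TDS.
for name, url in access_urls(CATALOG_XML, "http://example.org/thredds/catalog.xml"):
    print(name, "->", url)
```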
When you have many datasets to declare in each catalog, use nested datasets:
<?xml version="1.0" ?>
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" >
  <service name="dodsServer" serviceType="OpenDAP" base="/thredds/dodsC/" />
  <dataset name="SAGE III Ozone Loss Experiment" >
    <dataset name="January Averages" serviceName="dodsServer" urlPath="sage/avg/jan.nc"/>
    <dataset name="February Averages" serviceName="dodsServer" urlPath="sage/avg/feb.nc"/>
    <dataset name="March Averages" serviceName="dodsServer" urlPath="sage/avg/mar.nc"/>
  </dataset>
</catalog>
You can nest datasets to any depth, for example:
<?xml version="1.0" ?>
<catalog name="Example" xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" >
  <service name="dodsServer" serviceType="OpenDAP" base="/thredds/dodsC/" />
  <dataset name="SAGE III Ozone Loss Experiment" >
    <dataset name="Monthly Averages" >
      <dataset name="January Averages" serviceName="dodsServer" urlPath="sage/avg/jan.nc"/>
      <dataset name="February Averages" serviceName="dodsServer" urlPath="sage/avg/feb.nc"/>
      <dataset name="March Averages" serviceName="dodsServer" urlPath="sage/avg/mar.nc"/>
    </dataset>
    <dataset name="Daily Flight Data" >
      <dataset name="January">
        <dataset name="Jan 1, 2001" serviceName="dodsServer" urlPath="sage/daily/20010101.nc"/>
        <dataset name="Jan 2, 2001" serviceName="dodsServer" urlPath="sage/daily/20010102.nc"/>
      </dataset>
    </dataset>
  </dataset>
</catalog>
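To show what the nesting looks like from a client's point of view, here is a minimal Python sketch that walks the dataset hierarchy recursively and prints it as an indented outline. The catalog text is a shortened version of the example above, and the function name outline is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
DATASET = "{%s}dataset" % THREDDS_NS

# Shortened copy of the nested example catalog.
CATALOG_XML = """<catalog name="Example" xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <service name="dodsServer" serviceType="OpenDAP" base="/thredds/dodsC/" />
  <dataset name="SAGE III Ozone Loss Experiment">
    <dataset name="Monthly Averages">
      <dataset name="January Averages" serviceName="dodsServer" urlPath="sage/avg/jan.nc"/>
    </dataset>
    <dataset name="Daily Flight Data">
      <dataset name="January">
        <dataset name="Jan 1, 2001" serviceName="dodsServer" urlPath="sage/daily/20010101.nc"/>
      </dataset>
    </dataset>
  </dataset>
</catalog>"""


def outline(element, depth=0):
    """Return one indented line per dataset, visiting child datasets recursively."""
    lines = []
    for ds in element.findall(DATASET):  # direct children only
        url_path = ds.get("urlPath")
        label = ds.get("name") + (" -> " + url_path if url_path else "")
        lines.append("  " * depth + label)
        lines.extend(outline(ds, depth + 1))
    return lines


print("\n".join(outline(ET.fromstring(CATALOG_XML))))
```

Datasets without a urlPath act purely as containers here; only the leaves carry something a client can open.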
A lot of other information can optionally be added to help applications and digital libraries know how to "do the right thing" with a dataset. The collectionType attribute is used on collection datasets. The dataType is a simple classification (e.g. Image, Grid, Point data). The dataFormatType describes the format the data is stored in (e.g. NetCDF, HDF5), which is what a client receives when the file is retrieved by a file transfer protocol like FTP. The combination of the naming authority and the ID attribute should form a globally unique identifier for a dataset. In the TDS, it is especially important to add ID attributes to your datasets.
<dataset name="SAGE III Ozone Loss Experiment" collectionType="TimeSeries">
  <dataset name="January Averages" serviceName="aggServer" urlPath="sage/avg/jan.nc" authority="unidata.ucar.edu" ID="sage-20938483">
    <dataType>Trajectory</dataType>
    <dataFormatType>NetCDF</dataFormatType>
  </dataset>
</dataset>
The harvest attribute indicates that the dataset is at the right level of granularity to be exported to search systems like Digital Libraries. Elements such as summary, rights, and publisher are needed in order to create valid entries for these services. For more details, see Exporting THREDDS Datasets to Digital Libraries. For a complete reference, see the Catalog Specification.
<dataset name="SAGE III Ozone Loss Experiment" harvest="true">
  <contributor role="data manager">John Smith</contributor>
  <keyword>Atmospheric Chemistry</keyword>
  <publisher>
    <name vocabulary="DIF">Community Data Portal, National Center for Atmospheric Research, University Corporation for Atmospheric Research</name>
    <contact url="http://dataportal.ucar.edu" email="cdp@ucar.edu"/>
  </publisher>
</dataset>
Rather than declaring the same information on each dataset, you can use the metadata element to factor out common information:
<dataset name="SAGE III Ozone Loss Experiment" >
  <metadata inherit="true">
    <serviceName>dodsServer</serviceName>
    <dataType>Trajectory</dataType>
    <dataFormatType>NetCDF</dataFormatType>
    <authority>unidata.ucar.edu</authority>
  </metadata>
  <dataset name="January Averages" urlPath="sage/avg/jan.nc" ID="sage-23487382"/>
  <dataset name="February Averages" urlPath="sage/avg/feb.nc" ID="sage-63656446"/>
  <dataset name="Global Averages" urlPath="sage/global.nc" ID="sage-7869700g" dataType="Grid"/>
</dataset>
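The effect of inherited metadata can be sketched in Python as follows. This is a simplified model written for illustration, not the algorithm used by the THREDDS libraries: it assumes that fields in a metadata element with inherit="true" are passed down to all nested datasets, and that a value given directly on a dataset (such as dataType="Grid") overrides the inherited one. The function name resolve and the FIELDS tuple are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET

NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
FIELDS = ("serviceName", "dataType", "dataFormatType", "authority")

CATALOG_XML = """<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <dataset name="SAGE III Ozone Loss Experiment">
    <metadata inherit="true">
      <serviceName>dodsServer</serviceName>
      <dataType>Trajectory</dataType>
      <dataFormatType>NetCDF</dataFormatType>
      <authority>unidata.ucar.edu</authority>
    </metadata>
    <dataset name="January Averages" urlPath="sage/avg/jan.nc" ID="sage-23487382"/>
    <dataset name="Global Averages" urlPath="sage/global.nc" ID="sage-7869700g" dataType="Grid"/>
  </dataset>
</catalog>"""


def q(tag):
    """Qualify a tag name with the THREDDS namespace."""
    return "{%s}%s" % (NS, tag)


def resolve(dataset, inherited=None):
    """Return {dataset name: effective fields} for every dataset with a urlPath."""
    passed_down = dict(inherited or {})  # what nested datasets will see
    own = dict(passed_down)              # what this dataset itself sees
    for md in dataset.findall(q("metadata")):
        fields = {f: md.findtext(q(f)) for f in FIELDS if md.find(q(f)) is not None}
        own.update(fields)
        if md.get("inherit") == "true":
            passed_down.update(fields)
    # Values written as attributes on the dataset override metadata values.
    own.update({f: dataset.get(f) for f in FIELDS if dataset.get(f) is not None})
    results = {}
    if dataset.get("urlPath"):
        results[dataset.get("name")] = own
    for child in dataset.findall(q("dataset")):
        results.update(resolve(child, passed_down))
    return results


top = ET.fromstring(CATALOG_XML).find(q("dataset"))
for name, fields in resolve(top).items():
    print(name, fields["serviceName"], fields["dataType"])
```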
If you use elements from other namespaces, you must declare those namespaces in the catalog element. Currently the THREDDS libraries recognize two other namespaces, Dublin Core and XLink, which are declared like this:
<catalog name="MyName"
xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:xlink="http://www.w3.org/1999/xlink" >
It's not obvious, but namespaces are not web addresses. They are just strings that need to be copied exactly as you see them here.
As catalogs get more complicated, you should check that you haven't made any errors. There are three parts to checking:
You can use a THREDDS validation service, such as this one to check all three of these.
You can check well-formedness using an XML tool like XMLSpy. If you also want to check validity in those tools, you will need to declare the catalog schema location like this:
   <catalog name="MyName" xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
1)   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
2)   xsi:schemaLocation="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0 http://www.unidata.ucar.edu/schemas/thredds/InvCatalog.1.0.xsd">
     ...
   </catalog>
The THREDDS validation service, as well as the catalog library, knows where the schemas are located, so you only need to add these 2 lines if you want to do your own validation.
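If you have no XML tool at hand, a well-formedness check can also be scripted. The Python sketch below uses only the standard library parser; it checks well-formedness only and does not validate against the catalog schema. The function name well_formedness_error and the two sample strings are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return None if xml_text is well-formed, otherwise the parser's message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


GOOD = ('<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">'
        '<dataset name="ok"/></catalog>')
# The dataset element is opened but never closed.
BAD = ('<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">'
       '<dataset name="oops"></catalog>')

print(well_formedness_error(GOOD))  # None
print(well_formedness_error(BAD))   # a message describing the mismatched tag
```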
You will want to study the annotated schema, and the schema document itself.
It can be useful to break up large catalogs into pieces in order to maintain each piece separately. One way to do this is to build each piece as a separate and logically complete catalog, then create a master catalog using catalog references:
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" name="Top Catalog"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <dataset name="Realtime data from IDD">
    <catalogRef xlink:href="idd/models.xml" xlink:title="NCEP Model Data" name="" />
    <catalogRef xlink:href="idd/radars.xml" xlink:title="NEXRAD Radar" name="" />
    <catalogRef xlink:href="idd/obsData.xml" xlink:title="Station Data" name="" />
    <catalogRef xlink:href="idd/satellite.xml" xlink:title="Satellite Data" name="" />
  </dataset>
</catalog>
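A client follows a catalog reference by resolving its xlink:href against the URL of the catalog that contains it. The Python sketch below shows one way to list the referenced catalogs; the host example.org and the function name catalog_refs are assumptions for illustration only. Note that the XLink attributes are looked up by their full namespace string, not by the xlink prefix.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

THREDDS = "{http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0}"
XLINK = "{http://www.w3.org/1999/xlink}"

# Shortened copy of the master catalog (XML declaration omitted).
CATALOG_XML = """<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" name="Top Catalog"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <dataset name="Realtime data from IDD">
    <catalogRef xlink:href="idd/models.xml" xlink:title="NCEP Model Data" name="" />
    <catalogRef xlink:href="idd/radars.xml" xlink:title="NEXRAD Radar" name="" />
  </dataset>
</catalog>"""


def catalog_refs(xml_text, catalog_url):
    """Return (title, absolute URL) for every catalogRef in the catalog."""
    root = ET.fromstring(xml_text)
    return [
        (ref.get(XLINK + "title"), urljoin(catalog_url, ref.get(XLINK + "href")))
        for ref in root.iter(THREDDS + "catalogRef")
    ]


# example.org is a placeholder host, not a real TDS.
for title, url in catalog_refs(CATALOG_XML, "http://example.org/thredds/catalog.xml"):
    print(title, "->", url)
```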
The NetCDF Tools User Interface (aka ToolsUI) can read and display THREDDS catalogs. You can start it from the command line, or launch it from webstart. Use the THREDDS tab, and click on the file chooser button to navigate to your local catalog file. The catalog will be displayed in a tree widget on the left, and the selected dataset will be shown on the right.
Once you get your catalog working in a TDS, you can enter the TDS URL directly, and view the datasets with the Open buttons.
This document is maintained by Unidata staff.
Please send comments to THREDDS support.