Last modified: August 26, 2003
The metadata catalog system for THREDDS servers provides important semantic metadata that enables users to find and use datasets on the server from within "thick" analysis and visualization clients as well as from thin Web-browser based tools.
For example, a user of the prototype, platform-independent, pure-Java Gridded Data Viewer sees the menu of catalog servers and datasets illustrated in the following screen capture.

In this illustration the user has a choice of three catalog servers. On the server at cgd.ucar.edu, a test catalog lists a collection of datasets indexed in three different hierarchies: by variable, by model, and by experiment. With a browser that can display XML, one can view the XML that makes up this particular catalog entry by following the URL: http://www.cgd.ucar.edu/vemap/catalog/test.xml. Note that if you view these XML files with Internet Explorer 5.5, you can expand and collapse the hierarchy by clicking on the plus/minus signs in the display.

These XML catalog entries can point to individual data files as well as to entire datasets consisting of many files on several different servers. The XML catalog files can reside anywhere on the network of servers; they do not have to be on the server which holds the data. This flexibility enables a user to create a catalog entry that points to files on a set of servers distributed across the server network, so one can envision catalogs organized around a particular phenomenon or event where the data relating to that event are scattered across several servers.
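To make the idea of one catalog spanning several servers concrete, here is a minimal sketch in Python. The element and attribute names (catalog, collection, dataset, url) and the example.org hosts are assumptions made up for illustration; they are not the actual THREDDS catalog schema.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical catalog: element names, attribute names, and hosts are
# illustrative placeholders, not the real THREDDS catalog schema.
CATALOG_XML = """
<catalog name="Storm event (illustrative)">
  <collection name="Model output">
    <dataset name="Surface temperature" url="http://server-a.example.org/data/temp.nc"/>
    <dataset name="Precipitation" url="http://server-b.example.org/data/precip.nc"/>
  </collection>
  <collection name="Observations">
    <dataset name="Radar" url="http://server-c.example.org/radar/scan1.nc"/>
  </collection>
</catalog>
"""


def datasets_by_server(xml_text):
    """Group the dataset URLs found in a catalog by the host that serves them."""
    root = ET.fromstring(xml_text)
    grouped = {}
    # iter() walks the whole tree, so datasets nested in any collection are found.
    for ds in root.iter("dataset"):
        url = ds.get("url")
        host = urlparse(url).netloc
        grouped.setdefault(host, []).append(url)
    return grouped


if __name__ == "__main__":
    for host, urls in sorted(datasets_by_server(CATALOG_XML).items()):
        print(host, urls)
```

The catalog file itself could live on a fourth host entirely; nothing in the entry ties it to the servers that hold the data.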
A fundamentally different kind of metadata is needed to analyze or display the data in a meaningful manner once you have found an interesting collection of data. This usage metadata describes the contents of the dataset in terms that analysis and visualization programs understand: which physical parameters are contained in the dataset, how those parameters are stored, the units of the parameters, georeferencing information, and so forth. The proposed metadata representation encompasses this usage metadata as well as the discovery/browse metadata described above.
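As a rough illustration of what usage metadata might look like to a client program, the following Python sketch models the items listed above (parameters, storage, units, georeferencing) as simple records. All class names, field names, and sample values are hypothetical and are not taken from the proposed representation.

```python
from dataclasses import dataclass, field


@dataclass
class VariableMetadata:
    """One physical parameter in a dataset (hypothetical structure)."""
    name: str            # parameter name, e.g. "temperature"
    units: str           # units of the stored values
    storage_type: str    # how the values are stored, e.g. "float32"
    dimensions: tuple    # names of the axes the values are laid out on


@dataclass
class UsageMetadata:
    """Usage metadata for one dataset (hypothetical structure)."""
    variables: list = field(default_factory=list)
    georeference: dict = field(default_factory=dict)

    def units_of(self, name):
        """Return the units of the named parameter, or raise KeyError."""
        for var in self.variables:
            if var.name == name:
                return var.units
        raise KeyError(name)


if __name__ == "__main__":
    meta = UsageMetadata(
        variables=[
            VariableMetadata("temperature", "K", "float32", ("time", "lat", "lon")),
            VariableMetadata("precipitation", "mm/day", "float32", ("time", "lat", "lon")),
        ],
        # Illustrative georeferencing information only.
        georeference={"projection": "latitude-longitude", "lat_range": (25.0, 50.0)},
    )
    print(meta.units_of("temperature"))
```

A visualization client could use a record like this to label axes and pick colour scales without opening the data file first.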
Since the proposed metadata catalog system is implemented in text-based XML, it lends itself well to the indexing systems designed for the World Wide Web itself. The resulting anchors in this case are simply URLs that point to scientific datasets instead of HTML files. The DODS server technology then enables the user to access a dataset using its URL.
While it will be important to create catalog entries interactively using XML authoring tools like XMLSpy, it will not be possible to catalog all the datasets on servers manually, so two forms of automated tools will be needed. One tool will take the form of a crawler that traverses all the datasets on a server and creates the associated XML metadata files. Another approach can be used for real-time data fed into a server via the Unidata LDM. This situation calls for specialized "decoders" that create the metadata as the data are stored in files on the server.
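The crawler idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it walks a local directory tree rather than a live server, the catalog and dataset element names are hypothetical, and the base URL and the ".nc" file extension are placeholders chosen for the example.

```python
import os
import xml.etree.ElementTree as ET


def crawl_to_catalog(root_dir, base_url, extensions=(".nc",)):
    """Walk root_dir and build a catalog element with one entry per data file.

    The XML layout (catalog/dataset with name and url attributes) is a
    hypothetical stand-in, not the real catalog schema.
    """
    catalog = ET.Element("catalog", name=os.path.basename(os.path.abspath(root_dir)))
    for dirpath, dirnames, filenames in os.walk(root_dir):
        dirnames.sort()  # sort in place so subdirectories are visited in a stable order
        for filename in sorted(filenames):
            if not filename.lower().endswith(extensions):
                continue  # skip files that are not recognised as data
            rel = os.path.relpath(os.path.join(dirpath, filename), root_dir)
            # Build the access URL from the assumed base URL and the relative path.
            url = base_url.rstrip("/") + "/" + rel.replace(os.sep, "/")
            ET.SubElement(catalog, "dataset", name=filename, url=url)
    return catalog


if __name__ == "__main__":
    # Crawl the current directory; the base URL is a placeholder.
    element = crawl_to_catalog(".", "http://data.example.org/dods")
    print(ET.tostring(element, encoding="unicode"))
```

A real crawler would also open each file to extract usage metadata; this sketch records only the name and location of each file.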
The metadata catalog server approach described here can be integrated with complementary systems under development elsewhere:
Metadata Systems
- the DIMES (DIstributed MEtadata System) at George Mason University
- the ESML (Earth Science Markup Language) at the University of Alabama in Huntsville
- the Earth system science gazetteer at University of California Santa Barbara
- the Earth system science controlled vocabulary at DLESE
- OpenGIS Webmapping protocols
<< More detail needed here. >>
Analysis and Visualization Tools
- PMEL Ferret
- Unidata MetApps
- University of Wisconsin WeatherWise
- PlanetEarth
<< Need to confirm >>