use of the "date" metadata element in THREDDS catalog

Wenli Yang yang at rattler-e.gsfc.nasa.gov
Thu May 31 14:32:23 MDT 2007


Hello,

We are working on a project that ingests THREDDS catalogs into OGC catalogs 
(Catalog Service for the Web, or CSW).  We find that we have to traverse an 
entire THREDDS catalog to update an ingested CSW server, because we cannot 
tell whether the THREDDS catalog has been modified until we have gone 
through all of it.

There is a "date" element in the threddsMetadataGroup.  The element can be 
used to identify the modified (or created, valid, issued, available, 
etc.) date/time of an individual dataset and/or a collection dataset.  This 
element is very useful not only at the individual dataset level but also at 
the data collection level.  For example, suppose a data collection A 
contains another collection AA, which contains another collection AAA, 
which contains datasets a, b, c, and d (i.e., A>AA>AAA>a,b,c,d).  If the 
"modified" date stamp is applied to all the dataset nodes, individual as 
well as collection, a returning user would not need to follow the complete 
path to find out whether a dataset has been added, modified, etc. in data 
collection AAA and/or any other collection in the hierarchy.
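
To make the idea concrete, here is a minimal sketch in Python of how an 
ingestor could prune its traversal if every dataset node carried a 
"modified" date.  It assumes the date element appears as a direct child of 
each dataset element and holds an ISO 8601 UTC timestamp; the sample 
catalog, the extra collection AB, and the function names are made up for 
illustration and are not taken from any real server.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# THREDDS InvCatalog 1.0 XML namespace
NS = {"t": "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"}

def modified_date(dataset):
    """Return the dataset's own 'modified' date, or None if it has none."""
    for d in dataset.findall("t:date", NS):
        if d.get("type") == "modified" and d.text:
            # Assumption: timestamps look like 2007-05-30T00:00:00Z
            return datetime.fromisoformat(d.text.strip().replace("Z", "+00:00"))
    return None

def changed_datasets(node, last_ingest):
    """Yield names of leaf datasets that may have changed since last_ingest.

    A node whose 'modified' date is not newer than last_ingest is skipped
    together with its whole subtree.  A node without a date is treated as
    possibly changed, so it is always visited.
    """
    for child in node.findall("t:dataset", NS):
        stamp = modified_date(child)
        if stamp is not None and stamp <= last_ingest:
            continue  # unchanged: no need to descend
        if child.findall("t:dataset", NS):
            yield from changed_datasets(child, last_ingest)
        else:
            yield child.get("name")

# Hypothetical catalog following the A>AA>AAA>a,b example, plus an
# unchanged sibling collection AB.
SAMPLE = """<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <dataset name="A">
    <date type="modified">2007-05-30T00:00:00Z</date>
    <dataset name="AA">
      <date type="modified">2007-05-30T00:00:00Z</date>
      <dataset name="AAA">
        <date type="modified">2007-05-30T00:00:00Z</date>
        <dataset name="a"><date type="modified">2007-05-01T00:00:00Z</date></dataset>
        <dataset name="b"><date type="modified">2007-05-30T00:00:00Z</date></dataset>
      </dataset>
    </dataset>
    <dataset name="AB">
      <date type="modified">2007-04-01T00:00:00Z</date>
      <dataset name="e"><date type="modified">2007-04-01T00:00:00Z</date></dataset>
    </dataset>
  </dataset>
</catalog>"""

root = ET.fromstring(SAMPLE)
last = datetime(2007, 5, 15, tzinfo=timezone.utc)
print(list(changed_datasets(root, last)))  # ['b']
```

With a last-ingest time of 15 May 2007, collection AB is skipped without 
being opened, and only dataset b is reported.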

However, it seems that this "date" element is not widely used, if at all, 
at the data collection level.  In fact, I randomly browsed some of the data 
paths in Unidata's motherlode catalog 
(http://motherlode.ucar.edu:8080/thredds/catalog.html) and didn't find any 
"Last Modified" information until I got to the final dataset level.

I guess that the reason a THREDDS catalog does not show the modified 
date/time at the collection level is that the catalog is not automatically 
updated when a new dataset is inserted into the database/file system 
connected to the catalog.  Instead, when a user browses down to a catalog, 
the server scans the immediate child nodes to get all the available 
datasets/data collections.  Thus, a user browsing down the hierarchy will 
always be presented with the most current available datasets, even though 
the catalog does not update itself when new datasets are inserted.  The 
disadvantage of this approach is that a user always needs to go to the 
bottom level to find out whether any new datasets have been inserted.  
Similarly, in order to update our CSW catalog, our THREDDStoCSW ingestor 
has to scan through an entire THREDDS catalog, which can be very large, as 
the Unidata catalog is.
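
As a rough illustration of that cost, the Python sketch below walks a 
small, made-up catalog in which only the leaf datasets carry a "modified" 
date, as I observed on motherlode.  To learn the newest date under 
collection A it has to read every node.  The sample XML and function name 
are hypothetical, timestamps are assumed to share one ISO 8601 UTC format 
so that they can be compared as strings, and catalogRef links to other 
catalogs (which would each need a separate fetch) are ignored.

```python
import xml.etree.ElementTree as ET

# THREDDS InvCatalog 1.0 XML namespace
NS = {"t": "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"}

def latest_modified(dataset, visited):
    """Return the newest 'modified' stamp in this dataset's subtree, or None.

    Every node read is appended to `visited` to show how much of the
    catalog must be scanned when collections carry no date of their own.
    """
    visited.append(dataset.get("name"))
    stamps = [d.text.strip() for d in dataset.findall("t:date", NS)
              if d.get("type") == "modified" and d.text]
    for child in dataset.findall("t:dataset", NS):
        newest = latest_modified(child, visited)
        if newest is not None:
            stamps.append(newest)
    # Assumption: identical timestamp format, so string order is time order
    return max(stamps) if stamps else None

# Hypothetical catalog: dates only at the final dataset level
SAMPLE = """<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <dataset name="A">
    <dataset name="AA">
      <dataset name="AAA">
        <dataset name="a"><date type="modified">2007-05-01T00:00:00Z</date></dataset>
        <dataset name="b"><date type="modified">2007-05-30T00:00:00Z</date></dataset>
      </dataset>
    </dataset>
  </dataset>
</catalog>"""

top = ET.fromstring(SAMPLE).find("t:dataset", NS)
visited = []
print(latest_modified(top, visited))  # 2007-05-30T00:00:00Z
print(visited)                        # ['A', 'AA', 'AAA', 'a', 'b']
```

The value returned for A is what a collection-level "modified" date would 
need to hold for a client to decide, at the top, whether to descend at all.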

Any comments/suggestions will be highly appreciated.

Wenli Yang
George Mason University



