
Re: New Catalog XML Draft



----- Original Message -----
From: "Joe Wielgosz" <address@hidden>
To: "John Caron" <address@hidden>
Cc: <address@hidden>
Sent: Tuesday, May 14, 2002 3:52 PM
Subject: Re: New Catalog XML Draft


> John,
>
> I like it.
>
> My suggestions:
>
> 1) Don't restrict service types to known values. It is certain that
> people will want to add new service types, so the catalog format should
> be extensible in this area.
>
> In order to prevent ambiguity (does "dods" equal "DODS" equals
> "distributed oceanographic data system"?) perhaps these types could
> somehow resolve to the url of the service's home page (e.g.
> DODS->http://unidata.ucar.edu/packages/dods). I don't know the most
> XML-savvy way to do this but perhaps the known mappings can be included
> in the DTD?
>
> 2) Same suggestion for metadata types.

In both of these cases, the XML thing to do is to use a URI as a unique
identifier. The two options are, for example:

1.  xlink:arcrole="http://unidata.ucar.edu/packages/dods"

vs

2.  metadataType="DODS"

Pros of 1: allows services to be added by anyone; the URI can optionally
point to an explanation
Pros of 2: compact; explicitly documents the allowable types
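
To make the trade-off concrete, here is a minimal sketch of how a client might accept both forms: a short name is looked up case-insensitively in a table of known mappings, and anything that is already an http(s) URI is taken as its own identifier. The table, the function name resolve_type and the example.org URI are hypothetical illustrations, not part of the draft; only the DODS mapping comes from this thread.

```python
from urllib.parse import urlparse

# Hypothetical table of known short names -> identifying URIs.
# Only the DODS entry is taken from this thread.
KNOWN_TYPES = {
    "dods": "http://unidata.ucar.edu/packages/dods",
}


def resolve_type(value):
    """Return the URI that identifies a service or metadata type.

    A value that is already an http(s) URI is returned unchanged, so anyone
    can add a new type. A short name is matched case-insensitively against
    KNOWN_TYPES, so "dods" and "DODS" resolve to the same identifier.
    """
    if urlparse(value).scheme in ("http", "https"):
        return value
    try:
        return KNOWN_TYPES[value.lower()]
    except KeyError:
        raise ValueError(f"unknown type name: {value!r}") from None


if __name__ == "__main__":
    print(resolve_type("DODS"))
    print(resolve_type("http://example.org/newservice"))
```

Two type values then denote the same type exactly when their resolved URIs are equal as strings.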

>
> 3) How about a catalogNS attribute, that can be added to <catalog>,
> <collection>, or <dataset>?  This would specify a namespace in which
> dataset ID's can be considered unique.
>
> For example, if a THREDDS-crawler found two catalogs with
> catalogNS="http://cola.iges.org/thredds", and both contained a dataset
> with ID="avn0300", then it could consider these two datasets identical.
>
> This would make it possible to uniquely identify datasets in multiple
> catalogs (in fact across the entire THREDDS web).

Really good idea; I'd probably use "datasetNamespace" as the tag.

The semantics are that if datasetNamespace exists for a dataset, then
datasetNamespace/ID must be globally unique, and the same dataset at
multiple locations should have the same datasetNamespace/ID.
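
A rough sketch of how a crawler could apply these semantics is below. It assumes, beyond what is stated here, that datasetNamespace is an attribute inherited from an enclosing <catalog> or <collection>, that datasets carry "ID" and "name" attributes, and that datasets lacking either a namespace or an ID are simply skipped. The two sample catalogs and the function names are made up for illustration.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

# Two hypothetical catalogs sharing one datasetNamespace.
CATALOG_A = """<catalog datasetNamespace="http://cola.iges.org/thredds">
  <dataset name="AVN 03Z" ID="avn0300"/>
  <dataset name="No ID"/>
</catalog>"""

CATALOG_B = """<catalog datasetNamespace="http://cola.iges.org/thredds">
  <collection name="mirror">
    <dataset name="AVN mirror" ID="avn0300"/>
  </collection>
</catalog>"""


def global_ids(catalog_xml):
    """Yield ((namespace, ID), name) for each globally identifiable dataset."""

    def walk(elem, inherited_ns):
        # Assumption: the nearest enclosing datasetNamespace applies.
        ns = elem.get("datasetNamespace", inherited_ns)
        if elem.tag == "dataset" and ns and elem.get("ID"):
            yield (ns, elem.get("ID")), elem.get("name")
        for child in elem:
            yield from walk(child, ns)

    yield from walk(ET.fromstring(catalog_xml), None)


def group_by_global_id(catalogs):
    """Group dataset names from several catalogs by their (namespace, ID)."""
    groups = defaultdict(list)
    for xml_text in catalogs:
        for key, name in global_ids(xml_text):
            groups[key].append(name)
    return dict(groups)


if __name__ == "__main__":
    for key, names in group_by_global_id([CATALOG_A, CATALOG_B]).items():
        print(key, names)
```

Entries that land in the same group are the ones a crawler could treat as the same dataset served from different locations.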

>
> 4) COARDS and CF should go on the list of known metadata types.
>
> 5) it would probably be clearer if the "DatasetDesc" metadataType was
> renamed to "THREDDS".

OK, but if we use URIs, the list will just be in some web document, rather
than listed in the DTD.


>
> 5) IMHO, the dataType attribute is metadata. Thus it should be part of
> the THREDDS/DatasetDesc metadata file, rather than the catalog itself.

It did occur to me to put it in the DatasetDesc when I made it optional. The
use case is if a client can only read GRID files, you'd like to eliminate
the non-GRID datasets without having to dereference a whole lot of other XML
files. So it could be thought of as a keyword for fast filtering.
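
The fast-filtering use case could look roughly like the sketch below, which selects datasets by the dataType attribute from the catalog alone, without fetching any referenced XML. The sample catalog, the "name" attribute, the "STATION" value and the function name are assumptions for illustration; only dataType and "GRID" come from the discussion above.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog fragment; dataType is optional on <dataset>.
CATALOG = """<catalog>
  <dataset name="model output" dataType="GRID"/>
  <dataset name="station obs" dataType="STATION"/>
  <dataset name="untyped"/>
</catalog>"""


def datasets_of_type(catalog_xml, wanted):
    """Return names of datasets whose dataType attribute equals `wanted`.

    Datasets with no dataType are excluded here; a real client might
    instead fall back to dereferencing their metadata.
    """
    root = ET.fromstring(catalog_xml)
    return [d.get("name") for d in root.iter("dataset")
            if d.get("dataType") == wanted]


if __name__ == "__main__":
    print(datasets_of_type(CATALOG, "GRID"))
```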

As for "metadata", I'd say everything in a Catalog is metadata, but I agree
it's more like DatasetDesc metadata than Catalog metadata.


>
> 6) How about allowing inline dataset metadata, via a <metadata> tag?
> Then it would be possible for a site to completely describe its holdings
> in a single file if necessary.

That's a good idea; we could make it like <documentation>, which can be a
reference and/or inline. In fact, perhaps <documentation> should become
<metadata> of type "documentation"?



