Re: Proposed new specification for THREDDS Catalogs

To: John Caron <caron@xxxxxxxxxxxxxxxx>
Subject: Re: Proposed new specification for THREDDS Catalogs
From: Roland Schweitzer <Roland.Schweitzer@xxxxxxxx>
Date: Wed, 05 May 2004 13:54:44 -0500

John,

John Caron wrote:

Roland Schweitzer wrote:
John Caron wrote:
Roland Schweitzer wrote:
John,
I have a question about the THREDDS Dataset Inventory Catalog XML. I don't intend this as a criticism, but rather I'm curious about the choices and trade-offs. All of us who are messing around with XML are wrestling with similar issues.
In general, it seems that relationships between elements in the XML are done via attributes. For example, a <service> element is referred to in the document via the serviceName attribute in the <dataset> element. And a <dataset> element can be repeated by referencing the name of another <dataset> element via the alias attribute.
It seems to me that using this technique then requires that client code must be written to follow these connections. By contrast, it seems that the XML community has attempted to create languages (like XPointer) that would "standardize" these sorts of references. Admittedly, even though the XPointer recommendation is a year old, I have not found (m)any implementations in general purpose XML software.
Can you please comment on these choices and trade-offs for defining the internal connections between bits of XML that went into developing the Inventory Catalog?
Thanks,
Roland
Hi Roland:

<excuse> Sorry it's taken me so long to answer this </excuse>
Anyway, it's not clear that the XPointer spec will become an official standard. XPath seems useable though, and I am open to it. Both the serviceName and the alias = dataset ID are more or less the simple case of XPath using IDs. I think using IDs for datasets is so useful that it should probably be required, which I would do if we could do so and still allow the minimal datasets like the DODS File Server. This ID reference is so simple that even DTDs have it.
So I'd say full XPath is a bit of overkill right now, but I am open to using it in the future. Do you foresee any new features that might need it?
No excuses needed and no worries.
I don't have any particular features in mind that require full XPath, but my question was directed at the idea that we should get the most bang for the buck that we can out of the validation of documents. In the new catalog schema, every attribute (except name) is optional on the dataset element. This means simple catalogs are possible. But I think it also means that there is no way, from simply validating the XML, to guarantee that the alias references are available in the document. This is a valid document (according to the schema and XML Spy):
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="blah blah blah">
   <dataset name="billy" ID="b1"/>
   <dataset name="pointer to nothing" alias="sam"/>
</catalog>
even though the dataset named "pointer to nothing" does just that.
I'll be the first to admit I'm not even sure if what I'm thinking about is possible, but I think if there were some way to use the "standard" constructs of XML to enforce the relationship between dataset elements with alias attributes and the dataset elements to which they refer it would somehow be "better". I assume when you "validate" a document with your client library you enforce this relationship, but it seems it might be "better" if an off the shelf validation code (like XML Spy) could enforce this relationship. As I said, I don't know if it is possible and I'm trying to figure this out for XML I'm designing so I'm hoping to benefit from our discussion and your experience designing these catalogs.
Thanks,
Roland
I agree with you on all this; we continue to try to use standard validation as much as possible.
On this particular example, we actually now can validate this (with the latest version of the schema put out about a week ago and cleverly not announced to anyone yet ;^) at
 http://www.unidata.ucar.edu/schemas/thredds/InvCatalog.1.0.xsd

the way it works is using the "keyref" constraint:


<xsd:unique name="datasetID">
  <xsd:selector xpath=".//dataset" />
  <xsd:field xpath="@ID" />
</xsd:unique>

<xsd:keyref name="datasetAlias" refer="datasetID">
  <xsd:selector xpath=".//dataset" />
  <xsd:field xpath="@alias" />
</xsd:keyref>

Interestingly enough, it appears that Xerces is not yet handling this constraint, but XMLSpy seems to. I haven't yet tracked this down, or found out if I need a more current version of Xerces. (I didn't get a chance to try this on your example, let me know if you do...)

I tried XML Spy on my little example and indeed it was found to be invalid under the new schema. Cool!
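
Where a parser does not yet enforce the keyref constraint (as seemed to be the case with Xerces above), roughly the same check can be approximated by hand in client code. What follows is only a minimal Python sketch using the standard library, not the THREDDS client library's method and not real schema validation. The function names and the namespace URI "urn:example:catalog" (standing in for the "blah blah blah" placeholder in the example) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The small example from this thread. The namespace URI is a stand-in
# (assumption) for the elided "blah blah blah" value.
EXAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="urn:example:catalog">
   <dataset name="billy" ID="b1"/>
   <dataset name="pointer to nothing" alias="sam"/>
</catalog>
"""


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def check_aliases(xml_bytes):
    """Return a list of problems: duplicate IDs and aliases with no target.

    Hand-written approximation of the unique/keyref pair: every dataset ID
    must be unique, and every alias must match the ID of some dataset.
    """
    root = ET.fromstring(xml_bytes)
    datasets = [el for el in root.iter() if local_name(el.tag) == "dataset"]

    # Count how often each ID occurs (datasets without an ID are skipped).
    ids = Counter(el.get("ID") for el in datasets if el.get("ID") is not None)
    problems = [f"duplicate ID {i!r}" for i, n in ids.items() if n > 1]

    # Every alias must point at an existing dataset ID.
    for el in datasets:
        alias = el.get("alias")
        if alias is not None and alias not in ids:
            problems.append(
                f"dataset {el.get('name')!r}: alias {alias!r} matches no dataset ID"
            )
    return problems


if __name__ == "__main__":
    for problem in check_aliases(EXAMPLE):
        print(problem)
```

Run on the example, this reports the dangling alias "sam". The drawback is the one raised earlier in the thread: each client has to carry its own copy of such a check, whereas the keyref keeps the rule in the schema.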

IMO, schemas are still bleeding-edge; I'm hoping they get more mature soon. There's a lot of sentiment against W3C Schema; I toyed with Relax-NG as an alternative. Just have to keep trying different stuff for now....

I understand. I too have been considering Relax NG because it's "easier" to specify ideas like an element should have either this set of attributes or this other set of attributes, but not both sets of attributes. However, nothing is obvious.


Thanks,
Roland

