Re: orthogonality (was Re: New attempt)

----- Original Message -----

> > I agree that orthogonality is a very important motivation. Related
> > motivations are simplicity (small number of concepts) and consistency
> > (same meaning wherever a concept is used).
> >
> > The other motivation I have had is:
> >
> >     "Make common cases simple and human-readable"
> >
> > For me, attributes are a bit easier to read than contained elements, so
> > motivation orients me towards the use of attributes, all else being
> As far as I can tell, the trade off is between better readability and
> extensibility. Attributes are obviously more readable. However they are
> less extensible because the value of an attribute must always be plain
> text, whereas the content of a sub-element which starts as plain text
> can have markup added if this becomes desirable down the road.
> "Building Web Services with Java" says:
> "In general, whenever humans design XML documents, you will see more
> frequent use of attributes. This is true even in data-oriented
> applications. On the other hand, when XML documents are automatically
> 'designed' and generated by applications, you might see a more prevalent
> use of elements."

nicely put.
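
To make the trade-off concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. The element and attribute names (dataset, name, units, base, scale) are hypothetical and are not taken from the THREDDS catalog schema. It shows the same value carried as an attribute and as a sub-element, and how only the sub-element form can later grow internal markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical names throughout; this is not the THREDDS schema.

# Attribute form: compact and easy to read, but the value is always a string.
as_attribute = ET.fromstring('<dataset name="Model run" units="K"/>')

# Element form: more verbose, same information.
as_element = ET.fromstring(
    "<dataset><name>Model run</name><units>K</units></dataset>"
)

# A later revision can add structure inside the sub-element without
# changing where the parent element keeps it.
extended = ET.fromstring(
    "<dataset><name>Model run</name>"
    "<units><base>K</base><scale>0.1</scale></units></dataset>"
)


def units_structure(dataset):
    """Return the tag names nested under <units>, or [] if it is plain text."""
    node = dataset.find("units")
    if node is None:
        return []
    return [child.tag for child in node]


print(as_attribute.get("units"))       # plain text only
print(as_element.findtext("units"))    # plain text today
print(units_structure(extended))       # markup added later
```

The attribute form has no equivalent of the last call: an attribute value cannot contain child elements, so adding structure there means inventing a mini-syntax inside the string or changing the schema.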

> >
> > I think it's interesting and important what the ontology is, and I notice
> > some subtle differences in your conceptualization and mine (and also
> > Benno's and mine). If we can converge on that a bit more, then some of
> > the other issues may be clarified. I think your description above is good;
> > here are some different perspectives on some of it:
> >
> > I would say the basic objects are datasets and services. Therefore, an
> > access element is a binding of a dataset to a service. The base, path,
> > suffix are first attempts to specify the binding. Metadata are properties
> > of a dataset.
> If these are the basic objects, then where does the actual URL reside?
> As far as I can see, the only place where a URL has meaning is when a
> specific dataset is bound to a specific service. In WSDL this concept (a
> specific instance of a web service definition) is called a "port". In
> THREDDS it is currently expressed as an access tag.
> I would suggest that an access object is really a very basic concept in
> THREDDS since without it, neither datasets nor services are very useful.

Ah yes, the actual URL. One thing to note is that often there is no actual
URL, but rather information from which a protocol-aware handler can
construct actual URLs. This bugs me a little bit, but I don't see a clean way
around it.
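
A rough sketch of what "information from which a handler can construct actual URLs" might look like, in Python. The class and function names, the example.org host, and the plain base + path + suffix concatenation are all assumptions made for illustration; the catalog format may define the binding differently. The DODS branch uses the convention of appending .dds and .das to a dataset URL as one example of a protocol-aware handler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """A server endpoint (hypothetical model, not the catalog schema)."""
    name: str
    service_type: str
    base: str
    suffix: str = ""


@dataclass(frozen=True)
class Dataset:
    """A dataset identified by a path relative to a service base."""
    name: str
    path: str


def access_url(service, dataset):
    """Bind a dataset to a service: assumed to be base + path + suffix."""
    return service.base + dataset.path + service.suffix


def handler_urls(service, dataset):
    """Return the URLs a protocol-aware handler would actually request."""
    root = access_url(service, dataset)
    if service.service_type == "DODS":
        # Illustrative: a DODS client asks for several derived documents.
        return [root + ".dds", root + ".das"]
    # Other service types are assumed to use the bound URL directly.
    return [root]


dods = Service("dodsServer", "DODS", "http://server.example.org/dods/")
run = Dataset("Model run", "model/run1.nc")
print(access_url(dods, run))
print(handler_urls(dods, run))
```

In this sketch the string returned by access_url is never fetched for a DODS service; it only identifies the dataset/service combination, which is the situation described above.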

OTOH, ignoring that, one might consider the URL as the global identifier of
the dataset (following Russ's comments on REST design). Since it is service
dependent, it's really the identifier of the dataset/service combination.
Anyway, I do agree it's a basic concept and deserves a prominent place in our

One might argue (I think you have before) that the service is not all that
basic, in the sense that you could get rid of that element by using absolute
URLs, and putting the serviceType into the dataset. I don't think that would
be wrong but I have 2 reasons not to: 1) a service (think "server") is
something many users are very aware of (I want dataset Y from server X), and
2) because often there are many datasets that come from the same server,
factoring out that info is much like normalizing a database.
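
The normalization argument can be sketched in a few lines of Python. The function names and the example.org URLs are hypothetical: given datasets listed by absolute URL, the shared server base is factored out once, and what remains per dataset is a relative path.

```python
def factor_out(base, urls):
    """Split absolute URLs into one shared base plus per-dataset paths."""
    paths = []
    for url in urls:
        if not url.startswith(base):
            raise ValueError(f"{url!r} is not served from {base!r}")
        paths.append(url[len(base):])
    return paths


def expand(base, paths):
    """Rebuild absolute URLs from a base and relative paths."""
    return [base + path for path in paths]


absolute = [
    "http://server.example.org/dods/model/run1.nc",
    "http://server.example.org/dods/model/run2.nc",
    "http://server.example.org/dods/radar/site7.nc",
]

base = "http://server.example.org/dods/"
paths = factor_out(base, absolute)
print(paths)

# If the data moves to another server, only the base changes.
print(expand("http://mirror.example.org/dods/", paths))
```

As with a normalized database table, the server information lives in one place, so moving or renaming the server is a single edit rather than one per dataset.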

> > It's true that a catalog is just a collection. It's also a container for
> > service elements, but I don't really like that.
> Maybe a catalog should be able to import an external list of service
> definitions, so that the collection/dataset/access list and the service
> defn's can be physically and conceptually separated when desired.

Hmmm, I'm thinking of going the other way, keeping services "close" to where
they are used, rather than factoring them out into some globally accessible
list. But if the latter turns out to be useful we could add it.

> > Even if we keep service
> > elements factored out, it seems better to allow collections to contain
> > service elements, so service elements can be near where they are used.
> > This
> > will probably be useful for large synthetic catalogs. OTOH, a catalog
> I would be wary of putting too much effort into pretty formatting,
> especially when it adds complexity at the machine level. The primary use
> of this catalog format is going to be machine-to-machine communication.
> I very much doubt anyone is going to spend much time reading raw XML
> output from THREDDS servers except to debug them. And if they are, you
> might as well just write an XSLT doc that creates genuinely readable

I think this is a good point, and there are lots of XML books that make the
blanket statement that XML should not be considered human readable. But let
me just argue the contrary for a minute.

There's a reason that we make config files ASCII, even though it's a bit more
trouble to read and write them in an application. Being able to really
see what's going on for debugging and validation is hugely important.
Similarly, successful internet protocols (TELNET, FTP, IMAP, NNTP, HTTP,
etc.) are typically ASCII, with the explicit goal of being able to debug them
from the command line, with no intervening interpreter.

So I agree with you, but I think it is still useful to keep human
readability in sight.

