Re: orthogonality (was Re: New attempt)

To make sure I understand you, I am going to add some annotations. Let me
know if any of them are wrong.

Also, I am going to use the following definitions for now:

Dataset: something the user can select to get a URL.
Collection: a group of Datasets.

I will capitalize them to distinguish them from more general usage.

----- Original Message -----
Cc: <thredds@xxxxxxxxxxxxxxxx>
Sent: Wednesday, June 05, 2002 2:56 PM

> I think the word dataset is causing trouble. There are at least three
> potential meanings for this word in the context of THREDDS:
> 1) an entity that is considered as a unit by human beings

Part of a human mental model/ontology.

> 2) an entity that can be operated on as a unit by the THREDDS API

An XML InvCatalog element, and compositions of such.

> 3) an entity that can be operated on as a unit by a data access protocol

A software object accessed/returned/manipulated from the protocol-dependent

> Right now, only the entities described by "access" tags meet all of 1,
> 2, and 3.
> The tags "dataset" and "collection" both describe entities that only
> meet 1 and 2.

I wonder if my annotations are incorrect since I may not understand this. If
I do have them correct, then I would say:

Currently a Dataset XML element is supposed to meet 1, 2, and, with help from
an access element, 3. A Collection XML element meets 1 and 2, and the
question is whether we should find a way to let it also meet 3 when appropriate.

In the case where it is appropriate, i.e. a Collection has a URL, it's
easy to take it one step further and simply erase the distinction between a
Collection and a Dataset. However, there are two concerns with this approach:

1) When a Collection doesn't have a URL, it cannot meet definition 3). So now
you don't have a word for something that always meets 1, 2, and 3.

2) What is the relationship between the contents of a Collection element and
the contents of the Collection's URL? If the relationship is not
particularly well defined or meaningful, you might as well just encode the
Collection's URL as a Dataset. If there's a clear and useful relationship,
then it could be a good idea to give the Collection an access element, which
makes it clear that the URL has that defined relationship with the rest of
the contents.
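
To make the distinction concrete, here is a minimal sketch of how a client
could tell which elements meet definition 3). It assumes, purely for
illustration, elements named "collection", "dataset" and "access" with a
"url" attribute, and made-up example.invalid URLs; none of this is the actual
InvCatalog schema. An element is counted as meeting 3) only if it has an
access element as a direct child.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog. Element names, attribute names and URLs are
# assumptions for illustration, not the actual InvCatalog schema.
CATALOG = """
<catalog>
  <collection name="with-url">
    <access serviceType="X" url="http://example.invalid/all"/>
    <dataset name="a">
      <access serviceType="X" url="http://example.invalid/a"/>
    </dataset>
  </collection>
  <collection name="without-url">
    <dataset name="b">
      <access serviceType="X" url="http://example.invalid/b"/>
    </dataset>
  </collection>
</catalog>
"""


def meets_definition_3(element):
    """True if a direct child access element carries a non-empty url."""
    return any(child.get("url") for child in element.findall("access"))


def classify(xml_text):
    """Map each collection/dataset name to whether it meets definition 3)."""
    root = ET.fromstring(xml_text)
    result = {}
    for element in root.iter():
        if element.tag in ("collection", "dataset"):
            result[element.get("name")] = meets_definition_3(element)
    return result


if __name__ == "__main__":
    for name, ok in classify(CATALOG).items():
        print(name, "meets 1, 2, 3" if ok else "meets only 1 and 2")
```

Under these assumptions, the Collection without a URL is the only element
that fails 3), which is concern 1) above. The sketch says nothing about
concern 2): it cannot tell how the Collection's URL relates to its contents.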

> Thus I agree with benno that there is not a very
> meaningful distinction between them (and reconsider my listing of them
> as orthogonal concepts in my previous message).
> I wonder if it would be a good idea to merge these concepts and use a
> less loaded word, say "entry", to refer to an entity that has meaning to
> THREDDS and to end users, but not to a data access protocol, i.e.
> <catalog>
> <service name="X"/>
> <service name="Y"/>
> ...
> <entry name="my_dataset">
>     <metadata name="global-metadata" url="..."/>
>     <access name="global-X-access"/>
>     <entry name="monthly-data">
>       <metadata name="monthly-metadata" url="..."/>
>       <access name="X-with-COARDS" serviceType="X" url="..."/>
>       <access name="X-with-no-COARDS" serviceType="X" url="..."/>
>       <access name="X-flattened-to-2D" serviceType="X" url="http://..."/>
>       <access name="Y" serviceType="Y" url="..."/>
>       ....
>     </entry>
> </entry>

OK, so an "entry" meets meaning 1), while an "access" meets meaning 3) (we
don't need to worry about meaning 2) here).

Some questions:

1) Should we understand that all the access elements within an entry are
different versions of the same dataset? Should we disallow:

     <entry name="monthly-data">
       <metadata name="monthly-metadata" url="..."/>
       <access name="monthly-data from MARS" serviceType="X" url="..."/>
       <access name="monthly-data from VENUS" serviceType="X" url="..."/>

2) Is there any relationship between peer elements, such as these in your example:

     <access name="global-X-access"/>
     <entry name="monthly-data">
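
On question 1), a rough sketch of what a purely structural check could look
like: it lists the serviceType values that appear on more than one access
element directly inside an entry. The names "entry", "access" and
"serviceType" are taken from the proposal quoted above, the fragment is
closed so that it parses, and the helper name is my own invention.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Fragment from question 1, closed so that it parses. "entry", "access" and
# "serviceType" follow the quoted proposal; they are not a real schema.
ENTRY = """
<entry name="monthly-data">
  <metadata name="monthly-metadata" url="..."/>
  <access name="monthly-data from MARS" serviceType="X" url="..."/>
  <access name="monthly-data from VENUS" serviceType="X" url="..."/>
</entry>
"""


def repeated_service_types(entry):
    """Return serviceType values used by more than one direct access child."""
    counts = Counter(a.get("serviceType") for a in entry.findall("access"))
    return sorted(s for s, n in counts.items() if s is not None and n > 1)


if __name__ == "__main__":
    print(repeated_service_types(ET.fromstring(ENTRY)))
```

Note that the same check would also flag the three serviceType="X" access
elements in your original example, so structure alone cannot separate
"different versions of the same dataset" from "different sources". If we
want to disallow the second case, it probably has to be a stated convention
rather than something a parser can enforce.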
