
Re: orthogonality (was Re: New attempt)

I think the word "dataset" is causing trouble. There are at least three potential meanings for this word in the context of THREDDS:

1) an entity that is considered as a unit by human beings

2) an entity that can be operated on as a unit by the THREDDS API

3) an entity that can be operated on as a unit by a data access protocol

Right now, only the entities described by "access" tags meet all of 1, 2, and 3.

The tags "dataset" and "collection" both describe entities that only meet 1 and 2. Thus I agree with Benno that there is not a very meaningful distinction between them (and I reconsider my listing of them as orthogonal concepts in my previous message).
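
To make the comparison concrete, here is a rough Python sketch of the three meanings as flags on each tag. The class and function names are made up for illustration and are not part of any THREDDS API; the only facts encoded are the ones stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meaning:
    human_unit: bool     # (1) considered a unit by human beings
    api_unit: bool       # (2) operable as a unit by the THREDDS API
    protocol_unit: bool  # (3) operable as a unit by a data access protocol


# Which meanings each tag satisfies, as described in this message.
TAGS = {
    "access": Meaning(True, True, True),
    "dataset": Meaning(True, True, False),
    "collection": Meaning(True, True, False),
}


def same_meanings(a: str, b: str) -> bool:
    """True when two tags satisfy exactly the same set of meanings."""
    return TAGS[a] == TAGS[b]


print(same_meanings("dataset", "collection"))
print(same_meanings("dataset", "access"))
```

Under this view "dataset" and "collection" cannot be told apart by the three criteria, which is the argument for merging them.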

I wonder if it would be a good idea to merge these concepts and use a less loaded word, say "entry", to refer to an entity that has meaning to THREDDS and to end users, but not to a data access protocol, e.g.:

<service name="X"/>
<service name="Y"/>

<entry name="my_dataset">

   <metadata name="global-metadata" url="..."/>
   <access name="global-X-access"/>

   <entry name="monthly-data">
     <metadata name="monthly-metadata" url="..."/>
     <access name="X-with-COARDS" serviceType="X" url="..."/>
     <access name="X-with-no-COARDS" serviceType="X" url="..."/>
     <access name="X-flattened-to-2D" serviceType="X" url="http://..."/>
     <access name="Y" serviceType="Y" url="..."/>
   </entry>

</entry>
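
As a minimal sketch of how a client might walk such nested entries, the following Python uses the standard xml.etree.ElementTree module. The wrapping <catalog> root element, the closing tags, and the helper name access_points are my assumptions, added only so the fragment parses; the "..." URLs are left as placeholders.

```python
import xml.etree.ElementTree as ET

# The proposed layout, wrapped in a hypothetical <catalog> root so it is
# well-formed XML. URLs are placeholders, exactly as in the sketch above.
CATALOG = """
<catalog>
  <service name="X"/>
  <service name="Y"/>
  <entry name="my_dataset">
    <metadata name="global-metadata" url="..."/>
    <access name="global-X-access"/>
    <entry name="monthly-data">
      <metadata name="monthly-metadata" url="..."/>
      <access name="X-with-COARDS" serviceType="X" url="..."/>
      <access name="X-with-no-COARDS" serviceType="X" url="..."/>
      <access name="X-flattened-to-2D" serviceType="X" url="http://..."/>
      <access name="Y" serviceType="Y" url="..."/>
    </entry>
  </entry>
</catalog>
"""


def access_points(entry, path=()):
    """Yield (entry path, access name, serviceType) for an entry and its sub-entries."""
    here = path + (entry.get("name"),)
    # Access elements attached directly to this entry.
    for access in entry.findall("access"):
        yield "/".join(here), access.get("name"), access.get("serviceType")
    # Recurse into nested entries, extending the path.
    for child in entry.findall("entry"):
        yield from access_points(child, here)


root = ET.fromstring(CATALOG)
rows = [row for top in root.findall("entry") for row in access_points(top)]
for row in rows:
    print(row)
```

The point of the sketch is that access can be attached at any level of the nesting, so the same "entry" concept covers both what is now called a collection and what is now called a dataset.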


- Joe

Daniel Holloway wrote:

Benno Blumenthal wrote:

John Caron wrote:

A much harder question is the distinction between a dataset and a
collection, since a dataset is a collection of data. I have
conceptualized it as follows: a dataset is something that can be
selected, and then it is processed in a protocol-dependent way. A
collection is a protocol-independent mechanism for grouping datasets.

I think this is what is getting us into trouble. The concept of a
dataset should be independent of the services available for it: a
dataset served from two different servers could very well have
different services/protocols available, depending on the server (the
aggregation server converts collections to datasets, for example).
Yet from the THREDDS/educational point of view, it is the same object.

I agree with this as well. I've been trying to reconcile how a catalog might look for a particular multifile 'dataset' which has both WMS and DODS access available for it. For multifile WMS datasets the access point would be at the collection level, while for 'non-aggregated' datasets the DODS access would be lower than the collection level, at the THREDDS dataset level. It seems that the concept of a dataset resides more at the collection level; maybe the service access binding is too tightly coupled to the dataset concept in the current draft.



--
Dr. M. Benno Blumenthal          address@hidden
International Research Institute for climate prediction
Lamont-Doherty Earth Observatory of Columbia University
Palisades NY 10964-8000          (845) 680-4450

Joe Wielgosz
address@hidden / (707)826-2631
Center for Ocean-Land-Atmosphere Studies (COLA)
Institute for Global Environment and Society (IGES)
