I think the word "dataset" is causing trouble. There are at least three
potential meanings for this word in the context of THREDDS:
1) an entity that is considered as a unit by human beings
2) an entity that can be operated on as a unit by the THREDDS API
3) an entity that can be operated on as a unit by a data access protocol
Right now, only the entities described by "access" tags meet all of 1,
2, and 3.
The tags "dataset" and "collection" both describe entities that only
meet 1 and 2. Thus I agree with Benno that there is not a very
meaningful distinction between them (and I am reconsidering my listing
of them as orthogonal concepts in my previous message).
I wonder if it would be a good idea to merge these concepts and use a
less loaded word, say "entry", to refer to an entity that has meaning to
THREDDS and to end users, but not to a data access protocol. For example:
<catalog>
  <service name="X"/>
  <service name="Y"/>
  ...
  <entry name="my_dataset">
    <metadata name="global-metadata" url="..."/>
    <access name="global-X-access"/>
    <entry name="monthly-data">
      <metadata name="monthly-metadata" url="..."/>
      <access name="X-with-COARDS" serviceType="X" url="..."/>
      <access name="X-with-no-COARDS" serviceType="X" url="..."/>
      <access name="X-flattened-to-2D" serviceType="X" url="http://..."/>
      <access name="Y" serviceType="Y" url="..."/>
      ....
    </entry>
  </entry>
</catalog>
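To make the idea a bit more concrete, here is a rough Python sketch of how
a client might walk such a catalog and find every access point, whatever
level of nesting it sits at. This is only an illustration: the "entry"
element is the suggestion above and not part of any published THREDDS
schema, the example.invalid URLs are placeholders, and list_access_points
is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog using the proposed <entry> element. The element is a
# suggestion from this message, not a published THREDDS schema, and every
# URL below is a placeholder.
CATALOG = """
<catalog>
  <service name="X"/>
  <service name="Y"/>
  <entry name="my_dataset">
    <metadata name="global-metadata" url="http://example.invalid/meta"/>
    <access name="global-X-access"/>
    <entry name="monthly-data">
      <metadata name="monthly-metadata" url="http://example.invalid/monthly"/>
      <access name="X-with-COARDS" serviceType="X" url="http://example.invalid/x"/>
      <access name="Y" serviceType="Y" url="http://example.invalid/y"/>
    </entry>
  </entry>
</catalog>
"""


def list_access_points(element, path=()):
    """Yield (entry path, access name, serviceType) for every <access> tag."""
    for entry in element.findall("entry"):  # direct children only
        entry_path = path + (entry.get("name"),)
        for access in entry.findall("access"):
            yield "/".join(entry_path), access.get("name"), access.get("serviceType")
        # Entries nest, so recurse to reach access points at any depth.
        yield from list_access_points(entry, entry_path)


if __name__ == "__main__":
    root = ET.fromstring(CATALOG)
    for entry_path, name, service in list_access_points(root):
        print(entry_path, name, service)
```

The point of the sketch is that the client never needs to know whether an
entry is a "dataset" or a "collection": it only asks which access tags are
attached at each level.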
- Joe
Daniel Holloway wrote:
Benno Blumenthal wrote:
John Caron wrote:
A much harder question is the distinction between a dataset and a
collection, since a dataset is a collection of data. I have
conceptualized it as follows: a dataset is something that can be
selected, and then it is processed in a protocol-dependent way. A
collection is a protocol-independent mechanism for grouping datasets.
I think this is what is getting us into trouble. The concept of a
dataset should be independent of the services available for it: a
dataset served from two different servers could very well have
different services/protocols available, depending on the server (the
aggregation server converts collections to datasets, for example).
Yet from the THREDDS/educational point of view, it is the same object.
I agree with this as well. I've been trying to reconcile how a catalog
might look for a particular multifile 'dataset' which has both WMS and
DODS access available for it. For WMS (for multifile datasets) the
access point would be at the collection level, while for
'non-aggregated' datasets the DODS access would be lower than the
collection level, at the THREDDS dataset level. It seems that the
concept of a dataset resides more at the collection level; maybe the
service access binding is too tightly coupled to the dataset concept in
the current draft.
Dan
Benno
--
Dr. M. Benno Blumenthal benno@xxxxxxxxxxxxxxxx
International Research Institute for climate prediction
Lamont-Doherty Earth Observatory of Columbia University
Palisades NY 10964-8000 (845) 680-4450
--
Joe Wielgosz
joew@xxxxxxxxxxxxx / (707)826-2631
---------------------------------------------------
Center for Ocean-Land-Atmosphere Studies (COLA)
Institute for Global Environment and Society (IGES)
http://www.iges.org