Re: DODS workshop observations

To: Ben Domenico <ben@xxxxxxxxxxxxxxxx>
Subject: Re: DODS workshop observations
From: Peter Cornillon <pcornillon@xxxxxxxxxxx>
Date: Fri, 18 Jan 2002 22:55:59 -0500

Hi Ben,

Thanks for bringing these issues up. I will be sending out a message to
DODS-tech that will try to summarize all of the points that came up at
the meeting, some of which you mention below. I assume that there will
be debate of these on the DODS-tech list when I do that. Unfortunately
I am tied up in meetings next week and will likely not get such a message
out until the following week. I do however have some quick comments re
your message. They are in-line below.

Peter

Ben Domenico wrote:
> 
> Hi,
> 
> After the DODS meetings last week and a few brief conversations at the AMS
> meetings this week, I thought it would be useful to summarize the issues
> that came up at the DODS meetings that I feel are important from my own
> (admittedly limited) THREDDS perspective.
> 
> When I get a chance, I'll try to capture this on a web page with all the
> relevant links, etc. but I wanted to get it out for discussion (especially
> for corrections by others who were at the DODS meetings) before I let it
> fall through the cracks.
> 
> Have a nice MLK weekend.
> 
> -- Ben
> 
> ======================================================
> 
> Granularity:
> 
> Under this heading, I include the discussions regarding what comprises a
> dataset, what's an aggregation, what's a catalog, a collection, etc. and
> how these relate to files, data objects within files, inventories, lists,
> directories, etc.   I came away from the meetings with the sense that there
> are clear definitions for only a few of these.   Within THREDDS, we need to
> come up with some working definitions that allow us to work with the data
> hierarchy in a systematic fashion.   This is somewhat complicated by the
> fact that the Digital Library community uses some of the terms, e.g.,  the
> term "collection" in its own fashion.

Yes, I agree that it would be nice to come up with a set of definitions
that are consistent across disciplines. This may be hard. I have a second
crack at the stuff that I presented at the meeting that tries to address
both the computer science view and the earth science view of some of these
terms. I will attach the html page for these to this document. I wanted
to present these at the meeting, but never found the time, so they have
not been vetted by the DODS community yet.

> There is a related THREDDS issue that was not discussed much at the DODS
> meetings, namely, that we envision third-party metadata contributions in
> the form of "catalogs" that reference files on multiple data servers.   But
> it means that a given dataset or file can be a member of many hierarchies.

We actually plan to use the Aggregation Server in exactly this way - the
historical archive of SST fields for the western North Atlantic will be
served from URI and the past year's worth of data from NMFS. We will
use the AS to present this as one dataset.
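
To make the idea concrete, here is a minimal Python sketch of that kind of
aggregation. Everything in it is hypothetical: the Granule record, the
aggregate function and the example URLs are illustrative names only, not
the Aggregation Server's actual interface, and the rule that a later site
wins on overlapping days is an assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Granule:
    """One time step of an SST field held at some site (hypothetical record)."""
    day: date
    url: str


def aggregate(*sources):
    """Present several per-site holdings as one time-ordered logical dataset.

    Assumption: if two sites hold the same day, the site listed later wins.
    """
    by_day = {}
    for source in sources:
        for granule in source:
            by_day[granule.day] = granule
    return [by_day[day] for day in sorted(by_day)]


# Placeholder holdings: a historical archive at one site, recent data at another.
archive = [Granule(date(2000, 12, 31), "http://archive.example/sst/2000-12-31")]
recent = [Granule(date(2001, 1, 1), "http://recent.example/sst/2001-01-01")]

# The user sees a single list, regardless of where each granule lives.
combined = aggregate(archive, recent)
```

The point of the sketch is only that the client sees one ordered dataset
while each granule keeps the URL of the site that actually serves it.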

> Metadata Schemas:
> 
> The DODS DDS (Dataset Descriptor Structure) and DAS (Dataset Attribute Structure)
> will not be sufficient for THREDDS.  

Is this necessarily true? The way that I see the DAS is that it is a container.
It does not impose a particular metadata standard on its contents. It can be
used for a free text description of a dataset or it can contain a particular
standard. In fact we have discussed the possibility of it containing more
than one metadata representation of the data. Am I missing something here?
If so, we might want to look at modifying the DAP to address your concerns.
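
As a rough illustration of the container idea, here is a Python sketch in
which a plain dictionary stands in for a DAS. This is not the DAS syntax or
any DODS API; the container names (description, dublin_core, fgdc), the
attribute values and both helper functions are made up for the example.

```python
# A hypothetical in-memory stand-in for a DAS: a set of named attribute
# containers, each free to follow a different metadata convention.
das = {
    "description": {"text": "placeholder free-text description"},
    "dublin_core": {"title": "placeholder title"},
    "fgdc": {"title": "placeholder title"},
}


def representations(das):
    """List the metadata representations carried by the container."""
    return sorted(das)


def lookup(das, representation, name):
    """Fetch one attribute from one representation, or None if absent."""
    return das.get(representation, {}).get(name)


print(representations(das))
print(lookup(das, "dublin_core", "title"))
```

The container itself knows nothing about Dublin Core or FGDC; it only
holds them side by side, which is the property I am describing above.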

> We have to determine how THREDDS fits
> in with externally defined "standards" such as those of ISO, FGDC, OpenGIS,
> GCMD, Dublin Core, ESML, etc.   Recently we learned of another in the area
> of software metadata -- BIDM (basic interoperability data model.)  Our data
> provider sites are required to conform to some of these standards and the
> DL community is adopting Dublin Core with some extensions.
> 
> Metadata Creation Tools:
> 
> These are needed in the form of crawlers, scanners, and tools to aid
> human input.   This includes hybrid tools where some of the metadata common
> to many datasets is input by hand one time and is then combined
> automatically with metadata specific to individual datasets or files.  It
> is important that such tools be able to traverse data holdings where the
> metadata (and perhaps the datasets themselves) are held in databases and
> generated on the fly as needed.  Some of this work is going on in DODS,
> some in the DL community, and some at Unidata.   So this is one where
> coordination of efforts is needed.

Agreed.

> Metadata Presentation Tools:
> 
> Several approaches to making metadata available were discussed at the
> meeting:  DBMS systems, LDAP, simple directory/file systems, full text
> indexing facilities.  As noted above, it's important for metadata
> "harvesting" tools to be able to "traverse" all the metadata at a site --
> even though it is made available in different ways.
> 
> Third-party Metadata Catalog Servers and the DODS Auxiliary Information
> Servers:
> 
> I believe these two concepts can be closely related.   Whereas the AIS is
> currently viewed as a way of adding a "delta" of metadata to the main
> metadata source at the data provider's site,

I would take the AIS a step farther here. It could well be used (and will
likely be used) for not just a delta of metadata, but for a complete
metadata description of a dataset that has little to no metadata at
the data host site. The idea is to design the AIS so that it is sufficiently
flexible to allow for a delta as well as a wholesale addition of a data
set description.
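
A small Python sketch of what I mean by handling both cases with one
mechanism. It is only an illustration under stated assumptions: apply_ais
is a hypothetical name, metadata is modeled as nested dictionaries rather
than real DDS/DAS objects, and the rule that the AIS value wins on a
conflict is my assumption, not a settled design decision.

```python
import copy


def apply_ais(host_metadata, ais_metadata):
    """Overlay AIS attributes on whatever the data host provides.

    Assumption: when both define the same attribute, the AIS value wins.
    The host metadata is copied, never modified in place.
    """
    merged = copy.deepcopy(host_metadata)
    for container, attrs in ais_metadata.items():
        merged.setdefault(container, {}).update(attrs)
    return merged


# Case 1: a delta. The host already describes the variable in part.
host = {"sst": {"units": "degC"}}
delta = {"sst": {"long_name": "sea surface temperature"}}
with_delta = apply_ais(host, delta)

# Case 2: wholesale addition. The host provides no metadata at all.
wholesale = apply_ais({}, delta)
```

The same function covers the delta and the wholesale case; the only
difference is how much the host site supplied to begin with.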

> the concept could be extended
> to include sites which serve catalogs of metadata organized in a completely
> different fashion.   For example, some of the catalogs might point to
> collections of datasets on different servers that illustrate different
> scientific concepts or collections of datasets on different servers that
> relate to certain events: hurricanes, major storms, floods, etc.

This seems like something that does go beyond our view of the AIS.
The idea of the AIS is to provide the metadata needed for Level 3
interoperability at the data level - machine-to-machine interoperability
with semantic meaning. The AIS will supplement the DDS or the DAS (and
may add independent variables as well) and it will do this as part of 
the basic DAP. The last point is key. This is what allows for the 
interoperability. If the idea is extended to include descriptions of
collections I think that it will confuse the development issues 
related to the AIS. Maybe the name, AIS, is misleading in this regard?

Peter
-- 
 Peter Cornillon                                                       
  Graduate School of Oceanography  - Telephone: (401) 874-6283         
   University of Rhode Island      -       FAX: (401) 874-6728         
    Narragansett RI 02882  USA     -  Internet: pcornillon@xxxxxxxxxxx

<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<html>
<head>
   <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
   <meta name="GENERATOR" content="Mozilla/4.72 [en] (X11; U; Linux 
2.2.14-6.1.1 i686) [Netscape]">
   <title>Oceanology International </title>
</head>

<BODY bgcolor="#eeeeee" vlink=blue>

<table width="100%" cellpadding=0 cellspacing=2><tr>
<td bgcolor="white"><a HREF="datasets.html"><img 
src="../dods-general/next.gif"></a></td>
<td bgcolor="white"><a HREF="PI-Overview-outline.html"><img 
src="../dods-general/up.gif"></a></td>
<td bgcolor="white"><a HREF="why-opendap.html"><img 
src="../dods-general/previous.gif"></a></td>
<td align="center" bgcolor="#99ccff" width="100%"><font 
size=7><b>Definitions</b></td></table>
<p><br><br>

<font size=+3>
<ul>
<li> data objects, data collections and datasets<br><br>

<li> data system levels<br><br>

<li> syntactic and semantic metadata <br><br>

<li> levels of data system interoperability <br><br>
</ul>

</body>
</html>

References:
- DODS workshop observations
  - From: Ben Domenico
