
Re: CSW, THREDDS, GALEON 2



Ben,

The organization of data into collections, both physically and logically, is primarily an implementation issue.  Ron is right that the THREDDS catalog can be implemented in the ebRIM profile of CSW with the data hierarchies exposed to clients.  Because the goal, and more importantly the resources, of our project is to make data cataloged in THREDDS searchable through the CSW that GMU has already implemented, we didn't plan to implement a new CSW server that can fully expose the THREDDS catalog, including its hierarchies.  Our focus has been on mapping the THREDDS metadata items to the GMU CSW so that CSW clients can search THREDDS data, through the mapping relationship, in the CSW server.  This means that we are focused on searching data based on client-provided criteria rather than on displaying the data products hierarchically in the CSW.

We need to include in our logical data model the parent and child relationships between a data product, either a collection or a direct data set, and its parent (for any product other than the root, i.e., the highest collection) and its children (for any product other than a leaf, i.e., a direct data set).  With these relationships we will be able to know the hierarchy of the catalog.  To expose the hierarchy to a client, the server needs to know down to which level the client expects to view.  It seems that in THREDDS the server always exposes the immediate child nodes of a specific node, and the client can then click on one of the child nodes to see the next level.  This can continue recursively until the leaf nodes, the direct data sets, are exposed.

In CSW, our current implementation searches all direct data sets and returns the subset that meets the search criteria.  With the parent/child relationships added, our CSW can also be implemented to search either the immediate children of a certain node or n levels down from it.  However, additional information (parameters/values) will be needed so that a client can tell the server the depth of the search and the node from which the search begins.  We need to investigate whether such additional parameters/values are compliant with the ISO 19115 profile (it should not be a problem in ebRIM because it is flexibly extensible).  19115 does provide a description of metadata hierarchical levels, but this parameter/value information for specifying levels of exposure is a different issue.
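
As a rough illustration of the parent/child model and the depth-limited search described above, here is a minimal Python sketch. The names Node, descendants, search and the max_depth parameter are hypothetical; they are not part of CSW, ISO 19115, ebRIM, or the GMU implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional


@dataclass
class Node:
    """A catalog entry: a collection (has children) or a direct data set (leaf)."""
    name: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

    def add(self, child: "Node") -> "Node":
        """Attach a child and record the back-reference to its parent."""
        child.parent = self
        self.children.append(child)
        return child


def descendants(start: Node, max_depth: Optional[int] = None) -> Iterator[Node]:
    """Yield nodes below `start`, at most `max_depth` levels down (None = no limit)."""
    stack = [(child, 1) for child in reversed(start.children)]
    while stack:
        node, depth = stack.pop()
        yield node
        if max_depth is None or depth < max_depth:
            stack.extend((c, depth + 1) for c in reversed(node.children))


def search(start: Node,
           predicate: Callable[[Node], bool],
           max_depth: Optional[int] = None) -> List[str]:
    """Return names of matching nodes, searching from `start` down to `max_depth`."""
    return [n.name for n in descendants(start, max_depth) if predicate(n)]


def build_example() -> Node:
    """Small made-up hierarchy used only for illustration."""
    root = Node("root")
    models = root.add(Node("models"))
    gfs = models.add(Node("gfs"))
    gfs.add(Node("run1"))
    gfs.add(Node("run2"))
    radar = root.add(Node("radar"))
    radar.add(Node("scan1"))
    return root
```

With max_depth=1 the search covers only the immediate children, which corresponds to the THREDDS browsing behaviour; with max_depth=None it reaches every direct data set, which corresponds to the current flat search.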
 
In summary, I agree that the data collection/hierarchy information is useful in many ways.  If time and resources permit, we'll certainly explore this.  At this moment, we want to focus on our original plan to complete the mapping of THREDDS to 19115 at the data set level.


Regards.

Wenli

At 19:06 2006-11-19 -0700, Ben Domenico wrote:
Hello again,

I think I may need to clarify the situation.  The current primary focus of the CSW/THREDDS gateway that GMU is working on is mapping THREDDS catalog metadata to ISO 19115 so that THREDDS metadata can be made available in an international standard form.  It would be a mistake to divert that project from its high priority objectives.

On the other hand, the question of how to provide inventory catalogs of "collections" of datasets, and catalogs of those collections -- as THREDDS does -- keeps coming up in many different settings.  It arose in the OGC GALEON interoperability experiment; it came up in discussions at the 3rd Interoperability Workshop on the Automated Harvesting of Data and Metadata last week.

So I sent the message to the THREDDS and GALEON email lists in order to get a wider group thinking about the issue, which I think is key to making all these data services work together.  For those of you who are not familiar with THREDDS catalogs, an example of a hierarchical set of catalogs is available for a variety of real-time data at:

  http://motherlode.ucar.edu:8080/thredds/catalog.html

As you will note as you drill down through the collections, you can get the underlying XML representation of any of these catalogs by replacing the .html with .xml in the URL.
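
The .html to .xml substitution can be sketched in Python as follows. The function name catalog_xml_url is hypothetical, and the sketch assumes the catalog path ends in ".html" as in the example URL above; it only rewrites the URL and does not contact any server.

```python
from urllib.parse import urlsplit, urlunsplit


def catalog_xml_url(html_url: str) -> str:
    """Swap a trailing .html in the URL path for .xml, leaving the rest untouched."""
    parts = urlsplit(html_url)
    if not parts.path.endswith(".html"):
        raise ValueError(f"expected a path ending in .html: {html_url}")
    xml_path = parts.path[: -len(".html")] + ".xml"
    return urlunsplit(parts._replace(path=xml_path))
```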

From Ron Lake's notes, it sounds like CSW.ebRIM can be used to provide this type of functionality via a standards-based interface.
It's important though that, while we consider the long range goals, we also retain realistic expectations of the current project.

I hope this clarifies rather than confuses the issue.

-- Ben

On 11/18/06, Ron Lake <address@hidden > wrote:

Hi,

 

When this group says CSW, I assume you mean CSW.ebRIM?


Ron

 

From: address@hidden [mailto: address@hidden] On Behalf Of Ben Domenico
Sent: November 18, 2006 1:44 PM
To: Wenli Yang
Cc: Yonsook Enloe; Liping Di; address@hidden; THREDDS community; Unidata GALEON; John Helly
Subject: Re: CSW, THREDDS, GALEON 2

 

Wenli,

 

This issue of "granularity" or hierarchies or collections or groupings of datasets that are alike in some way was one of the issues confronted early in the THREDDS project.  As a result, I believe we have an approach that works reasonably well in the THREDDS Data Server package.   The issue continues to arise in most discussions of data and metadata collections and services.  In fact it was one of the issues discussed at the 3rd Metadata Interoperability Conference I attended last week.  It will be important to confront it in the context of OGC and ISO standards.  The disadvantage of doing it in the WCS context is that one can envision collections that might include Coverages, Features, and Sensor Observations: for example, a collection of all the data related to a specific event such as a severe storm, a flood, or a hurricane.  One can create THREDDS catalogs for such "case studies."   But it would be good to eventually have a standards-based interface for such collections.  Perhaps the OGC CSW is not well suited to this sort of use at present.  If so, it may be useful to consider suggesting augmentations to CSW.  I believe there is a big advantage in that we already have a working system.

 

I plan to send a copy of this to the THREDDS and GALEON groups as well as to John Helly who convened the Interoperability Workshop last week.

 

Thanks for your careful description of the issues in terms of THREDDS catalogs and OGC CSW.

 

-- Ben

 

On 11/15/06, Wenli Yang <address@hidden> wrote:

Ben,

THREDDS deals with service/data hierarchy nicely.  However, I think that CSW does not provide guidance or a standard on how hierarchical services/data should be presented.  When mapping a THREDDS catalog into our CSW, we can track and record the hierarchical relationships among data/catalogReferences and among different levels of catalog references in our database.  We haven't fully investigated how such relationships can be presented to a CSW client (or how a CSW client can request such relationships).  This is certainly a very useful piece of information and deserves further discussion.
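
One way the tracking of hierarchical relationships might look is sketched below in Python. It walks a THREDDS catalog document and records (parent, child) pairs for nested dataset and catalogRef elements. The sample catalog, the function name parent_child_pairs, and the choice of labels are made up for illustration, and the InvCatalog v1.0 namespace is assumed; this is not the GMU database schema.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Assumed namespaces: THREDDS InvCatalog v1.0 and XLink.
THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
XLINK_NS = "http://www.w3.org/1999/xlink"

DATASET = f"{{{THREDDS_NS}}}dataset"
CATALOG_REF = f"{{{THREDDS_NS}}}catalogRef"

# Illustrative catalog only; not taken from a real server.
SAMPLE_CATALOG = f"""<catalog xmlns="{THREDDS_NS}" xmlns:xlink="{XLINK_NS}" name="Example root">
  <dataset name="Models">
    <catalogRef xlink:href="gfs/catalog.xml" xlink:title="GFS" name=""/>
    <dataset name="NAM run" urlPath="nam/run1.nc"/>
  </dataset>
</catalog>"""


def _label(elem: ET.Element) -> str:
    """Use xlink:title for catalogRef elements and the name attribute otherwise."""
    if elem.tag == CATALOG_REF:
        return elem.get(f"{{{XLINK_NS}}}title", "")
    return elem.get("name", "")


def parent_child_pairs(catalog_xml: str) -> List[Tuple[str, str]]:
    """Return (parent label, child label) pairs in document order."""
    pairs: List[Tuple[str, str]] = []

    def walk(parent: ET.Element) -> None:
        for child in parent:
            if child.tag in (DATASET, CATALOG_REF):
                pairs.append((_label(parent), _label(child)))
                walk(child)

    walk(ET.fromstring(catalog_xml))
    return pairs
```

Each pair could then be stored as a row in a relationship table; following a catalogRef to the referenced catalog would be a further step that this sketch leaves out.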

I have not carefully read the WCS hierarchical description part, which was primarily provided by Luc.  I think that the primary intention of using a hierarchical description in the WCS capabilities was not to let a client actually retrieve the hierarchy information but to reduce the duplication of metadata in, and thus the size of, the capabilities document.  Initially, it was hoped that the hierarchical information would allow a client to retrieve a collection of data sets (coverages) from a higher node in the hierarchy, but it was decided that this would not be specified.  Of course, a specific server implementation can still provide such a capability by declaring a collection of coverages as one single virtual coverage.  For example, a THREDDS service reference containing a time series collection of data sets (individual coverages) for a specific location can be declared as one coverage with a time span covering all the data sets.   In addition, each of the data sets in the collection can, if needed, also be separately declared as a coverage whose time range is a single point in time (or a smaller time range than that of the collection).
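
The time span of such a virtual coverage can be derived from its member coverages, as in this small Python sketch. The function name collection_time_span and the (start, end) tuple representation are assumptions for illustration; a point-in-time coverage is represented with start equal to end.

```python
from datetime import datetime
from typing import Iterable, Tuple

TimeRange = Tuple[datetime, datetime]  # (start, end); start == end for a point in time


def collection_time_span(member_ranges: Iterable[TimeRange]) -> TimeRange:
    """Return the smallest time range that covers every member coverage."""
    ranges = list(member_ranges)
    if not ranges:
        raise ValueError("a collection needs at least one member coverage")
    start = min(r[0] for r in ranges)
    end = max(r[1] for r in ranges)
    return start, end
```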

Wenli

At 19:04 2006-11-12 -0500, Yonsook Enloe wrote:


Ben,

 

This is an important topic.  Let's discuss sometime.  The next access-geoscience telecon is scheduled for Dec 20th.    We could schedule one earlier just to discuss thoughts and ideas on this... What do you think?


 

Yonsook

 

 

 

-----Original Message-----
From: Ben Domenico [ mailto:address@hidden]
Sent: Tuesday, November 07, 2006 11:19 AM
To: Liping Di; Wenli Yang; Yonsook Enloe
Subject: CSW, THREDDS, GALEON 2

 

Hi all,

You are probably already aware that I think the CSW interface to THREDDS catalogs is a key element of GALEON Phase 2.  Our experience in Phase 1 indicated that -- at least for the WCS installations used for that interoperability experiment -- the WCS GetCapabilities request was inadequate to provide the information available in the hierarchical THREDDS catalogs at sites such as:

   http://motherlode.ucar.edu:8080/thredds/idd/models.html
   http://lead4.unidata.ucar.edu:8080/thredds/catalog/
   http://nomads.ncdc.noaa.gov:8085/thredds/catalog/

I want to introduce these issues to the GALEON team, but I am very much interested in your thoughts on whether and how the ACCESS CSW/THREDDS work should fit into the GALEON Phase 2 initiative.  Please give me your input on this topic.

I have been holding off on moving forward with Phase 2 until the WCS 1.1 specification is adopted near the end of the year.  But perhaps we could keep the GALEON embers burning in the meantime with a discussion of CSW issues.

I am convinced this is among the most important areas for standards evolution.  Please let me know what you think.

Thanks in advance.

-- Ben