THREDDS API Question
Nathan Potter
ndp at opendap.org
Mon Jun 18 14:43:49 MDT 2007
On Jun 14, 2007, at 12:12 PM, Ethan Davis wrote:
> Hi Nathan,
>
> Our development code now has a DataRootHandler.ConfigListener
> interface that once registered with a DataRootHandler instance will
> be notified of various configuration events (start, end, catalog,
> dataset). You can register an instance of an implementation of the
> ConfigListener by using the DataRootHandler.registerConfigListener
> (ConfigListener) method.
Ethan,
I am having a few issues with this new version:
- Please make the method DataRootHandler.registerConfigListener()
public. (It's currently restricted to package members.)
- DataRootHandler.reinit() has lost its public visibility. Is that
intentional? If so, what should I be doing instead? If not, please
make it public again!
- DataRootHandler depends on the class ucar.unidata.util.DateUtil,
which is not in my current THREDDS lib (netcdf-2.2.18.jar). Is there
an update for that too?
Thanks,
N
>
> You should be able to use this to keep track of whatever datasets
> from the config catalogs you want. There are of course some
> concurrency issues if reconfiguration might happen at the same
> time requests are being handled that involve accessing the
> information you are keeping.
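
One way to handle the concurrency concern above is to build a fresh index
during each (re)configuration pass and publish it atomically when the pass
ends, so request threads never see a half-built map. The following is only
a sketch under stated assumptions: ConfigEvents is a hypothetical stand-in
for DataRootHandler.ConfigListener (the thread does not give its method
signatures), and each dataset is reduced to a urlPath/name pair of strings.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for DataRootHandler.ConfigListener; the real
// method names and parameter types may differ.
interface ConfigEvents {
    void configStart();
    void configDataset(String urlPath, String datasetName);
    void configEnd();
}

// Collects datasets during a configuration pass and publishes them all at once.
class DatasetIndex implements ConfigEvents {
    // Snapshot read by request threads; replaced wholesale, never mutated.
    private final AtomicReference<Map<String, String>> published =
            new AtomicReference<Map<String, String>>(
                    Collections.<String, String>emptyMap());

    // Scratch map touched only by the configuration thread.
    private Map<String, String> building;

    @Override
    public void configStart() {
        building = new HashMap<String, String>();
    }

    @Override
    public void configDataset(String urlPath, String datasetName) {
        if (building != null && urlPath != null) {
            building.put(urlPath, datasetName);
        }
    }

    @Override
    public void configEnd() {
        if (building != null) {
            // Readers switch from the old snapshot to the new one in one step.
            published.set(Collections.unmodifiableMap(building));
            building = null;
        }
    }

    // Safe to call from request threads; returns null if the path is unknown.
    public String lookup(String urlPath) {
        return published.get().get(urlPath);
    }
}
```

Until configEnd() is called, lookups keep answering from the previous
snapshot, which is the point of the design: a reconfiguration in progress
is invisible to requests.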
>
> This will be included in our next TDS stable release. I'll put
> together a TSF-only release for you shortly.
>
> Ethan
>
> Nathan Potter wrote:
>>
>>
>>
>> Ethan et al.,
>>
>> After talking with Ethan on the phone today I think I can state
>> the issue more clearly:
>>
>> The current THREDDS Servlet Framework (TSF) does not allow the
>> collection/dataset information to be retrieved via the request URL.
>>
>> The API method DataRootHandler.getCatalog(java.lang.String path,
>> java.net.URI baseURI) expects the "path" parameter to be the path
>> in the THREDDS catalog to the catalog file. There is no
>> restriction on the file name of the catalog file. The path in the
>> THREDDS catalog to the file may be different than the access URL.
>>
>> What this means is that when a servlet receives an access request,
>> even one that comes from a valid access link in a THREDDS catalog
>> (.html), the servlet only knows about the request URL, nothing
>> more. If the servlet needs to get the THREDDS dataset/collection
>> information (and associated metadata if any) then it has no
>> recourse but to attempt to search the catalog from the highest
>> level looking for a dataset with a matching "urlPath" attribute.
>> This activity may fail if:
>>
>> - The THREDDS catalog employs <catalogRef> elements.
>>
>> - The "urlPath" is not unique within the catalog.
>>
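
The two failure cases listed above can be illustrated with a small
top-down search. This is a sketch only: CatalogNode and UrlPathSearch are
hypothetical, simplified types invented for illustration, not the
thredds.catalog classes, and an unresolved <catalogRef> is modeled as a
node whose subtree the search cannot enter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified catalog node; not part of the THREDDS API.
class CatalogNode {
    final String name;
    final String urlPath;        // null for pure collections
    final boolean unresolvedRef; // true for a <catalogRef> that was not followed
    final List<CatalogNode> children = new ArrayList<CatalogNode>();

    CatalogNode(String name, String urlPath, boolean unresolvedRef) {
        this.name = name;
        this.urlPath = urlPath;
        this.unresolvedRef = unresolvedRef;
    }

    CatalogNode add(CatalogNode child) {
        children.add(child);
        return this;
    }
}

class UrlPathSearch {
    // Collects every node whose urlPath equals the target. Subtrees behind
    // unresolved catalogRefs are skipped, so the result can be incomplete.
    static List<CatalogNode> findByUrlPath(CatalogNode root, String target) {
        List<CatalogNode> hits = new ArrayList<CatalogNode>();
        collect(root, target, hits);
        return hits;
    }

    private static void collect(CatalogNode node, String target,
                                List<CatalogNode> hits) {
        if (target.equals(node.urlPath)) {
            hits.add(node);
        }
        if (node.unresolvedRef) {
            return; // the referenced catalog lives in another file
        }
        for (CatalogNode child : node.children) {
            collect(child, target, hits);
        }
    }
}
```

A result list with more than one entry corresponds to the non-unique
"urlPath" case; an empty list for a dataset that sits behind a catalogRef
corresponds to the first case.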
>>
>> I think that the TSF API should be augmented with accessor methods
>> that allow an InvDataset or an InvCatalog to be retrieved from the
>> DataRootHandler based on information that a servlet has access to
>> at run time, i.e. data that can be retrieved from the
>> HttpServletRequest object.
>>
>>
>>
>> Nathan
>>
>>
>>
>>
>>
>> On Jun 4, 2007, at 5:00 PM, Nathan Potter wrote:
>>
>>>
>>> On Jun 4, 2007, at 1:05 PM, Ethan Davis wrote:
>>>
>>>> Hi Nathan,
>>>>
>>>> Can you explain the context for these questions? Is this on the
>>>> server side (in Hyrax)?
>>>
>>>
>>> Yes, server side.
>>>
>>>
>>>>
>>>> Nathan Potter wrote:
>>>>> Greetings,
>>>>>
>>>>> So I am using the THREDDS API in an attempt to get the
>>>>> <property> elements for a dataset. I've run into a couple of
>>>>> (possibly related) problems.
>>>>
>>>> Just to clarify our terminology. When you say "THREDDS API" you
>>>> mean both the thredds.catalog and thredds.servlet packages? I
>>>> generally split those apart and call the thredds.catalog package
>>>> the "THREDDS Catalog API" and call the thredds.servlet package
>>>> the "THREDDS Servlet Framework" (TSF).
>>>>
>>>> [Note: the TSF is probably only useful for those writing servers.]
>>>
>>>
>>> I wasn't distinguishing. But since DataRootHandler is in the TSF
>>> then that is where I am suggesting an API change.
>>>
>>>
>>>
>>>
>>>>
>>>>> ** 1) I can't get the dataset information without searching.
>>>>>
>>>>> In the HttpServletRequest I have the URL for the dataset, say:
>>>>>
>>>>> http://localhost:8080/opendap/wcs/MODIS/Grid/test.hdf.html
>>>>
>>>> Is this URL for an OPeNDAP HTML response?
>>>
>>>
>>> Right, but the requested response isn't really meaningful in this
>>> discussion since all I am really after is the THREDDS dataset
>>> information for the atom/leaf/dataset test.hdf
>>>
>>>
>>>>
>>>> Are you trying to get the property from the THREDDS catalog so
>>>> you can use it in the OPeNDAP response?
>>>
>>> Well... In truth it's much more complex than that, but since I
>>> will have to do that too we can roll with that vision for the
>>> moment.
>>>
>>>
>>>
>>>>
>>>>> In order for me to get THREDDS to divulge the <property>
>>>>> elements for the dataset I have to:
>>>>>
>>>>> - Take the dataset name "wcs/MODIS/Grid/test.hdf.html" and
>>>>> backtrack to the collection name, "wcs/MODIS/Grid/".
>>>>> - Ask the DataRootHandler for the InvCatalog for "wcs/MODIS/Grid/"
>>>>> - Ask the InvCatalog for the InvDataset for "wcs/MODIS/Grid/"
>>>>> - Search the child datasets of the "wcs/MODIS/Grid/" InvDataset
>>>>> for the one whose name (lexically) matches
>>>>> "wcs/MODIS/Grid/test.hdf.set"
>>>>> - Read the properties of that InvDataset
>>>>>
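
The first step in the list above (backtracking from the requested path to
the collection name) is plain string handling. A minimal sketch, assuming
the servlet has already reduced the request URL to a relative path such as
"wcs/MODIS/Grid/test.hdf.html"; the DatasetPaths class and its method
names are hypothetical.

```java
// Hypothetical helper; not part of the THREDDS or OPeNDAP APIs.
class DatasetPaths {
    // Strips a trailing response suffix such as ".html" if it is present.
    static String stripSuffix(String requestPath, String suffix) {
        return requestPath.endsWith(suffix)
                ? requestPath.substring(0, requestPath.length() - suffix.length())
                : requestPath;
    }

    // Returns the enclosing collection path including its trailing slash,
    // or the empty string if the path contains no slash.
    static String collectionOf(String datasetPath) {
        int slash = datasetPath.lastIndexOf('/');
        return slash < 0 ? "" : datasetPath.substring(0, slash + 1);
    }
}
```

The remaining steps (fetching the catalog and the dataset) depend on the
THREDDS classes themselves and are not sketched here.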
>>>>> That seems awfully complex. (Of course there may be a more
>>>>> straightforward way that I am not aware of.)
>>>>
>>>> That is about as simple as it gets. Though I would suggest you
>>>> make sure the THREDDS configuration (TSF) knows about this
>>>> dataset first by getting the CrawlableDataset that matches the
>>>> dataset URL:
>>>> DataRootHandler.getCrawlableDataset("wcs/MODIS/Grid/test.hdf")
>>>> // I dropped off the trailing ".html", assuming it was the
>>>> // OPeNDAP dataset URL extension
>>>
>>>
>>> When I tried this I could only get CrawlableDataset objects for
>>> catalogs that were part of a <datasetScan>.
>>>
>>>
>>>
>>>>
>>>> Are you using InvDataset.findDatasetByName( String name) to find
>>>> the child dataset?
>>>
>>> No.
>>>
>>>>
>>>> Also, depending on how you set up your dataset IDs, you could ask
>>>> the catalog to find the dataset by ID, like
>>>>
>>>> cat.findDatasetByID( "wcs/MODIS/Grid/test.hdf")
>>>
>>> Ahhh... I just tried that and it works. So, that greatly
>>> simplifies that step, thanks!
>>>
>>>
>>>
>>>>
>>>>
>>>>> ** 2) When I ask for a catalog I have to know the name of the
>>>>> XML file in which it resides.
>>>>>
>>>>> In the above example, when I ask the DataRootHandler for the
>>>>> InvCatalog I ask for "wcs/MODIS/Grid/catalog.xml", which is
>>>>> all well and good if all of the catalogs are stored in files
>>>>> called catalog.xml. Essentially this means that anyone
>>>>> configuring a THREDDS catalog has to create a hierarchy of
>>>>> directories that mimics the organization of the collections,
>>>>> and all of the THREDDS information must be stored in files
>>>>> called "catalog.xml".
>>>>
>>>> Why do you need to create this hierarchy of directories
>>>> mimicking the data collection hierarchy? The TSF should keep
>>>> track of your config catalogs and the automatically generated
>>>> catalogs.
>>>
>>> Right, but if all of the THREDDS catalog files have the name
>>> "catalog.xml" they can't all be in the same directory, so they
>>> have to live in some kind of directory hierarchy - I just figured
>>> it made sense to mimic the collection organization, but that's
>>> not necessary.
>>>
>>>
>>>
>>>>
>>>>> THREDDS does not actually require this - I can make a complex
>>>>> hierarchy of collections by using either a single (complex) top
>>>>> level catalog.xml file, or a collection of XML files in a
>>>>> single directory that employ <catalogRef> elements to create
>>>>> their organizations.
>>>>> However the API breaks down in both cases.
>>>>>
>>>>> If the catalog is composed of a collection of XML files in a
>>>>> single directory that employ <catalogRef> elements to create
>>>>> their organizations, then in order to retrieve catalog
>>>>> information I would have to KNOW how the information was
>>>>> organized (file names, directory hierarchy, etc.). But I don't
>>>>> know, since the catalog may be created by a user after compile
>>>>> time (although THREDDS does know this, since it parsed all of
>>>>> the catalog information at start up), and I shouldn't have to
>>>>> know. For me to know would require that I parse the top level
>>>>> catalog.xml file and build the XML doc tree myself, at which
>>>>> point I can get the elusive <property> elements from the XML
>>>>> doc in memory.
>>>>>
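
For reference, the "parse it myself" fallback described above might look
roughly like the following, using only the JDK's DOM parser. This is a
sketch under assumptions: CatalogPropertyReader is a hypothetical helper,
the dataset is matched on its ID attribute, only <property> elements that
are direct children of that <dataset> are read, and <catalogRef> links and
inherited metadata are not followed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper; bypasses the THREDDS catalog classes entirely.
class CatalogPropertyReader {
    // Returns name -> value for the <property> children of the first
    // <dataset> whose ID attribute equals datasetId; empty map if none.
    static Map<String, String> propertiesOf(String catalogXml, String datasetId) {
        Map<String, String> props = new LinkedHashMap<String, String>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(
                    catalogXml.getBytes(StandardCharsets.UTF_8)));

            // "*" matches <dataset> elements in any namespace.
            NodeList datasets = doc.getElementsByTagNameNS("*", "dataset");
            for (int i = 0; i < datasets.getLength(); i++) {
                Element dataset = (Element) datasets.item(i);
                if (!datasetId.equals(dataset.getAttribute("ID"))) {
                    continue;
                }
                // Walk direct children only, so nested datasets are ignored.
                for (Node n = dataset.getFirstChild(); n != null; n = n.getNextSibling()) {
                    if (n instanceof Element && "property".equals(n.getLocalName())) {
                        Element p = (Element) n;
                        props.put(p.getAttribute("name"), p.getAttribute("value"));
                    }
                }
                break;
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse catalog XML", e);
        }
        return props;
    }
}
```

This duplicates work the server has already done at start up, which is
the objection being raised in the paragraph above.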
>>>>> If the catalog is composed of a single (complex) top level
>>>>> catalog.xml file then I would have to know that and just ask
>>>>> for the top level catalog.
>>>>>
>>>>> (Searching the entire catalog from the top down for my dataset
>>>>> doesn't seem to work either...)
>>>>
>>>> I'm sorry, I'm having a hard time following here. What are you
>>>> trying to do and why?
>>>
>>> For any request that is looking for one of the OPeNDAP data
>>> responses I need to search the THREDDS catalog for the dataset,
>>> and if found, I need to extract any metadata that may be in the
>>> catalog for that dataset.
>>>
>>>
>>>>
>>>> Is the problem that you may not know if the dataset is contained
>>>> in a catalog generated because of a datasetScan element or
>>>> contained directly in one of the THREDDS config catalogs?
>>>
>>> I think that's a separate issue.
>>>
>>>
>>>>
>>>>> All of these methods of writing and organizing catalogs are
>>>>> legitimate in THREDDS, and users writing THREDDS catalogs would
>>>>> likely employ one or more of these methods when writing their
>>>>> catalogs.
>>>>>
>>>>>
>>>>> I propose that the THREDDS API be extended so that one can
>>>>> simply ask the DataRootHandler for an InvDataset or an
>>>>> InvCatalog. Like:
>>>>>
>>>>> InvDataset id = drh.getDataSet("wcs/MODIS/foo.nc");
>>>>> InvCatalog id = drh.getCatalog("wcs/MODIS/");
>>>>>
>>>>> or possibly the InvDataset that represents a collection:
>>>>>
>>>>> InvDataset id = drh.getDataSet("wcs/MODIS/");
>>>>>
>>>>>
>>>>> If the DataRootHandler doesn't have it, return null.
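
The contract being proposed (look up by path, get null when the path is
unknown) can be shown with a toy registry. This is a sketch only, not the
DataRootHandler implementation: PathLookup is hypothetical, and plain
strings stand in for InvDataset and InvCatalog objects.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the proposed lookups; values are plain
// strings here rather than InvDataset / InvCatalog instances.
class PathLookup {
    private final Map<String, String> datasetsByPath = new HashMap<String, String>();
    private final Map<String, String> catalogsByPath = new HashMap<String, String>();

    void addDataset(String path, String dataset) {
        datasetsByPath.put(path, dataset);
    }

    void addCatalog(String path, String catalog) {
        catalogsByPath.put(path, catalog);
    }

    // Mirrors the proposed drh.getDataSet(path): null when the path is unknown.
    String getDataSet(String path) {
        return datasetsByPath.get(path);
    }

    // Mirrors the proposed drh.getCatalog(path): null when the path is unknown.
    String getCatalog(String path) {
        return catalogsByPath.get(path);
    }
}
```

A collection path such as "wcs/MODIS/" can be registered in both maps,
which covers the case where a collection is also represented as a dataset.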
>>>>>
>>>>>
>>>>> Is that unreasonable?
>>>>
>>>> I'll have to take a closer look at this.
>>>>
>>>> Ethan
>>>>
>>>>>
>>>>> Nathan
>>>>>
>>>>>
>>>>> =
>>>>> Nathan Potter ndp at opendap.org
>>>>> OPeNDAP, Inc. 541.752.1852
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ==============================================================================
>>>>> To unsubscribe thredds, visit:
>>>>> http://www.unidata.ucar.edu/mailing-list-delete-form.html
>>>>> ==============================================================================
>>>>
>>>> --
>>>> Ethan R. Davis                 Telephone: (303) 497-8155
>>>> Software Engineer              Fax:       (303) 497-8690
>>>> UCAR Unidata Program Center    E-mail:    edavis at ucar.edu
>>>> P.O. Box 3000
>>>> Boulder, CO 80307-3000         http://www.unidata.ucar.edu/
>>>> ---------------------------------------------------------------------------
>>>>
>>>>
>>>
>>> =
>>> Nathan Potter ndp at opendap.org
>>> OPeNDAP, Inc. 541.752.1852
>>>
>>>
>>
>> =
>> Nathan Potter ndp at opendap.org
>> OPeNDAP, Inc. 541.752.1852
>>
>>
>
>
>
=
Nathan Potter ndp at opendap.org
OPeNDAP, Inc. 541.752.1852