THREDDS API Question

Nathan Potter ndp at opendap.org
Mon Jun 18 14:43:49 MDT 2007


On Jun 14, 2007, at 12:12 PM, Ethan Davis wrote:

> Hi Nathan,
>
> Our development code now has a DataRootHandler.ConfigListener  
> interface that once registered with a DataRootHandler instance will  
> be notified of various configuration events (start, end, catalog,  
> dataset). You can register an instance of an implementation of the  
> ConfigListener by using the
> DataRootHandler.registerConfigListener(ConfigListener) method.

Ethan,

I am having a few issues with this new version:


- Please make the method DataRootHandler.registerConfigListener()
public. (It is currently package-private.)

- DataRootHandler.reinit() has lost its public visibility. Is that
intentional? If so, what should I be doing instead? If not, please
make it public again!

- DataRootHandler depends on the class ucar.unidata.util.DateUtil,
which is not in my current THREDDS lib (netcdf-2.2.18.jar). Is there
an update for that too?




Thanks,

N





>
> You should be able to use this to keep track of whatever datasets  
> from the config catalogs you want. There are of course some  
> concurrency issues if reconfiguration might happen at the same
> time requests are being handled that involve accessing the  
> information you are keeping.
>
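One way to handle the concurrency concern above is to build the index off to the side while configuration events arrive and publish it in a single step when configuration ends. The following is a minimal sketch in plain Java, not the TSF API: ConfigEvents is a hypothetical stand-in for DataRootHandler.ConfigListener (whose actual method names are not given in this thread), and a dataset is reduced to a urlPath string mapped to a name.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for DataRootHandler.ConfigListener; the real
// interface's method names and signatures are not shown in this thread.
interface ConfigEvents {
    void configStart();
    void configDataset(String urlPath, String datasetName);
    void configEnd();
}

// Builds a urlPath -> dataset name index during (re)configuration and
// publishes it atomically, so request threads never see a half-built map.
class DatasetIndex implements ConfigEvents {
    private final AtomicReference<Map<String, String>> published =
            new AtomicReference<>(Collections.<String, String>emptyMap());
    private Map<String, String> building; // touched only by the config thread

    @Override
    public void configStart() {
        building = new HashMap<>();
    }

    @Override
    public void configDataset(String urlPath, String datasetName) {
        building.put(urlPath, datasetName);
    }

    @Override
    public void configEnd() {
        // Single atomic swap; the old index stays readable until this point.
        published.set(Collections.unmodifiableMap(building));
        building = null;
    }

    // Safe to call from request threads at any time; returns null if unknown.
    String lookup(String urlPath) {
        return published.get().get(urlPath);
    }
}
```

With this arrangement a request that arrives during reconfiguration sees the previous index in full rather than a partially populated one. Whether stale answers are acceptable during that window is a separate design decision.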
> This will be included in our next TDS stable release. I'll put
> together a TSF-only release for you shortly.
>
> Ethan
>
> Nathan Potter wrote:
>>
>>
>>
>> Ethan et al.,
>>
>> After talking with Ethan on the phone today I think I can state  
>> the issue more clearly:
>>
>> The current THREDDS Servlet Framework (TSF) does not allow the  
>> collection/dataset information to be retrieved via the request URL.
>>
>> The API method DataRootHandler.getCatalog(java.lang.String path,  
>> java.net.URI baseURI) expects the "path" parameter to be the path  
>> in the THREDDS catalog to the catalog file. There is no  
>> restriction on the file name of the catalog file. The path in the  
>> THREDDS catalog to the file may be different than the access URL.
>>
>> What this means is that when a servlet receives an access request,  
>> even one that comes from a valid access link in a THREDDS catalog 
>> (.html), the servlet only knows about the request URL, nothing  
>> more. If the servlet needs to get the THREDDS dataset/collection  
>> information (and associated metadata if any) then it has no  
>> recourse but to attempt to search the catalog from the highest  
>> level looking for a dataset with a matching "urlPath" attribute.  
>> This activity may fail if:
>>
>> - The THREDDS catalog employs <catalogRef> elements.
>>
>> - The "urlPath" is not unique within the catalog.
>>
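The two failure cases above can be illustrated with a small sketch. CatalogNode below is a hypothetical, simplified stand-in for a catalog dataset node (it is not thredds.catalog.InvDataset), and the sketch assumes that a search does not follow a catalogRef into a catalog file that has not been loaded.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a catalog dataset node. It is not
// thredds.catalog.InvDataset and models only what this sketch needs.
class CatalogNode {
    final String urlPath;        // null for a pure collection
    final boolean isCatalogRef;  // true if the children live in another catalog file
    final List<CatalogNode> children = new ArrayList<>();

    CatalogNode(String urlPath, boolean isCatalogRef) {
        this.urlPath = urlPath;
        this.isCatalogRef = isCatalogRef;
    }

    CatalogNode add(CatalogNode child) {
        children.add(child);
        return this;
    }
}

class UrlPathSearch {
    // Collects every node reachable from root whose urlPath equals the target.
    // Zero hits means "not found"; more than one hit means the match is ambiguous.
    static List<CatalogNode> findAll(CatalogNode root, String urlPath) {
        List<CatalogNode> hits = new ArrayList<>();
        collect(root, urlPath, hits);
        return hits;
    }

    private static void collect(CatalogNode node, String urlPath, List<CatalogNode> hits) {
        if (urlPath.equals(node.urlPath)) {
            hits.add(node);
        }
        if (node.isCatalogRef) {
            return; // assumption: the referenced catalog is not loaded, so stop here
        }
        for (CatalogNode child : node.children) {
            collect(child, urlPath, hits);
        }
    }
}
```

A caller can only trust the result when exactly one node comes back, which is why a top-down search on "urlPath" alone is fragile.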
>>
>> I think that the TSF API should be augmented with accessor methods
>> that allow the DataRootHandler to return an InvDataset or an
>> InvCatalog based on information that a servlet has access to
>> at run time, i.e. data that can be retrieved from the
>> HttpServletRequest object.
>>
>>
>>
>> Nathan
>>
>>
>>
>>
>>
>> On Jun 4, 2007, at 5:00 PM, Nathan Potter wrote:
>>
>>>
>>> On Jun 4, 2007, at 1:05 PM, Ethan Davis wrote:
>>>
>>>> Hi Nathan,
>>>>
>>>> Can you explain the context for these questions? This is on the
>>>> server side (in Hyrax)?
>>>
>>>
>>> Yes, server side.
>>>
>>>
>>>>
>>>> Nathan Potter wrote:
>>>>> Greetings,
>>>>>
>>>>> So I am using the THREDDS API in an attempt to get the  
>>>>> <property> elements for a dataset. I've run into a couple of  
>>>>> (possibly related) problems.
>>>>
>>>> Just to clarify our terminology. When you say "THREDDS API" you  
>>>> mean both the thredds.catalog and thredds.servlet packages? I  
>>>> generally split those apart and call the thredds.catalog package  
>>>> the "THREDDS Catalog API" and call the thredds.servlet package  
>>>> the "THREDDS Servlet Framework" (TSF).
>>>>
>>>> [Note: the TSF is probably only useful for those writing servers.]
>>>
>>>
>>> I wasn't distinguishing. But since DataRootHandler is in the TSF  
>>> then that is where I am suggesting an API change.
>>>
>>>
>>>
>>>
>>>>
>>>>> ** 1) I can't get the dataset information without searching.
>>>>>
>>>>> In the HttpServletRequest I have the URL for the dataset, say:
>>>>>
>>>>> http://localhost:8080/opendap/wcs/MODIS/Grid/test.hdf.html
>>>>
>>>> Is this URL for an OPeNDAP HTML response?
>>>
>>>
>>> Right, but the requested response isn't really meaningful in this  
>>> discussion since all I am really after is the THREDDS dataset  
>>> information for the atom/leaf/dataset test.hdf
>>>
>>>
>>>>
>>>> Are you trying to get the property from the THREDDS catalog so  
>>>> you can use it in the OPeNDAP response?
>>>
>>> Well... In truth it's much more complex than that, but since I  
>>> will have to do that too we can roll with that vision for the  
>>> moment.
>>>
>>>
>>>
>>>>
>>>>> In order for me to get THREDDS to divulge the <property>  
>>>>> elements for the dataset I have to:
>>>>>
>>>>> - Take the dataset name "wcs/MODIS/Grid/test.hdf.html" and
>>>>>   backtrack to the collection name, "wcs/MODIS/Grid/".
>>>>> - Ask the DataRootHandler for the InvCatalog for "wcs/MODIS/Grid/".
>>>>> - Ask the InvCatalog for the InvDataset for "wcs/MODIS/Grid/".
>>>>> - Search the child datasets of the "wcs/MODIS/Grid/" InvDataset
>>>>>   for the one whose name (lexically) matches
>>>>>   "wcs/MODIS/Grid/test.hdf.set".
>>>>> - Read the properties of that InvDataset.
>>>>>
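The first step in the list above, going from the request path to the dataset path and its collection, is plain string handling. The sketch below shows one way to do it; RequestPaths and its method names are hypothetical helpers, not part of the THREDDS API, and the response suffix is assumed to be known to the caller.

```java
// Hypothetical helper for splitting a request path; not part of the THREDDS API.
class RequestPaths {
    // Strips a known response suffix (e.g. ".html") from the request path.
    // The path is returned unchanged if it does not end with the suffix.
    static String datasetPath(String requestPath, String suffix) {
        return requestPath.endsWith(suffix)
                ? requestPath.substring(0, requestPath.length() - suffix.length())
                : requestPath;
    }

    // Returns the collection part of a dataset path, including the trailing
    // slash, or the empty string if the path contains no slash.
    static String collectionPath(String datasetPath) {
        int slash = datasetPath.lastIndexOf('/');
        return slash < 0 ? "" : datasetPath.substring(0, slash + 1);
    }
}
```

For the example request, datasetPath("wcs/MODIS/Grid/test.hdf.html", ".html") gives "wcs/MODIS/Grid/test.hdf", and collectionPath of that result gives "wcs/MODIS/Grid/".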
>>>>> That seems awfully complex. (Of course there may be a more
>>>>> straightforward way that I am not aware of.)
>>>>
>>>> That is about as simple as it gets. Though I would suggest you  
>>>> make sure the THREDDS configuration (TSF) knows about this  
>>>> dataset first by getting the CrawlableDataset that matches the  
>>>> dataset URL:
>>>>       DataRootHandler.getCrawlableDataset("wcs/MODIS/Grid/test.hdf")
>>>>       // I dropped off the trailing ".html" assuming it was the
>>>>       // OPeNDAP dataset URL extension
>>>
>>>
>>> When I tried this I could only get CrawlableDataset objects for  
>>> catalogs that were part of a <datasetScan>
>>>
>>>
>>>
>>>>
>>>> Are you using InvDataset.findDatasetByName( String name) to find  
>>>> the child dataset?
>>>
>>> No.
>>>
>>>>
>>>> Also, depending on how you set up your dataset IDs, you could ask
>>>> the catalog to find the dataset by ID, like
>>>>
>>>>       cat.findDatasetByID( "wcs/MODIS/Grid/test.hdf")
>>>
>>> Ahhh... I just tried that and it works. So, that greatly  
>>> simplifies that step, thanks!
>>>
>>>
>>>
>>>>
>>>>
>>>>> ** 2) When I ask for a catalog I have to know the name of the  
>>>>> XML file in which it resides.
>>>>>
>>>>> In the above example, when I ask the DataRootHandler for the
>>>>> InvCatalog I ask for "wcs/MODIS/Grid/catalog.xml", which is
>>>>> all well and good if all of the catalogs are stored in files
>>>>> called catalog.xml. Essentially this means that anyone
>>>>> configuring a THREDDS catalog has to create a hierarchy of
>>>>> directories that mimics the organization of the collections,
>>>>> and all of the THREDDS information must be stored in files
>>>>> called "catalog.xml".
>>>>
>>>> Why do you need to create this hierarchy of directories  
>>>> mimicking the data collection hierarchy? The TSF should keep  
>>>> track of your config catalogs and the automatically generated  
>>>> catalogs.
>>>
>>> Right, but if all of the THREDDS catalog files have the name  
>>> "catalog.xml" they can't all be in the same directory, so they  
>>> have to live in some kind of directory hierarchy - I just figured  
>>> it made sense to mimic the collection organization, but that's  
>>> not necessary.
>>>
>>>
>>>
>>>>
>>>>> THREDDS does not actually require this - I can make a complex  
>>>>> hierarchy of collections by using either a single (complex) top  
>>>>> level catalog.xml file, or a collection of XML files in a  
>>>>> single directory that employ <catalogRef> elements to create  
>>>>> their organizations.
>>>>> However the API breaks down in both cases.
>>>>>
>>>>> If the catalog is composed of a collection of XML files in a
>>>>> single directory that employ <catalogRef> elements to create
>>>>> their organizations, then in order to retrieve catalog
>>>>> information I would have to KNOW how the information was
>>>>> organized (file names, directory hierarchy, etc.). But I don't
>>>>> know, since the catalog may be created by a user after compile
>>>>> time (although THREDDS does know this, since it parsed all of
>>>>> the catalog information at start up), and I shouldn't have to
>>>>> know. For me to know would require that I parse the top level
>>>>> catalog.xml file and build the XML doc tree myself, at which
>>>>> point I can get the elusive <property> elements from the XML
>>>>> doc in memory.
>>>>>
>>>>> If the catalog is composed of a single (complex) top level  
>>>>> catalog.xml file then I would have to know that and just ask  
>>>>> for the top level catalog.
>>>>>
>>>>> (Searching the entire catalog from the top down for my dataset  
>>>>> doesn't seem to work either...)
>>>>
>>>> I'm sorry, I'm having a hard time following here. What are you  
>>>> trying to do and why?
>>>
>>> For any request that is looking for one of the OPeNDAP data
>>> responses I need to search the THREDDS catalog for the dataset,
>>> and if found, I need to extract any metadata that may be in the
>>> catalog for that dataset.
>>>
>>>
>>>>
>>>> Is the problem that you may not know if the dataset is contained  
>>>> in a catalog generated because of a datasetScan element or  
>>>> contained directly in one of the THREDDS config catalogs?
>>>
>>> I think that's a separate issue.
>>>
>>>
>>>>
>>>>> All of these methods of writing and organizing catalogs are  
>>>>> legitimate in THREDDS, and users writing THREDDS catalogs would  
>>>>> likely employ one or more of these methods when writing their  
>>>>> catalogs.
>>>>>
>>>>>
>>>>> I propose that the THREDDS API be extended so that one can  
>>>>> simply ask the DataRootHandler for an InvDataset or an  
>>>>> InvCatalog. Like:
>>>>>
>>>>>     InvDataset id = drh.getDataSet("wcs/MODIS/foo.nc");
>>>>>     InvCatalog ic = drh.getCatalog("wcs/MODIS/");
>>>>>
>>>>> or possibly the InvDataset that represents a collection:
>>>>>
>>>>>     InvDataset id = drh.getDataSet("wcs/MODIS/");
>>>>>
>>>>>
>>>>> If the DataRootHandler doesn't have it, return null.
>>>>>
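The proposed contract (look up by path, return null when the handler does not know the path) can be sketched as follows. DatasetLookup, DatasetInfo, and MapBackedLookup are hypothetical names used only for illustration; DatasetInfo stands in for InvDataset and carries just a path and its <property> values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for InvDataset: a path plus its <property> values.
class DatasetInfo {
    final String path;
    final Map<String, String> properties = new HashMap<>();

    DatasetInfo(String path) {
        this.path = path;
    }
}

// Hypothetical lookup contract mirroring the proposal above:
// return the dataset for a path, or null if it is not known.
interface DatasetLookup {
    DatasetInfo getDataSet(String path);
}

// Simplest possible implementation: an in-memory map keyed by path.
// Collections are registered under their path with a trailing slash.
class MapBackedLookup implements DatasetLookup {
    private final Map<String, DatasetInfo> byPath = new HashMap<>();

    void register(DatasetInfo info) {
        byPath.put(info.path, info);
    }

    @Override
    public DatasetInfo getDataSet(String path) {
        return byPath.get(path); // null when the path was never registered
    }
}
```

The point of the sketch is the shape of the call: the servlet passes in only what it can derive from the request, and the lookup hides how the catalogs are laid out on disk.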
>>>>>
>>>>> Is that unreasonable?
>>>>
>>>> I'll have to take a closer look at this.
>>>>
>>>> Ethan
>>>>
>>>>>
>>>>> Nathan
>>>>>
>>>>>
>>>>> = 
>>>>> Nathan Potter                        ndp at opendap.org
>>>>> OPeNDAP, Inc.                        541.752.1852
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ==============================================================================
>>>>> To unsubscribe thredds, visit:
>>>>> http://www.unidata.ucar.edu/mailing-list-delete-form.html
>>>>> ==============================================================================
>>>>
>>>> -- 
>>>> Ethan R. Davis                                Telephone: (303) 497-8155
>>>> Software Engineer                             Fax:       (303) 497-8690
>>>> UCAR Unidata Program Center                   E-mail:    edavis at ucar.edu
>>>> P.O. Box 3000
>>>> Boulder, CO  80307-3000                       http://www.unidata.ucar.edu/
>>>> ---------------------------------------------------------------------------
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>

= 
Nathan Potter                        ndp at opendap.org
OPeNDAP, Inc.                        541.752.1852




