
Re: latest Catalog XML




----- Original Message -----
From: "Daniel Holloway" <address@hidden>
To: <address@hidden>
Sent: Tuesday, June 18, 2002 10:13 PM
Subject: Re: latest Catalog XML


> Ethan Davis wrote:
>
> > Daniel Holloway wrote:
> > >
> > > ....
> > >
> > >     I've used the DTD listed above to create two examples of what
> > > I believe I could return in an XML-based dods-dir response.  The
> > > catalogs represent simple directory representations of dods-accessible
> > > data; level1.xml is a higher-level directory containing only subdirs
> > > and level2.xml is one of the subdirs containing the dods-accessible
> > > datafiles.
> > >
> > >     Both of these files validate; I get some warnings, mostly
> > > due to style issues, but there are no errors.   These examples represent
> > > the minimum information that I could provide; there are ways I
> > > could augment this service to provide some of the additional fields
> > > represented in the DTD but this example is meant to illustrate a
> > > minimum set.

This looks good from my POV. Some minor comments:

1. You don't need xlink:type="simple" in the catalogRef element, since that value is fixed in the DTD.
2. As Ethan mentioned, you don't need any service elements in level 1, since all you have there are catalogRef elements.
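
To illustrate both points, a trimmed-down level 1 catalog might look something like the sketch below. It is only an illustration, not the attached file: the example.org server, the collection name, and the xlink:title attributes are assumptions, and the DOCTYPE and catalog namespace declarations are left out.

```xml
<!-- Hypothetical level 1 catalog: subdirectories only, so no service elements. -->
<catalog xmlns:xlink="http://www.w3.org/1999/xlink">
  <collection name="htn_sst_decloud">
    <!-- xlink:type="simple" is omitted because the DTD fixes that value. -->
    <catalogRef xlink:href="http://example.org/nph-dods/htn_sst_decloud/1985/" xlink:title="1985"/>
    <catalogRef xlink:href="http://example.org/nph-dods/htn_sst_decloud/1986/" xlink:title="1986"/>
  </collection>
</catalog>
```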

More comments below.

> > >
> > >     Points to note:
> > >
> > > 1: I'd need to provide 2 service elements, one for
> > > type = DODS, and one for type = Other, where Other represents
> > > the dods-dir service.   In many cases there will be both datafiles
> > > and subdirectories in a file system.   Do I need to have two
> > > service elements?   If so, in the 'Level2.xml' example how do
> > > I indicate which service to use without adding an explicit
> > > access element for each dataset element?
> >
> > In each dataset element you can indicate the service it uses with the
> > serviceName attribute; its value would be the same as the value of the service
> > element's name attribute.
> >
>
> OK

You can specify the default service on the collection element, which keeps things compact.

You can specify a compound service when multiple services are always available. (I think you should not do this for variations on DODS services (das, dap, ascii, info, etc.), but I'm not as clear about the dods-info service. I see that you decided not to specify that in your later email, which I take to mean you have come to similar conclusions.)
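
A rough sketch of a level 2 catalog using a collection-level default follows. It rests on assumptions: that the collection element accepts the same serviceName attribute as dataset, and that datasets give their location in a urlPath attribute. The file names and the example.org base are invented for illustration.

```xml
<catalog>
  <service name="dods" serviceType="DODS" base="http://example.org/nph-dods/"/>
  <!-- Default service declared once on the collection (assumed attribute placement). -->
  <collection name="1985" serviceName="dods">
    <!-- These datasets carry no serviceName of their own and use the collection default. -->
    <dataset name="file1" urlPath="htn_sst_decloud/1985/file1.hdf"/>
    <dataset name="file2" urlPath="htn_sst_decloud/1985/file2.hdf"/>
  </collection>
</catalog>
```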
 
Example:

<service name="Motherlode" serviceType="Compound" base="">
  <service name="MotherlodeFTP" serviceType="FTP" base="ftp://motherlode.ucar.edu/ftp/pub/thredds/"/>
  <service name="MotherlodeDODS" serviceType="DODS" base="http://motherlode.ucar.edu/cgi-bin/dods/"/>
</service>


>
> >
> > > 2:  I use 'collection' as a simple filesystem collection, there
> > > may not be any meaningful relation between the files contained
> > > in the directory other than they're accessible via the DODS DAP.
 
Yes, these are logical collections.

> > >
> > > 3:  Am I using 'catalogRef' correctly?  The intention is to
> > > indicate another collection level that should only be traversed
> > > at the user's request.
> >
> > catalogRef should reference a THREDDS catalog. Yours is referencing a dods-dir
> > page. I've reworked and attached your examples (with fewer years in level1.xml
> > and fewer example datasets in level2.1985.xml).
>
> Actually I'm trying to reference a THREDDS catalog, that is, my intention
> is that the response from a dods-dir service request would be a valid THREDDS
> catalog and it may look as indicated in the original example.   Currently,
> the dods-dir service returns an html-encoded page, that service would be
> augmented to return a THREDDS catalog.  Granted a THREDDS catalog
> is meant to offer more than a simple filesystem view of the accessible files
> but the catalog DTD seems to support this use.
>
> I may be misreading the examples which have been provided in earlier
> messages, but it seems that all of the 'hrefs' in the catalogRef examples
> use explicit filenames, but as long as the response to the 'href' returns
> a valid THREDDS catalog representation, why can't the 'href' reference
> a service which returns such a catalog.   The use of 'service' in this
> case is somewhat different than the 'access' service that is encoded
> in the service element, though arguably there is the 'other' and 'catalog'
> types.
>
> If you allow this, then the catalog becomes much simpler for the dods-dir
> case, basically following my initial example.   Actually, I'm not sure how
> the THREDDS API or any other parsing application would know the
> difference in the href so long as it returned a valid xml representation.
 
As Ethan realized, dynamic generation of catalogs such as this is expected and encouraged.

>
> I think the example you provided below using both dataset and
> catalogRef elements to depict the filesystem layout is somewhat
> ambiguous since the relation between these two elements is
> implicit in this use, and not explicit in the DTD, such that any
> API and/or parsing application would need to recognize this
> particular relationship.
 
Nested datasets are supposed to solve this, although that is not completely general, since it doesn't allow collection or catalogRef to be nested. I am unconvinced that making this relationship explicit is very important, however.
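
A nested version of the directory layout could be sketched as follows, assuming dataset elements may contain other dataset elements. The service names, the urlPath attribute and its values, and the example.org URLs are all hypothetical.

```xml
<catalog>
  <service name="dodsdir" serviceType="Other" base="http://example.org/nph-dods/"/>
  <service name="dods" serviceType="DODS" base="http://example.org/nph-dods/"/>
  <collection name="htn_sst_decloud">
    <!-- The year is a dataset reached through the dods-dir service. -->
    <dataset name="1985" serviceName="dodsdir" urlPath="htn_sst_decloud/1985/">
      <!-- It contains the individual DODS datasets for that year. -->
      <dataset name="file1" serviceName="dods" urlPath="htn_sst_decloud/1985/file1.hdf"/>
    </dataset>
  </collection>
</catalog>
```

Here the containment itself expresses the directory-to-file relationship, so a parser does not have to infer it from naming.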

>
> >
> >
> > 1) I changed the catalogRefs in level1.xml to point to the level2.*.xml files.
>
> A minor point but I'd prefer not to point to individual 'level#.xml' files in
> the references and just point to the service instance for invoking dods-dir
> on the underlying subdirs.  Not sure how to do that now.
 
I'm not really sure I understand the question. As long as the href returns a valid XML doc, your current catalog should work. E.g.:

  http://dods.gso.uri.edu/dods-3.2/nph-dods/htn_sst_decloud/1992/

now returns HTML, but if it returned an XML catalog it would be OK. Since you have control of that link, you can return anything you want from it without affecting the catalog spec.


>
> >
>
> >
> > 2) I added datasets in level1.xml that give access to the dods-dir pages.
> > 3) I added a serviceName attribute in the dataset elements in the level2.*.xml
> > files.
> >
> > A few alternatives:
> >
> > 1) You could have just one level. Each year could be a collection that contains
> > the individual datasets.
>
> The catalog I'm representing with the dods-dir response is a view
> of the filesystem tree for an arbitrary branch or leaf of the
> filesystem, and for only dods-accessible datafiles at that branch
> or leaf.   I definitely don't want to populate the complete filesystem
> catalog at a high level but want the client app or UI to traverse
> the references to finally get the complete list of datafiles.  Our site
> has approx. 100K datafiles accessible via DODS, so I need to
> limit the amount of information I transmit to a reasonable amount
> unless explicitly requested by the client.
 
 
Yes, this is an important use of catalogRef.

>
> >
> >
> > 2) Again, just one level. Each year is a dataset with the dods-dir access and
> > contains sub-datasets for each individual dataset.
> >
> > I'm working on the THREDDS catalog generator tool. Currently I'm working on
> > expanding DODS file servers into THREDDS catalogs. After I get that working I'd
> > like to work on crawling dods-dir pages. So, I'll be wanting to pick your brains
> > on this stuff.
> >
>
> My plan is to augment dods-dir to return an xml-encoded response, hopefully
> a THREDDS catalog representation, then you won't have to crawl dods-dir pages
> just traverse those catalog representations.   We could then discuss how to
> extend the dods-dir service, possibly to read some config.xml or other ancillary
> info sources on the server to provide the additional information needed to
> extend these simple filesystem catalog views into more functional catalog
> representations.
 
Sounds great!