Hi Tennessee,
Sorry for the confusion. Part of the problem is that we are just getting
our release engineering and versioning figured out. So there are several
different servers, versions, and config file formats being used. We
finally have that pretty well figured out in the upcoming release of the
THREDDS Data Server (TDS). But that doesn't help current installations.
Yes, the idea for the catalog generator is to crawl whatever source of
data you have (local or remote) and generate a catalog. Sorry that
wasn't as easy as it should have been.
Do you have a working config file now from your python script? Let me
know if I can be of any help.
Sorry again for the confusion.
Ethan
Tennessee Leeuwenburg wrote:
I think there may be a fundamental misunderstanding here. The server
is not set up to serve this data. I want to produce the config file
which will enable the server to serve the data (i.e. catalogConfig.xml).
I have actually now written a python script to do this for me, as it
seemed easier.
I thought that the idea behind the catalog generator wasn't just to
produce catalogs for data resourced from other dods servers, but also
resourced from local disk etc...
Cheers,
-T
Hi Tennessee,
Is the OPeNDAP server already set up to serve this data? What do you
get with the URL http://localhost:8010/thredds/dodsC/catalog.xml? The
current THREDDS server, when set up to serve data via OPeNDAP,
automatically generates a catalog of the data being served (so you
may not need to use CatGen). I think the version from two months ago
should also do that. Could you look at the manifest.mf file in the
thredds.war file you are using and let me know what the "Built-By",
"Built-On", "Implementation-Title", and "Implementation-Version" values
are?
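If unpacking the war file by hand is inconvenient, the manifest values can also be read with a few lines of Python. This is only a sketch: it assumes the manifest sits at the standard META-INF/MANIFEST.MF location inside the archive, and read_manifest is a name made up for this example, not part of THREDDS.

```python
import zipfile


def read_manifest(war_path):
    """Return the main-section attributes of the war's manifest as a dict."""
    with zipfile.ZipFile(war_path) as war:
        # Match the entry name case-insensitively (manifest.mf vs MANIFEST.MF).
        name = next(n for n in war.namelist()
                    if n.upper() == "META-INF/MANIFEST.MF")
        text = war.read(name).decode("utf-8")

    attrs = {}
    key = None
    for line in text.splitlines():
        if not line.strip():
            break  # a blank line ends the main section
        if line.startswith(" ") and key is not None:
            # A leading space marks the continuation of the previous value.
            attrs[key] += line[1:]
        elif ":" in line:
            key, _, value = line.partition(":")
            key = key.strip()
            attrs[key] = value.strip()
    return attrs
```

Calling read_manifest("thredds.war") and looking up "Built-By", "Built-On", "Implementation-Title" and "Implementation-Version" in the result should give the four values asked for above, provided the build recorded them.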
As for the CatGen stuff, try changing the
resultService@accessPointHeader value to "/data/pymars" (matching the
value in accessPoint) and the datasetFilter@matchPattern value to
".*\.nc$", and for now remove the datasetNamer element. Oh yeah, you'll
probably need to change the resultService@base value to
"http://localhost:8010/thredds/dodsC/".
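To see why the matchPattern change matters, here is a small Python sketch comparing the pattern from the example config with the suggested one. It assumes the filter is applied to the full file path and that a match anywhere in the path counts; the file names are hypothetical, and accepted is just a helper for this illustration.

```python
import re

ORIGINAL_PATTERN = r"/[0-9][^/]*_eta_211\.nc$"  # from the example config
SUGGESTED_PATTERN = r".*\.nc$"                   # suggested replacement


def accepted(paths, pattern):
    """Return the paths in which the regular expression matches somewhere."""
    regex = re.compile(pattern)
    return [p for p in paths if regex.search(p)]


# Hypothetical files under the directories mentioned in the thread.
paths = [
    "/data/pymars/2004/netcdf_anal/example_anal.nc",
    "/data/pymars/2004/netcdf_fore/example_fore.nc",
    "/data/pymars/2004/netcdf_fore/README.txt",
]

print(accepted(paths, ORIGINAL_PATTERN))   # [] : nothing looks like *_eta_211.nc
print(accepted(paths, SUGGESTED_PATTERN))  # the two .nc files
```

With the example pattern left unchanged, none of these files would be accepted, so the generated catalog would come out empty even if everything else were configured correctly.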
Hope that helps,
Ethan
Tennessee Leeuwenburg wrote:
Hi Ethan,
I'm not sure I fully grokked what you said to me, so I've just
included my catalog generator file without further modification.
I have data living on disk in /data/pymars/2004/netcdf_anal, and
/data/pymars/2004/netcdf_fore. I would like to set up the catalog
generator to crawl the /data/pymars directory and publish what it
finds there -- no requirement for very intelligent structuring at
this stage.
The dods server is running on localhost:8010.
I'm not entirely certain what version is running, but it is whatever
is current on the web page as of about 2 months ago. I look forward
to the new version, and the simpler configuration!
I wasn't sure what I had to do with all that pattern matching stuff,
so I decided to just leave it unchanged from the example, and just
see what happened. I imagine I have to replace the datasetFilter to
accept *.nc, or some other pattern of my choosing. I couldn't work
out if the dataset namer was mandatory or not. I'd really just like
to capture everything, and am happy with the title being the
filename at this stage.
Cheers,
-Tennessee
<?xml version="1.0" encoding="UTF-8"?>
<!-- $Id: catGenConf.exampleLocal.xml,v 1.2 2004/06/03 20:38:07 edavis Exp $ -->
<!-- Simple example CatalogGenConfig file. -->
<!DOCTYPE catalog SYSTEM
  "http://www.unidata.ucar.edu/projects/THREDDS/xml/CatalogGenConfig.0.5.dtd">
<catalog name="THREDDS CatalogGen test config file" version="0.6">
  <dataset name="THREDDS CatalogGen test config file">
    <dataset name="NCEP Eta 80km CONUS model data">
      <metadata metadataType="CatalogGenConfig">
        <catalogGenConfig type="Catalog">
          <datasetSource name="Local Disk Data Sets" type="Local"
                         structure="DirTree"
                         accessPoint="/data/pymars">
            <resultService name="linuxdev" serviceType="DODS"
                           base="http://localhost:8010/thredds/cataloggen/"
                           accessPointHeader="/home/tjl/jakarta-5.0.28/content/thredds/cataloggen/"/>
            <datasetFilter name="Accept netCDF files only" type="RegExp"
                           matchPattern="/[0-9][^/]*_eta_211\.nc$"/>
            <datasetNamer name="NCEP Eta 80km CONUS model data"
                          type="RegExp" addLevel="false"
                          matchPattern="([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])([0-9][0-9])_eta_211.nc$"
                          substitutePattern="NCEP Eta 80km CONUS $1-$2-$3 $4:00:00 GMT"/>
          </datasetSource>
        </catalogGenConfig>
      </metadata>
    </dataset>
    <dataset name="NCEP GFS 80km CONUS model data">
      <metadata metadataType="CatalogGenConfig">
        <catalogGenConfig type="Catalog">
          <datasetSource name="model data source" type="Local"
                         structure="Flat"
                         accessPoint="./content/thredds/cataloggen/testData/model">
            <resultService name="mlode" serviceType="DODS"
                           base="http://localhost:8080/thredds/cataloggen/"
                           accessPointHeader="./content/thredds/cataloggen/"/>
            <datasetFilter name="Accept netCDF files only" type="RegExp"
                           matchPattern="/[0-9][^/]*_gfs_211\.nc$"/>
            <datasetNamer name="NCEP GFS 80km CONUS model data"
                          type="RegExp" addLevel="false"
                          matchPattern="([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])([0-9][0-9])_gfs_211.nc$"
                          substitutePattern="NCEP GFS 80km CONUS $1-$2-$3 $4:00:00 GMT"/>
          </datasetSource>
        </catalogGenConfig>
      </metadata>
    </dataset>
  </dataset>
</catalog>
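For reference, the datasetNamer in the first datasetSource above pairs a matchPattern with a substitutePattern. A rough Python equivalent of that kind of regular-expression renaming is sketched below. It assumes the namer searches the file name for matchPattern and fills $1 to $4 from the captured groups (Python writes those references as \1 to \4). The sample file name and the fall-back to the bare file name are hypothetical, chosen only for illustration.

```python
import re

# Patterns taken from the Eta datasetNamer; "$n" rewritten as "\n" for Python.
MATCH = (r"([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])([0-9][0-9])"
         r"_eta_211.nc$")
SUBSTITUTE = r"NCEP Eta 80km CONUS \1-\2-\3 \4:00:00 GMT"


def dataset_name(file_name):
    """Build a display name from a file name, or keep the file name as is."""
    match = re.search(MATCH, file_name)
    if match is None:
        return file_name  # assumption: unnamed datasets keep their file name
    return match.expand(SUBSTITUTE)


print(dataset_name("2004060312_eta_211.nc"))
# NCEP Eta 80km CONUS 2004-06-03 12:00:00 GMT
print(dataset_name("example_anal.nc"))
# example_anal.nc
```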
Ethan Davis wrote:
Tennessee Leeuwenburg wrote:
Ethan Davis wrote:
Hi Tennessee,
Did you edit the config.xml file (which sets up the tasks) as well as
the cat gen config file? I guess you must have if it is showing up in
the interface. Make sure the period value is not set to zero; if it
is, the task won't be run. Are you getting any messages in the log
files? What version of the server are you running? Is this a publicly
available server? If so, send me the URL and I'll take a look at the
config files.
Sorry these config file formats are so ugly. We're working on
simplifying and cleaning up the configuration throughout the server.
But for now ...
Well, as long as you're willing to help me, ugly is fine :)
More than willing to help. But I want simpler because it would make
it easier for me to remember what is going on :)
After making that change, the server started to process the various
files. The example DODS catalog was generated fine, but the example
filesystem catalog and my own filesystem catalog both failed with
similar messages. I've appended the results.
I think I'm failing to understand what exactly the serviceName, base and
accessPointHeader are actually used for.
As with regular catalogs, I assume one is used for reconstructing the
URL to the file to be resourced, and the other is used for constructing
the URL to be used in an OPeNDAP request, but it's not clear to me
exactly what is happening. I read the documentation, but it was a bit
hand-wavy about the specifics.
The accessPoint is the directory that is to be scanned for data
files. The accessPointHeader is a parent directory of the
accessPoint directory and is used to remove the part of the data
file path that is not to appear in the resulting dataset access
URL. The base value is the URL for the OPeNDAP server that is
serving your data. For instance, if you want to crawl the
/my/data/radar/level3/FTG directory and a resulting dataset access
URL is something like http://.../nph-dods/radar/level3/FTG/file.nc,
you would want something like
<datasetSource name="model data source" type="Local" structure="Flat"
               accessPoint="/my/data/radar/level3/FTG">
  <resultService name="mlode" serviceType="DODS"
                 base="http://.../nph-dods/"
                 accessPointHeader="/my/data/"/>
  <datasetFilter ... />
  <datasetNamer ... />
</datasetSource>
Does that clear things up at all? If not, feel free to send me your
config file to look at.
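The relationship between accessPointHeader, base and the resulting access URL described above can be sketched in Python as follows. This is an illustration of the path arithmetic only, not the CatGen code itself; access_url is a hypothetical helper, and the host part of the base URL is left elided exactly as in the example.

```python
def access_url(file_path, access_point_header, base):
    """Strip the accessPointHeader from a file path and append the rest to base."""
    # Normalise the header so "/my/data" and "/my/data/" behave the same,
    # and so "/my/data" does not accidentally match "/my/database/...".
    header = access_point_header.rstrip("/") + "/"
    if not file_path.startswith(header):
        raise ValueError(
            "accessPointHeader %r is not a parent of %r"
            % (access_point_header, file_path)
        )
    relative = file_path[len(header):]
    return base.rstrip("/") + "/" + relative


print(access_url("/my/data/radar/level3/FTG/file.nc",
                 "/my/data/",
                 "http://.../nph-dods/"))
# http://.../nph-dods/radar/level3/FTG/file.nc
```

The ValueError branch mirrors the requirement that the accessPointHeader be a parent directory of the accessPoint: if it is not, there is nothing sensible to strip from the file path.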
Sorry about the documentation. It isn't all that clear and I
haven't put much effort into it since we decided to move to a
simpler config file format. Not sure what's up below with the
example file system dataset. I must have broken something at some
point.
What version of the cat gen servlet (or THREDDS server) are you
running?
Ethan
PS In the new TDS, catalogs for the data it is serving are
automatically generated and the config files are much simpler than
these.
Thanks for your help,
-T
<catalog name="THREDDS CatalogGen test config file" version="0.6">
  <dataset name="THREDDS CatalogGen test config file">
    <service name="linuxdev" serviceType="DODS"
             base="http://localhost:8010/thredds/cataloggen/"/>
    <service name="mlode" serviceType="DODS"
             base="http://localhost:8080/thredds/cataloggen/"/>
    <dataset name="NCEP Eta 80km CONUS model data">
      <dataset name="The DatasetSource &quot;Local Disk Data Sets&quot; could not be expanded. The accessPointHeader (/home/tjl/jakarta-5.0.28/content/thredds/cataloggen/) is not a directory."
               serviceName="linuxdev"/>
    </dataset>
    <dataset name="NCEP GFS 80km CONUS model data">
      <dataset name="The DatasetSource &quot;model data source&quot; could not be expanded. The accessPointHeader (./content/thredds/cataloggen/) is not a directory."
               serviceName="mlode"/>
    </dataset>
  </dataset>
</catalog>
--
Ethan R. Davis Telephone: (303) 497-8155
Software Engineer Fax: (303) 497-8690
UCAR Unidata Program Center E-mail: edavis@xxxxxxxx
P.O. Box 3000
Boulder, CO 80307-3000 http://www.unidata.ucar.edu/