The featureCollection element is a way to tell the TDS to serve collections of CDM Feature Datasets. Currently this is used mostly for gridded data whose time and spatial coordinates are recognized by the CDM software stack. This allows the TDS to automatically create logical datasets composed of collections of files, particularly gridded model data output, called Forecast Model Run Collections (FMRC).
A Forecast Model Run Collection is a collection of forecast model output. A special kind of aggregation, called an FMRC Aggregation, creates a dataset that has two time coordinates, called the run time and the forecast time. This dataset can then be sliced up in various ways to create 1D time views of the model output. See this poster for a detailed example.
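The two time coordinates and the 1D views can be pictured with a small toy model. The sketch below is illustrative only and is not how the TDS implements FMRC aggregation. The run times, offsets, and function names are made up, and the view definitions (single run, constant forecast, constant offset, and "best" as the most recent run for each forecast time) are assumptions about the usual meaning of these views.

```python
from datetime import datetime, timedelta

# Toy FMRC: two model runs, each forecasting 0, 6 and 12 hours ahead.
# All values are hypothetical and chosen only for illustration.
runs = [datetime(2010, 11, 5, 0), datetime(2010, 11, 5, 6)]
offsets = [timedelta(hours=h) for h in (0, 6, 12)]

# The 2D dataset: one entry per (run time, forecast time) pair.
two_d = {(run, run + off) for run in runs for off in offsets}

def run_view(run):
    """All forecast times produced by a single model run."""
    return sorted(ft for (r, ft) in two_d if r == run)

def constant_forecast_view(forecast_time):
    """All run times that predict one fixed forecast time."""
    return sorted(r for (r, ft) in two_d if ft == forecast_time)

def constant_offset_view(offset):
    """One forecast time per run, each at the same offset from its run time."""
    return sorted(ft for (r, ft) in two_d if ft - r == offset)

def best_view():
    """For each forecast time, keep the most recent run that covers it."""
    best = {}
    for (r, ft) in two_d:
        if ft not in best or r > best[ft]:
            best[ft] = r
    return dict(sorted(best.items()))
```

Each function returns a 1D sequence along a single time axis, which is the sense in which the 2D dataset is "sliced" into views.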
As of TDS 4.2, you should use the featureCollection element in your configuration catalog. (The previous way of doing this was with a datasetFmrc element, which is now deprecated.)
Download catalogFmrc.xml, place it in your TDS ${tomcat_home}/content/thredds directory,
and add a catalogRef to it from your main catalog:
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
xmlns:xlink="http://www.w3.org/1999/xlink" name="Unidata THREDDS Data Server"
version="1.0.3">
1) <service name="ncdods" serviceType="OPENDAP" base="/thredds/dodsC/"/>
2) <featureCollection featureType="FMRC" name="NCEP-GFS-Puerto_Rico_191km" harvest="true"
path="fmrc/NCEP/GFS/Puerto_Rico_191km">
3) <metadata inherited="true">
<serviceName>ncdods</serviceName>
<dataFormat>GRIB-1</dataFormat>
<documentation type="summary">Specially good GFS_Puerto_Rico_191km</documentation>
</metadata>
4) <collection spec="/data/testdata/2010TdsTW/fmrc/GFS_Puerto_Rico_191km.*grib1$"/>
</featureCollection>
</catalog>
1) An OPeNDAP service is defined, named ncdods.
2) A featureCollection is defined, of type FMRC, whose contained datasets will all have a path starting with fmrc/NCEP/GFS/Puerto_Rico_191km.
3) The metadata contained here will be inherited by the contained datasets.
4) A collection of files is defined, using a collection specification string. The directory /data/testdata/2010TdsTW/fmrc/ will be scanned for files that start with "GFS_Puerto_Rico_191km" and end with "grib1".
The contained datasets include the resulting 2D time dimension dataset, as well as the 1D time views of the ucar.nc2.dt.fmrc.ForecastModelRunCollection dataset described in section 4 above, as seen in the resulting HTML page for that dataset:
The TDS has created a number of datasets out of the FMRC, and made them available through the catalog interface. Explore them through the browser, through ToolsUI or the IDV.
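The collection specification string used in the catalog above combines a directory to scan with a pattern for the file names. The following is a minimal sketch of that idea. It assumes the spec is split at the last "/" into a directory and a regular expression applied to each file name. The function names and the sample file names are hypothetical, and this is not the TDS parser itself.

```python
import re

def split_spec(spec):
    """Split a collection spec into (directory, filename regex) at the last '/'."""
    directory, _, pattern = spec.rpartition("/")
    return directory + "/", re.compile(pattern)

def select(spec, filenames):
    """Return the file names matched by the regex part of the spec."""
    _, regex = split_spec(spec)
    # re.match anchors at the start of the name; the trailing '$' in the
    # spec anchors at the end.
    return [name for name in filenames if regex.match(name)]

spec = "/data/testdata/2010TdsTW/fmrc/GFS_Puerto_Rico_191km.*grib1$"
names = [
    "GFS_Puerto_Rico_191km_20101105_0000.grib1",
    "GFS_Puerto_Rico_191km_20101105_0000.grib1.gbx",  # hypothetical index file
    "NAM_CONUS_20101105_0000.grib1",                  # hypothetical other model
]
selected = select(spec, names)
```

Under these assumptions only the first name is selected: the second does not end with "grib1" and the third does not start with "GFS_Puerto_Rico_191km".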
GRIB files are processed by the CDM, which finds the run time in the GRIB header information and adds it to the global attributes automatically. In this example, these are the global attributes for the first file:
:Conventions = "CF-1.4";
:Originating_center = "US National Weather Service - NCEP(WMC) (7)";
:Generating_Model = "Analysis from Global Data Assimilation System";
:Product_Type = "product valid at reference time P1";
:title = "US National Weather Service - NCEP(WMC) Analysis from Global Data Assimilation System product valid at reference time P1";
:institution = "Center US National Weather Service - NCEP(WMC) (7)";
:source = "product valid at reference time P1";
:history = "Direct read of GRIB-1 into NetCDF-Java 4 API";
:CF:feature_type = "GRID";
:file_format = "GRIB-1";
:location = "Q:/2010TdsTW/fmrc/GFS_Puerto_Rico_191km_20101105_0000.grib1";
:_CoordinateModelRunDate = "2010-11-05T00:00:00Z";
Note that the last attribute listed shows the _CoordinateModelRunDate, which is used by the FMRC processing to group
the files by run date.
The recommended way to specify this information for non-GRIB files is to put the run time information into the filename. You then specify a date parsing template in the collection specification string, for example:
<collection spec="/data/testdata/2010TdsTW/fmrc/GFS_Puerto_Rico_191km_#yyyyMMdd_HHmm#\.nc$"/>
This extracts the run date by applying the template yyyyMMdd_HHmm to the portion of the filename after "GFS_Puerto_Rico_191km_".
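To make the date template concrete, here is a hedged sketch of how a run date could be pulled out of a file name given such a spec. It is not the TDS implementation. The helper names, the mapping from template letters to Python strptime directives, and the sample file name are all assumptions made for this illustration, and only the letters used in this template are handled.

```python
from datetime import datetime

# Hypothetical mapping from the template letters in this example to
# Python strptime directives. Other template letters are not handled.
TEMPLATE_TO_STRPTIME = {"yyyy": "%Y", "MM": "%m", "dd": "%d", "HH": "%H", "mm": "%M"}

def parse_spec(spec):
    """Return (filename prefix, strptime format) from a spec with a #template# marker."""
    name = spec.rpartition("/")[2]           # part after the last '/'
    prefix, template, _ = name.split("#", 2)  # text before, inside, and after the marks
    fmt = template
    for letters, directive in TEMPLATE_TO_STRPTIME.items():
        fmt = fmt.replace(letters, directive)
    return prefix, fmt

def extract_run_date(spec, filename):
    """Parse the run date from the characters that follow the prefix."""
    prefix, fmt = parse_spec(spec)
    if not filename.startswith(prefix):
        return None
    # This template always formats to a fixed width, so measure it once.
    width = len(datetime(2000, 1, 1).strftime(fmt))
    stamp = filename[len(prefix):len(prefix) + width]
    return datetime.strptime(stamp, fmt)

spec = r"/data/testdata/2010TdsTW/fmrc/GFS_Puerto_Rico_191km_#yyyyMMdd_HHmm#\.nc$"
run_date = extract_run_date(spec, "GFS_Puerto_Rico_191km_20101105_0000.nc")
```

For the hypothetical file name above, the characters "20101105_0000" are read with the format "%Y%m%d_%H%M", giving a run date of 2010-11-05 00:00.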
The above example creates a static collection of files. A common case is that one has a collection of files that are changing, as files are added and deleted while being served through the TDS. Below is a modified version of the catalog, with additional elements and attributes to handle this case:
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
xmlns:xlink="http://www.w3.org/1999/xlink" name="Unidata THREDDS Data Server"
version="1.0.3">
<service name="ncdods" serviceType="OPENDAP" base="/thredds/dodsC/"/>
<featureCollection featureType="FMRC" name="NCEP-GFS-Puerto_Rico_191km" harvest="true"
path="fmrc/NCEP/GFS/Puerto_Rico_191km">
<metadata inherited="true">
<serviceName>ncdods</serviceName>
<dataFormat>GRIB-1</dataFormat>
<documentation type="summary">Specially good GFS_Puerto_Rico_191km</documentation>
</metadata>
<collection spec="/data/testdata/2010TdsTW/fmrc/GFS_Puerto_Rico_191km.*grib1$"
name="GFS_Puerto_Rico"
1) recheckAfter="15 min"
2) olderThan="5 min"/>
3) <update startup="true" rescan="0 5 3 * * ? *" trigger="allow"/>
4) <protoDataset choice="Penultimate" change="0 2 3 * * ? *" />
5) <fmrcConfig datasetTypes="TwoD Best Runs ConstantForecasts ConstantOffsets Files" />
</featureCollection>
</catalog>
1) recheckAfter: When a request comes in, if the collection hasn't been checked for 15 minutes, check to see if it has changed. The request will wait until the rescan is finished and a new collection is built (if needed). This minimizes unneeded processing for lightly used collections.
2) olderThan: Only files that haven't changed for 5 minutes will be included. This excludes files that are in the middle of being written.
3) update: The collection will be updated upon TDS startup, and periodically using the cron expression "0 5 3 * * ? *", meaning every day at 3:05 am local time. This updating is done in the background, as opposed to when a request for it comes in.
4) protoDataset: The prototypical dataset is chosen to be the "next-to-latest". The prototypical dataset is changed every day at 3:02 am local time.
5) fmrcConfig: The kinds of datasets that are created are listed explicitly. You can see how this corresponds directly to the HTML dataset page above. Remove the ones that you don't want to make available. The default is "TwoD Best Files Runs".
The recheckAfter attribute and the update element are really alternate ways to specify rescanning strategies. Use the update element on large collections when you want to ensure quick response. Use the recheckAfter attribute on lightly used collections in order to minimize server load.
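The olderThan and recheckAfter rules described above amount to two simple time comparisons. The sketch below illustrates that logic under stated assumptions and is not the TDS code. The function names and sample data are hypothetical, and the behavior exactly at the 5 or 15 minute boundary is an assumption.

```python
from datetime import datetime, timedelta

def stable_files(mtimes, now, older_than):
    """Keep files whose last modification is at least older_than in the past.

    mtimes maps a file name to its last-modified time (hypothetical input).
    """
    return sorted(name for name, mtime in mtimes.items() if now - mtime >= older_than)

def needs_recheck(last_checked, now, recheck_after):
    """True if a request arriving now should trigger a rescan of the collection."""
    return now - last_checked >= recheck_after

now = datetime(2010, 11, 5, 12, 0)
mtimes = {
    "GFS_Puerto_Rico_191km_20101105_0000.grib1": now - timedelta(hours=1),
    "GFS_Puerto_Rico_191km_20101105_0600.grib1": now - timedelta(minutes=2),  # still being written
}
included = stable_files(mtimes, now, older_than=timedelta(minutes=5))
```

With these sample times only the 0000 file is included, because the 0600 file changed two minutes ago. A request arriving 20 minutes after the last check would trigger a rescan under recheckAfter="15 min", while one arriving 5 minutes after would not.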