[thredds] Simple fix => much smaller TDS WMS GetCapabilities size (for model output)

  • To: THREDDS community <thredds@xxxxxxxxxxxxxxxx>
  • Subject: [thredds] Simple fix => much smaller TDS WMS GetCapabilities size (for model output)
  • From: Rich Signell <rsignell@xxxxxxxx>
  • Date: Wed, 22 Dec 2010 08:00:26 -0500
THREDDS folks,

For getCapabilities requests, it would be great if the TDS would
express the available times using the WMS multiple time interval
syntax if it is more efficient than listing each time value
separately.

This can yield huge (100-fold or more) savings in the WMS
getCapabilities size when dealing with model output, which is usually
equally spaced, but perhaps with a few gaps.  Intervals could replace
the current practice of listing every available time step in ISO format.

In WMS 1.1.1, Annex C.3 states that multiple intervals are allowed in
the "Extent" element.
In WMS 1.3.0, Annex C.2 states that multiple intervals are allowed in
the "Dimension" element.

Both list the sample format: "min1/max1/res1,min2/max2/res2,..."
(thanks to Kyle Wilcox for digging out this info)

For example, we have a dataset

http://testbedapps.sura.org/thredds/clean.html?dataset=estuarine_hypoxia/ch3d/agg

that contains hourly output over 21 years (183984 time records) for 11
different variables.  There are only 4 gaps longer than 1 hour.

If you access the WMS getCapabilities document for this dataset, be
prepared to wait for a while, because it's 51 MB!!

The problem is that each time value is listed in ISO ASCII:
        <Dimension name="time" units="ISO8601" multipleValues="true"
current="true" default="2006-01-01T00:00:00.000Z">
1985-01-01T01:00:00.000Z,1985-01-01T02:00:00.000Z,1985-01-01T03:00:00.000Z,1985-01-01T04:00:00.000Z,1985-01-01T05:00:00.000Z,1985-01-01T06:00:00.000Z,1985-01-01T07:00:00.000Z,1985-01-01T08:00:00.000Z,1985-01-01T09:00:00.000Z,1985-01-01T10:00:00.000Z,
...

which goes on for 5MB of ASCII values and then this whole mess is
repeated for each variable "layer".
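
As a rough check on those numbers, here is a back-of-the-envelope sketch in Python. It assumes every time value is written exactly like the sample above (24 characters plus a comma) and ignores the rest of the XML, so it is only an estimate:

```python
# Rough size estimate for the time lists in the GetCapabilities document.
# Assumption: each value looks like the sample below, followed by a comma.
n_times = 183984          # time records in the dataset
n_layers = 11             # variables ("layers")
sample = "1985-01-01T01:00:00.000Z,"

bytes_per_layer = n_times * len(sample)
total_bytes = bytes_per_layer * n_layers

print(f"per layer: {bytes_per_layer / 1e6:.1f} MB")
print(f"all layers: {total_bytes / 1e6:.1f} MB")
```

This comes out at about 4.6 MB per layer and about 50.6 MB over all 11 layers, so the time lists account for essentially the whole document.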

Instead, the entire time record could be simply expressed using 5
intervals of the form:

"1985-01-01T01:00:00.000Z/1988-12-31T00:00:00.000Z/PT3600S",
"1989-01-01T01:00:00.000Z/1992-12-31T00:00:00.000Z/PT3600S",
"1993-01-01T01:00:00.000Z/2000-12-31T00:00:00.000Z/PT3600S",
"2001-01-01T01:00:00.000Z/2004-12-31T00:00:00.000Z/PT3600S",
"2005-01-01T01:00:00.000Z/2006-01-01T00:00:00.000Z/PT3600S",
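
Collapsing a time axis into intervals like these is not much code. Below is a minimal Python sketch, not the TDS implementation: the function names (iso, collapse, time_dimension_value) are made up for illustration, the input is assumed to be a sorted list of UTC datetimes, and the grouping is a simple greedy pass that may not give the fewest possible intervals for irregular data.

```python
from datetime import datetime, timedelta


def iso(t):
    # Format like the values in the Dimension element above.
    return t.strftime("%Y-%m-%dT%H:%M:%S.000Z")


def collapse(times):
    """Greedily group sorted datetimes into (start, end, step) runs.

    step is None for an isolated single value.
    """
    runs = []
    i, n = 0, len(times)
    while i < n:
        j, step = i, None
        if i + 1 < n:
            step = times[i + 1] - times[i]
            j = i + 1
            # Extend the run while the spacing stays the same.
            while j + 1 < n and times[j + 1] - times[j] == step:
                j += 1
        runs.append((times[i], times[j], step))
        i = j + 1
    return runs


def time_dimension_value(times):
    """Build a 'min1/max1/res1,min2/max2/res2,...' string."""
    parts = []
    for start, end, step in collapse(times):
        if step is None:
            parts.append(iso(start))
        else:
            seconds = int(step.total_seconds())
            parts.append(f"{iso(start)}/{iso(end)}/PT{seconds}S")
    return ",".join(parts)


# Example: five hourly steps, a gap, then three more hourly steps.
hour = timedelta(hours=1)
times = [datetime(1985, 1, 1, 1) + k * hour for k in range(5)]
times += [datetime(1985, 1, 2, 0) + k * hour for k in range(3)]
print(time_dimension_value(times))
```

A server could compare the length of this string with the length of the full list and emit whichever is shorter.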


Existing way (every time step written out): 51 MB
New way (specifying intervals):             100 KB   (500 times smaller!!!!!)

This would greatly reduce the file sizes on Motherlode, our IOOS
testbed server, and every other TDS (or WMS, actually) with aggregated
model output.

On a related note, it would be great to be able to specify multiple
time intervals in NcML, instead of just one time interval, as
evenly-spaced-data-but-with-a-few-gaps is a very common type of model
output aggregation.

I hope we can look forward to this in some release of TDS in the New
Year!!!  ;-)

Thanks,
Rich

-- 
Dr. Richard P. Signell   (508) 457-2229
USGS, 384 Woods Hole Rd.
Woods Hole, MA 02543-1598


