Re: [thredds] Simple fix => much smaller TDS WMS GetCapabilities size (for model output)

By missing "values", I assume you mean missing times?  If the spec
allows for multiple intervals in the Dimension element ... could you not
just stop one interval and start another, leaving out the missing
times?  Even if you end up with several intervals it would still be
better than listing out every time.
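
As a rough illustration of the "stop one interval and start another"
idea, here is a minimal Python sketch. It is not TDS or ncWMS code: the
function names (collapse_times, wms_time_extent) are hypothetical, the
time step is assumed to be known and uniform within each run, and an
isolated time is written as min/max/res with min equal to max so that
every entry keeps the interval form quoted from the spec below.

```python
from datetime import datetime, timedelta

ISO_FMT = "%Y-%m-%dT%H:%M:%S.000Z"  # matches the timestamps in the example below


def collapse_times(times, step):
    """Split sorted datetimes into (start, end) runs spaced exactly `step` apart."""
    runs = []
    if not times:
        return runs
    start = prev = times[0]
    for t in times[1:]:
        if t - prev != step:
            # Gap (or irregular spacing): close the current run, start a new one.
            runs.append((start, prev))
            start = t
        prev = t
    runs.append((start, prev))
    return runs


def wms_time_extent(times, step):
    """Build a 'min1/max1/res1,min2/max2/res2,...' string for the time dimension."""
    res = f"PT{int(step.total_seconds())}S"  # ISO 8601 duration in seconds
    return ",".join(
        f"{a.strftime(ISO_FMT)}/{b.strftime(ISO_FMT)}/{res}"
        for a, b in collapse_times(times, step)
    )


if __name__ == "__main__":
    t0 = datetime(1985, 1, 1, 1)
    hour = timedelta(hours=1)
    # Five hourly steps, a three-hour gap, then three more hourly steps.
    sample = [t0 + i * hour for i in range(5)] + [t0 + i * hour for i in range(8, 11)]
    print(wms_time_extent(sample, hour))
```

With one pass over the sorted time axis, a dataset with N gaps comes out
as N+1 intervals, so the cost of finding the gaps is linear in the
number of time records.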

- Mike Grogan

On 12/22/2010 13:14, Doug Lindholm wrote:
> I'd like to add to that. At what point is it sufficient to fill gaps
> (assuming uniform time steps) with fill values? In some cases,
> determining the precise time samples available is an expensive task.
> I'd rather get a single time range quickly then deal with fill values
> (such as NaNs which often require no special handling).
>
> Doug
>
> On 12/22/10 11:01 AM, Bob Simons wrote:
>> Some other issues to consider:
>>
>> * Where is the dividing line between a few missing values (where it's
>> "okay" to say the values are evenly spaced) and too many missing values
>> (where it isn't okay)?  Some number?  Some percentage?  Does it matter
>> if the missing values are adjacent or scattered?
>>
>> * The purpose of getCapabilities is to tell the client what is
>> available. It is probably fine for a human to read that the data is
>> evenly spaced (even if it isn't perfectly); humans are sometimes
>> forgiving.  But will there be problems if a computer program client
>> expects (perfectly) evenly spaced values and requests data for those
>> values?
>>
>> On 12/22/2010 9:47 AM, Ethan Davis wrote:
>>> Hi Rich,
>>>
>>> I've added this feature request to our list. Jon Blower might have some
>>> thoughts on this as well.
>>>
>>> One thing that I wonder about is client support. In particular, does
>>> Godiva2 support this? Again, a question for Jon.
>>>
>>> Ethan
>>>
>>> On 12/22/2010 6:00 AM, Rich Signell wrote:
>>>> For getCapabilities requests, it would be great if the TDS would
>>>> express the available times using the WMS multiple time interval
>>>> syntax if it is more efficient than listing each time value
>>>> separately.
>>>>
>>>> This can result in huge savings (a factor of 100 or more) in the WMS
>>>> getCapabilities size when dealing with model output, which is usually
>>>> equally spaced, but perhaps with a few gaps, compared with listing
>>>> every available time step in ISO format, as is done currently.
>>>>
>>>> In WMS 1.1.1, Annex C.3 states that multiple intervals are allowed in
>>>> the "Extent" element.
>>>> In WMS 1.3.0, Annex C.2 states that multiple intervals are allowed in
>>>> the "Dimension" element.
>>>>
>>>> Both list the sample format: "min1/max1/res1,min2/max2/res2,..."
>>>> (thanks to Kyle Wilcox for digging out this info)
>>>>
>>>> For example, we have a dataset
>>>>
>>>> http://testbedapps.sura.org/thredds/clean.html?dataset=estuarine_hypoxia/ch3d/agg
>>>>
>>>>
>>>> that contains hourly output over 21 years (183984 time records) for 11
>>>> different variables.  There are only 4 gaps longer than 1 hour.
>>>>
>>>> If you access the WMS getCapabilities document for this dataset, be
>>>> prepared to wait for a while, because it's 51MB!!
>>>>
>>>> The problem is that each time value is listed in ISO ASCII:
>>>>           <Dimension name="time" units="ISO8601" multipleValues="true"
>>>> current="true" default="2006-01-01T00:00:00.000Z">
>>>> 1985-01-01T01:00:00.000Z,1985-01-01T02:00:00.000Z,1985-01-01T03:00:00.000Z,1985-01-01T04:00:00.000Z,1985-01-01T05:00:00.000Z,1985-01-01T06:00:00.000Z,1985-01-01T07:00:00.000Z,1985-01-01T08:00:00.000Z,1985-01-01T09:00:00.000Z,1985-01-01T10:00:00.000Z,
>>>>
>>>> ...
>>>>
>>>> which goes on for 5MB of ASCII values and then this whole mess is
>>>> repeated for each variable "layer".
>>>>
>>>> Instead, the entire time record could be simply expressed using 5
>>>> intervals of the form:
>>>>
>>>> "1985-01-01T01:00:00.000Z/1988-12-31T00:00:00.000Z/PT3600S",
>>>> "1989-01-01T01:00:00.000Z/1992-12-31T00:00:00.000Z/PT3600S",
>>>> "1993-01-01T01:00:00.000Z/2000-12-31T00:00:00.000Z/PT3600S",
>>>> "2001-01-01T01:00:00.000Z/2004-12-31T00:00:00.000Z/PT3600S",
>>>> "2005-01-01T01:00:00.000Z/2006-01-01T00:00:00.000Z/PT3600S",
>>>>
>>>>
>>>> Existing way (every time step written out): 51MB
>>>> New way (specifying intervals):                100KB   (500 times
>>>> smaller!!!!!)
>>>>
>>>> This would greatly reduce the file sizes on Motherlode, our IOOS
>>>> testbed server, and every other TDS (or WMS, actually) with aggregated
>>>> model output.
>>>
>>> _______________________________________________
>>> thredds mailing list
>>> thredds@xxxxxxxxxxxxxxxx
>>> For list information or to unsubscribe,  visit:
>>> http://www.unidata.ucar.edu/mailing_lists/
>>
>


