Re: [thredds] WMS servers side image cache

Hi Heiko,

> I think the important thing in caching is the detection of 'when did 
> something change', rather than faster algorithms or data-caches to 
> create pictures.

Yes, certainly.  Ethan or John could correct me if I'm wrong, but I don't think 
THREDDS has a built-in way of detecting change, or for giving an expiry time 
for a dataset.  Any cache configuration (within or outside THREDDS) would 
probably need to have such parameters configured separately.

> Support for 'last changed since' 
> or 'etags' might be more useful than another cache-level inside of thredds.

HTTP cache control is a great solution if you want to cache resulting *images* 
(which you probably do in a production environment).  The reason for a separate 
layer of cache in standalone ncWMS is to allow *data arrays* to be cached - 
this allows the user to change the colour palette or scale range (hence 
generating a new image) without re-extracting data.  This doesn't exist in 
THREDDS, but may be a useful addition.
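
To illustrate the distinction, here is a rough Python sketch of a data-array cache. It is not the ncWMS code: the function names, the list of parameters in DATA_PARAMS and the `extract` callback are all assumptions made for the example. The point is only that the cache key is built from the parameters that select the data, so two requests that differ only in styling share one extracted array.

```python
from typing import Callable, Dict, List, Tuple

# Parameters assumed (for illustration only) to determine which data are
# extracted. Styling parameters such as the palette or the colour scale
# range are deliberately left out of the key.
DATA_PARAMS = ("layers", "bbox", "width", "height", "time", "elevation", "crs")

Grid = List[List[float]]
_cache: Dict[Tuple, Grid] = {}


def data_key(params: Dict[str, str]) -> Tuple:
    """Build a cache key from the data-selecting parameters only."""
    lowered = {k.lower(): v for k, v in params.items()}
    return tuple((name, lowered.get(name)) for name in DATA_PARAMS)


def get_data(params: Dict[str, str],
             extract: Callable[[Dict[str, str]], Grid]) -> Grid:
    """Return the cached array for this request, extracting it on a miss."""
    key = data_key(params)
    if key not in _cache:
        _cache[key] = extract(params)  # hypothetical, expensive extraction
    return _cache[key]
```

A real implementation would also need an expiry or invalidation policy, which is exactly the change-detection problem discussed in this thread.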

Cheers, Jon


Message: 1
Date: Mon, 22 Aug 2011 10:03:15 +0200
From: Heiko Klein <Heiko.Klein@xxxxxx>
To: thredds@xxxxxxxxxxxxxxxx
Subject: Re: [thredds] WMS servers side image cache
Message-ID: <4E520D43.4010108@xxxxxx>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed


I think the important thing in caching is the detection of 'when did something 
change', rather than faster algorithms or data-caches to create pictures.

Currently, to deliver useful response times, we use a simple expiration filter 
in thredds web.xml:


Through the url-pattern, this can be tuned for the different datasets. 
Then, we add an off-the-shelf web-cache (Varnish, Squid, Apache
mod_disk_cache) to do the actual caching.

Performance is excellent, the problem is: How do we detect updates to 
the data-files? I don't know how well thredds itself detects those 
changes in its cache - I have often used the thredds admin-console 
'disable caches' to 'fix' data-updates. Support for 'last changed since' 
or 'etags' might be more useful than another cache-level inside of thredds.
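
As a sketch of what such support could look like (in Python rather than inside thredds, and only as an illustration): derive an ETag and a Last-Modified value from the data file's modification time and size, then answer conditional requests from them. The function names are hypothetical, and the comparison is simplified: it ignores weak validators and the `*` form of If-None-Match.

```python
import os
from datetime import timezone
from email.utils import formatdate, parsedate_to_datetime


def validators(path):
    """Return (etag, last_modified) derived from the file's mtime and size."""
    st = os.stat(path)
    mtime = int(st.st_mtime)  # HTTP dates have one-second resolution
    etag = '"%x-%x"' % (mtime, st.st_size)
    return etag, formatdate(mtime, usegmt=True)


def is_not_modified(path, request_headers):
    """True if a 304 Not Modified response would be appropriate."""
    etag, _ = validators(path)
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # If-None-Match takes precedence over If-Modified-Since.
        return etag in [t.strip() for t in if_none_match.split(",")]
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return False  # unparseable date: send the full response
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        return int(os.stat(path).st_mtime) <= since.timestamp()
    return False
```

This only helps if the file's modification time really changes whenever the data change, and an aggregation of many files would need a validator that covers all of them.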



On 2011-08-19 18:20, Guan Wang wrote:
> Hi Jon,
> Thank you for the follow up on this one!
> Here is my wish-list on caching:
> 1. Cache for historical datasets is a good idea as long as the costs are
> acceptable;
> 2. Cache for current datasets that are subject to change could be offered in a
> way that lets the user set the expiry time;
> 3. Caching images is not necessary. But an improvement to the algorithm in ncWMS
> for data re-sampling/interpolation (alternative could be through cache??) is
> needed (correct me if this has been done in the new release for
> irregular grids);
> 4. In order to improve the cache hit rate and limit the consumption of
> resources, a "user controlled cache" may be necessary. Finer configuration
> levels could be offered, say only caching the dataset for a particular
> period, in a specific area or for certain parameters;
> Thanks,
> Guan
> -----Original Message-----
> From: thredds-bounces@xxxxxxxxxxxxxxxx
> [mailto:thredds-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Jon Blower
> Sent: Friday, August 19, 2011 11:50 AM
> To: thredds@xxxxxxxxxxxxxxxx
> Subject: Re: [thredds] WMS servers side image cache
> Hi Jay, Guan, Ethan, all,
> Just to pick up on this old thread, sorry for my late arrival!  As Guan
> says, ncWMS does cache extracted data arrays: it doesn't cache images, but
> the arrays of extracted data that are used to generate images. This allows
> fast altering of image styling parameters like the colour map: the data
> don't have to be re-extracted to change the palette.
> However, this isn't in THREDDS yet.  I think a big part of the problem is
> ensuring cache consistency - it's hard to know when an underlying dataset
> might have changed.  A partial solution could be found by setting a very
> short expiry time on items in the cache (say 1 minute), which might give
> some benefit for visual data browsing in Godiva2, while ensuring that the
> data were unlikely to be out of date.
> With a bit more head-scratching this could probably be done - any more votes
> from the user community...?
> Cheers, Jon
> ------------------------------
> Message: 2
> Date: Thu, 12 May 2011 13:20:54 -0600
> From: Ethan Davis <edavis@xxxxxxxxxxxxxxxx>
> To: thredds@xxxxxxxxxxxxxxxx
> Subject: Re: [thredds] WMS servers side image cache
> Message-ID: <4DCC3316.100@xxxxxxxxxxxxxxxx>
> Content-Type: text/plain; charset=ISO-8859-1
> Hi Jay,
> The TDS does use ehcache in a number of ways. I believe mainly for tracking
> collections of real-time updated files on disk. John can go into the details
> if you're interested.
> The TDS does not use the ncWMS data array caching code that Guan mentions
> below. Most of the current TDS caching is at the dataset level rather than
> at the level of the actual data.
> Ethan
> On 5/12/2011 12:41 PM, Jay Alder wrote:
>> There is an ehcache.xml spring config file in WEB-INF which looks
>> promising but I don't see anything about WMS in the cache documentation.
>> figXMLFile.html#Cache_Locations
>> My cache directories are essentially empty. Can someone confirm or
>> deny thredds WMS is using or could use ehcache? It would be great if I
>> could use ehcache to cache the data for my tiles (and have them
>> persist) so I can change the image styling on the fly (best of both
>> worlds).
>> On 05/12/2011 10:36 AM, Guan Wang wrote:
>>> In the original ncWMS package, caching has been considered through
>>> Ehcache.
>>> /resc/ncwms/cache/
>>> Not sure if this is still kept after the integration with THREDDS.
>>> Guan
>>> -----Original Message-----
>>> From: thredds-bounces@xxxxxxxxxxxxxxxx
>>> [mailto:thredds-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Jay Alder
>>> Sent: Thursday, May 12, 2011 1:09 PM
>>> To: THREDDS Users
>>> Subject: [thredds] WMS servers side image cache
>>> Hi, I'm playing around with using Thredds WMS using a tiling scheme
>>> similar to google maps and I was wondering if there was a way (or
>>> something that could be added in the future) to cache WMS maps from
>>> the thredds servlet? Since I'm using tiles, the requests are
>>> identical. I was thinking thredds would be able to hash (say md5) the
>>> request url and write the image to temporary folder. On the second
>>> request thredds would look in the cache based on the hash, and return
>>> the image if it's there, otherwise generate a new image.  Thredds would
>>> likely need to wipe the cache on a restart in case the data changed.
>>> Using IDL I wrote some code that downloaded all the tiles at all
>>> levels for a particular dataset, but it took about 12 hours to run
>>> and generated 400,000+ tiles for only one variable. If thredds cached
>>> images it would be a more "on demand" approach rather than trying to
>>> pre generate millions of tiles. In the tiled world, caching would save
>>> a lot of server side processing (especially for high resolution
>>> datasets) and speed up the user experience.
>>> -Jay Alder
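
For reference, the scheme Jay describes in the original message can be sketched in a few lines of Python. This is only an illustration of the idea, not anything that exists in thredds: `cache_path`, `get_tile` and the `render` callback are hypothetical names, and sorting the query parameters before hashing is an added assumption so that equivalent requests share one cache entry.

```python
import hashlib
import os
import tempfile
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_path(url, cache_dir):
    """Map a request URL to a cache file name via an MD5 hash."""
    parts = urlsplit(url)
    # Sort the query parameters so that the same request with its
    # parameters in a different order hashes to the same key.
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    digest = hashlib.md5((parts.path + "?" + query).encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".img")


def get_tile(url, cache_dir, render):
    """Return the cached image for url, rendering and storing it on a miss."""
    path = cache_path(url, cache_dir)
    if os.path.exists(path):
        with open(path, "rb") as f:
            return f.read()
    image = render(url)  # hypothetical call that produces the image bytes
    # Write to a temporary file and rename it into place, so that a
    # concurrent reader never sees a partially written image.
    fd, tmp = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(image)
    os.replace(tmp, path)
    return image
```

As noted in the message, such a cache would have to be emptied whenever the underlying data might have changed, for example on a restart.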

End of thredds Digest, Vol 31, Issue 14
