Re: [thredds] WMS servers side image cache

  • To: Heiko Klein <Heiko.Klein@xxxxxx>
  • Subject: Re: [thredds] WMS servers side image cache
  • From: Guan Wang <gwang@xxxxxxx>
  • Date: Mon, 22 Aug 2011 21:13:44 -0400 (EDT)
Hi Heiko,

Do you expect that TDS should implement some sort of FileSystemMonitor that
could automatically detect changes on the file system?

Does something like this exist in the JDK, or in Apache Commons JCI?
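
For what it's worth, Java 7 ships a directory-watching API,
java.nio.file.WatchService. Below is a minimal sketch of how a data directory
could be monitored with it. The class DatasetDirWatcher and its two methods are
made-up names for illustration and are not part of TDS. Note that one
registration covers a single directory, not its subdirectories.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not TDS code: watches one data directory for changes.
final class DatasetDirWatcher {

    // Register the directory for create/modify/delete notifications.
    static WatchService register(Path dir) throws IOException {
        WatchService watcher = dir.getFileSystem().newWatchService();
        dir.register(watcher,
                StandardWatchEventKinds.ENTRY_CREATE,
                StandardWatchEventKinds.ENTRY_MODIFY,
                StandardWatchEventKinds.ENTRY_DELETE);
        return watcher;
    }

    // Wait up to timeoutMillis for the first batch of events and return the
    // affected file names, relative to the watched directory.
    static List<String> awaitChanges(WatchService watcher, long timeoutMillis)
            throws InterruptedException {
        List<String> changed = new ArrayList<>();
        WatchKey key = watcher.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (key == null) {
            return changed; // nothing happened within the timeout
        }
        for (WatchEvent<?> event : key.pollEvents()) {
            if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                continue; // events were lost; a real server should rescan here
            }
            changed.add(event.context().toString());
        }
        key.reset(); // re-arm the key so later changes are reported too
        return changed;
    }
}
```

A server could call awaitChanges in a background thread and drop the cache
entries for whatever file names come back. On platforms without native change
notification the JDK may fall back to polling, so events can arrive late.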

Personally, I think the "faster algorithms or data-caches" are also very 
important for an application like TDS.

BTW, Heiko, what's the size of your dataset?

Thanks,

Guan

----- Original Message -----
From: "Heiko Klein" <Heiko.Klein@xxxxxx>
To: thredds@xxxxxxxxxxxxxxxx
Sent: Monday, August 22, 2011 4:03:15 AM
Subject: Re: [thredds] WMS servers side image cache

Hi,

I think the important thing in caching is the detection of 'when did 
something change', rather than faster algorithms or data-caches to 
create pictures.

Currently, to deliver useful response times, we use a simple expiration
filter in the thredds web.xml:

<filter-mapping>
   <filter-name>Cache10dayFilter</filter-name>
   <url-pattern>/wms/*</url-pattern>
</filter-mapping>

Through the url-pattern, this can be tuned for the different datasets.
Then, we add an off-the-shelf web-cache (varnish, squid, apache
mod_disk_cache) to do the actual caching.
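
The Cache10dayFilter class itself is not shown here. As a rough illustration
only, the header value such an expiration filter would presumably attach to
each /wms/* response could be computed as below. ExpiryHeaderSketch and
cacheControl are hypothetical names, and the real filter may well set other
headers (for example Expires) instead.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not the actual Cache10dayFilter: builds the value of a
// Cache-Control header that lets a downstream web-cache keep a response.
final class ExpiryHeaderSketch {

    // Returns e.g. "public, max-age=<seconds>" for the given lifetime in days.
    static String cacheControl(int days) {
        long seconds = TimeUnit.DAYS.toSeconds(days);
        return "public, max-age=" + seconds;
    }
}
```

A servlet filter would pass the result to
HttpServletResponse.setHeader("Cache-Control", ...) before handing the request
down the chain; varnish or squid then serve repeats without touching thredds.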


Performance is excellent, the problem is: How do we detect updates to
the data-files? I don't know how well thredds itself detects those
changes in its cache - I have often used the thredds admin-console
'disable caches' to 'fix' data-updates. Support for 'last changed since'
or 'etags' might be more useful than another cache-level inside of thredds.
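
To make the 'etags' idea concrete, here is a small sketch of how a validator
could be derived from a data file's modification time and size, and how an
incoming If-None-Match header could be checked against it. All names
(ConditionalGetSketch, etagFor, notModified) are invented for illustration and
nothing like this is claimed to exist in thredds. It also assumes one data file
per dataset; an aggregation would need to combine the metadata of all its files.

```java
import java.io.File;

// Hypothetical helper, not TDS code: HTTP validators from file metadata.
final class ConditionalGetSketch {

    // Build a quoted ETag from modification time and size (both in hex).
    static String etagFor(long lastModifiedMillis, long sizeBytes) {
        return "\"" + Long.toHexString(lastModifiedMillis)
                + "-" + Long.toHexString(sizeBytes) + "\"";
    }

    // Convenience overload reading the metadata from the data file itself.
    static String etagFor(File dataFile) {
        return etagFor(dataFile.lastModified(), dataFile.length());
    }

    // True when the client's If-None-Match header lists the current ETag,
    // in which case the server can answer "304 Not Modified".
    // Assumes the tags themselves contain no commas, which holds for etagFor.
    static boolean notModified(String ifNoneMatch, String currentEtag) {
        if (ifNoneMatch == null) {
            return false;
        }
        for (String candidate : ifNoneMatch.split(",")) {
            String tag = candidate.trim();
            if (tag.startsWith("W/")) {
                tag = tag.substring(2); // weak comparison ignores the W/ prefix
            }
            if (tag.equals("*") || tag.equals(currentEtag)) {
                return true;
            }
        }
        return false;
    }
}
```

With something like this, the web-cache could keep a short expiry time and
revalidate cheaply: an unchanged file costs one metadata lookup and a 304,
while an updated file changes the ETag and forces a fresh image.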

Regards,

Heiko

On 2011-08-19 18:20, Guan Wang wrote:
> Hi Jon,
>
> Thank you for the follow up on this one!
>
> Here is my wish-list on caching:
>
> 1. Caching historical datasets is a good idea as long as the costs are
> acceptable;
>
> 2. Caching for current datasets that are subject to change could be offered
> in a way that lets the user set the expiry time;
>
> 3. Caching images is not necessary. But an improvement to the ncWMS algorithm
> for data re-sampling/interpolation (an alternative could be through a
> cache??) is needed (correct me if this has been done in the new release
> against irregular grids);
>
> 4. In order to improve the cache hit rate and limit resource consumption,
> a "user controlled cache" may be necessary. Finer configuration levels could
> be offered, say caching a dataset only for a particular period, a specific
> area or certain parameters;
>
> Thanks,
>
> Guan
>
> -----Original Message-----
> From: thredds-bounces@xxxxxxxxxxxxxxxx
> [mailto:thredds-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Jon Blower
> Sent: Friday, August 19, 2011 11:50 AM
> To: thredds@xxxxxxxxxxxxxxxx
> Subject: Re: [thredds] WMS servers side image cache
>
> Hi Jay, Guan, Ethan, all,
>
> Just to pick up on this old thread, sorry for my late arrival!  As Guan
> says, ncWMS does cache extracted data arrays: it doesn't cache images, but
> the arrays of extracted data that are used to generate images. This allows
> fast altering of image styling parameters like the colour map: the data
> don't have to be re-extracted to change the palette.
>
> However, this isn't in THREDDS yet.  I think a big part of the problem is
> ensuring cache consistency - it's hard to know when an underlying dataset
> might have changed.  A partial solution could be found by setting a very
> short expiry time on items in the cache (say 1 minute), which might give
> some benefit for visual data browsing in Godiva2, while ensuring that the
> data were unlikely to be out of date.
>
> With a bit more head-scratching this could probably be done - any more votes
> from the user community...?
>
> Cheers, Jon
>
>
> ------------------------------
>
> Message: 2
> Date: Thu, 12 May 2011 13:20:54 -0600
> From: Ethan Davis<edavis@xxxxxxxxxxxxxxxx>
> To: thredds@xxxxxxxxxxxxxxxx
> Subject: Re: [thredds] WMS servers side image  cache
> Message-ID:<4DCC3316.100@xxxxxxxxxxxxxxxx>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hi Jay,
>
> The TDS does use ehcache in a number of ways, I believe mainly for tracking
> collections of real-time updated files on disk. John can go into the details
> if you're interested.
>
> The TDS does not use the ncWMS data array caching code that Guan mentions
> below. Most of the current TDS caching is at the dataset level rather than
> at the level of the actual data.
>
> Ethan
>
> On 5/12/2011 12:41 PM, Jay Alder wrote:
>> There is an ehcache.xml spring config file in WEB-INF which looks
>> promising but I don't see anything about WMS in the cache documentation.
>>
>> http://www.unidata.ucar.edu/projects/THREDDS/tech/reference/ThreddsConfigXMLFile.html#Cache_Locations
>>
>>
>> My cache directories are essentially empty. Can someone confirm or
>> deny thredds WMS is using or could use ehcache? It would be great if I
>> could use ehcache to cache the data for my tiles (and have them
>> persist) so I can change the image styling on the fly (best of both worlds).
>>
>>
>>
>> On 05/12/2011 10:36 AM, Guan Wang wrote:
>>> In the original ncWMS package, caching has been considered through
>>> Ehcache.
>>>
>>> http://www.resc.rdg.ac.uk/trac/ncWMS/browser/trunk/src/java/uk/ac/rdg/resc/ncwms/cache/TileCache.java
>>>
>>> Not sure if this is still kept after the integration with THREDDS.
>>>
>>> Guan
>>> -----Original Message-----
>>> From: thredds-bounces@xxxxxxxxxxxxxxxx
>>> [mailto:thredds-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Jay Alder
>>> Sent: Thursday, May 12, 2011 1:09 PM
>>> To: THREDDS Users
>>> Subject: [thredds] WMS servers side image cache
>>>
>>> Hi, I'm playing around with using Thredds WMS using a tiling scheme
>>> similar to google maps and I was wondering if there was a way (or
>>> something that could be added in the future) to cache WMS maps from
>>> the thredds servlet? Since I'm using tiles, the requests are
>>> identical. I was thinking thredds would be able to hash (say md5) the
>>> request url and write the image to a temporary folder. On the second
>>> request thredds would look in the cache based on the hash, and return
>>> the image if it's there, otherwise generate a new image.  Thredds would
>>> likely need to wipe the cache on a restart in case the data changed.
>>>
>>> Using IDL I wrote some code that downloaded all the tiles at all
>>> levels for a particular dataset, but it took about 12 hours to run
>>> and generated 400,000+ tiles for only one variable. If thredds cached
>>> images it would be a more "on demand" approach rather than trying to
>>> pre-generate millions of tiles. In the tiled world, caching would save
>>> a lot of server side processing (especially for high resolution
>>> datasets) and speed up the user experience.
>>>
>>> -Jay Alder
>>>
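
The hashing scheme Jay describes at the start of the thread (md5 of the
request url, image written to a folder under that name) could look roughly
like the sketch below. TileCacheKeySketch, cacheKey and cacheFile are
hypothetical names, and the ".png" suffix is an assumption about the requested
image format. This is not how thredds or ncWMS actually behave.

```java
import java.io.File;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not TDS or ncWMS code: maps a WMS request url to a
// file name inside a cache folder.
final class TileCacheKeySketch {

    // Hex-encoded MD5 of the full request url, always 32 characters long.
    static String cacheKey(String requestUrl) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(requestUrl.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Location where the rendered tile for this request would be stored.
    static File cacheFile(File cacheDir, String requestUrl) {
        return new File(cacheDir, cacheKey(requestUrl) + ".png");
    }
}
```

On a request the servlet would check whether cacheFile(...) exists and stream
it, otherwise render the tile and write it there. Two urls that differ only in
parameter order hash differently, so clients should build tile urls
consistently to get cache hits.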

_______________________________________________
thredds mailing list
thredds@xxxxxxxxxxxxxxxx
For list information or to unsubscribe,  visit: 
http://www.unidata.ucar.edu/mailing_lists/ 


