Re: [thredds] WMS servers side image cache

  • To: Guan Wang <gwang@xxxxxxx>
  • Subject: Re: [thredds] WMS servers side image cache
  • From: Heiko Klein <Heiko.Klein@xxxxxx>
  • Date: Tue, 23 Aug 2011 10:21:49 +0200


On 2011-08-23 03:13, Guan Wang wrote:
Hi Keiko,

Do you expect that TDS should implement some sort of FileSystemMonitor that 
automatically detects changes on the file system?

Yes, I would like that.


Does this exist in the JDK? Apache Commons JCI?

The most primitive, and maybe the most widely used, is java.io.File.lastModified(). File modification times are what Apache, Jetty and Tomcat use to answer If-Modified-Since requests when serving static content.
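
A minimal sketch of that check, assuming the If-Modified-Since header has already been parsed to epoch milliseconds. The class and method names are hypothetical and not part of TDS; only java.io.File.lastModified() comes from the JDK.

```java
import java.io.File;

// Hypothetical helper (not TDS code): decides whether a conditional request
// could be answered with "304 Not Modified".
class ModifiedCheck {

    // ifModifiedSinceMillis: the parsed If-Modified-Since value,
    // or a negative number when the client sent no such header.
    static boolean notModified(long fileLastModifiedMillis, long ifModifiedSinceMillis) {
        if (ifModifiedSinceMillis < 0 || fileLastModifiedMillis <= 0) {
            return false; // no header, or the modification time is unknown
        }
        // HTTP dates have one-second resolution, so truncate before comparing.
        long truncated = (fileLastModifiedMillis / 1000) * 1000;
        return truncated <= ifModifiedSinceMillis;
    }

    // Convenience overload; File.lastModified() returns 0 for a missing file.
    static boolean notModified(File dataFile, long ifModifiedSinceMillis) {
        return notModified(dataFile.lastModified(), ifModifiedSinceMillis);
    }
}
```

This only covers a single file on local disk; an aggregation would need the newest modification time of all its members.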

Aggregation will be the biggest problem, in particular from remote TDS/OPeNDAP servers.


Personally, I think the "faster algorithms or data-caches" are also very 
important for an application like TDS.

Faster algorithms are very important, but on a completely different scale. Algorithms are important for single/first-time users; caching is for the masses.


BTW, Keiko, what's the size of your dataset?

Single files are ~ 1GB, the virtual aggregated datasets are TB. See e.g. http://thredds.met.no/thredds/myocean/ARC-MFC/myoceanv1-class1-arctic.html


BTW: Keiko was the name of the orca in Free Willy.


Regards,

Heiko


Thanks,

Guan

----- Original Message -----
From: "Heiko Klein"<Heiko.Klein@xxxxxx>
To: thredds@xxxxxxxxxxxxxxxx
Sent: Monday, August 22, 2011 4:03:15 AM
Subject: Re: [thredds] WMS servers side image cache

Hi,

I think the important thing in caching is the detection of 'when did
something change', rather than faster algorithms or data-caches to
create pictures.

Currently, to deliver useful response times, we use a simple expiration
filter in thredds web.xml:

<filter-mapping>
    <filter-name>Cache10dayFilter</filter-name>
    <url-pattern>/wms/*</url-pattern>
</filter-mapping>

Through the url-pattern, this can be tuned for the different datasets.
Then, we add an off-the-shelf web-cache (varnish, squid, apache
mod_disk_cache) to do the actual caching.
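
As an illustration of what such an expiration filter has to decide, here is a rough sketch that picks a Cache-Control value per URL prefix. It is not the Cache10dayFilter itself: the class name, the "/wms/forecast/" prefix and the one-hour lifetime are assumptions made up for the example.

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical rule table: maps URL path prefixes to cache lifetimes.
class ExpiryRules {
    // Insertion order is kept, so more specific prefixes should be added first.
    private final Map<String, Duration> lifetimes = new LinkedHashMap<>();

    ExpiryRules add(String pathPrefix, Duration lifetime) {
        lifetimes.put(pathPrefix, lifetime);
        return this;
    }

    // Returns the Cache-Control value of the first matching prefix,
    // or "no-cache" when no rule applies.
    String cacheControlFor(String requestPath) {
        for (Map.Entry<String, Duration> rule : lifetimes.entrySet()) {
            if (requestPath.startsWith(rule.getKey())) {
                return "public, max-age=" + rule.getValue().getSeconds();
            }
        }
        return "no-cache";
    }
}
```

A servlet filter would set the returned value as a response header, and the web-cache in front then honours it; ten days correspond to max-age=864000.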


Performance is excellent; the problem is: how do we detect updates to
the data files? I don't know how well thredds itself detects those
changes in its cache - I have often used the thredds admin-console
'disable caches' to 'fix' data updates. Support for 'If-Modified-Since'
or 'ETags' might be more useful than another cache level inside thredds.
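
One possible way to derive an ETag, sketched under the assumption that a dataset's state is described well enough by the modification time and size of its file. The class is hypothetical and not something thredds provides.

```java
import java.io.File;

// Hypothetical helper: builds a weak ETag from file metadata.
class WeakETag {

    // The tag changes whenever the modification time or the size changes.
    static String of(long lastModifiedMillis, long lengthBytes) {
        return "W/\"" + Long.toHexString(lastModifiedMillis)
                + "-" + Long.toHexString(lengthBytes) + "\"";
    }

    static String of(File dataFile) {
        return of(dataFile.lastModified(), dataFile.length());
    }

    // True when the client's If-None-Match value equals the current tag,
    // in which case a 304 response without a body would be enough.
    static boolean matches(String ifNoneMatch, String currentTag) {
        return ifNoneMatch != null && ifNoneMatch.trim().equals(currentTag);
    }
}
```

The web-cache can then revalidate a stored image cheaply instead of waiting for the expiry time to pass.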

Regards,

Heiko

On 2011-08-19 18:20, Guan Wang wrote:
Hi Jon,

Thank you for the follow up on this one!

Here is my wish-list on caching:

1. Cache for historical datasets is a good idea as long as the costs are
acceptable;

2. Cache for current datasets that are subject to change could be offered in a
way that lets the user set the expiry time;

3. Caching images is not necessary, but the ncWMS algorithm for data
re-sampling/interpolation needs improvement (a cache could be an
alternative??); correct me if this has already been done for irregular
grids in the new release;

4. In order to improve the cache hit rate and limit resource consumption,
a "user-controlled cache" may be necessary. Finer configuration levels
could be offered, say caching a dataset only for a particular period,
a specific area or certain parameters.

Thanks,

Guan

-----Original Message-----
From: thredds-bounces@xxxxxxxxxxxxxxxx
[mailto:thredds-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Jon Blower
Sent: Friday, August 19, 2011 11:50 AM
To: thredds@xxxxxxxxxxxxxxxx
Subject: Re: [thredds] WMS servers side image cache

Hi Jay, Guan, Ethan, all,

Just to pick up on this old thread, sorry for my late arrival!  As Guan
says, ncWMS does cache extracted data arrays: it doesn't cache images, but
the arrays of extracted data that are used to generate images. This allows
fast altering of image styling parameters like the colour map: the data
don't have to be re-extracted to change the palette.

However, this isn't in THREDDS yet.  I think a big part of the problem is
ensuring cache consistency - it's hard to know when an underlying dataset
might have changed.  A partial solution could be found by setting a very
short expiry time on items in the cache (say 1 minute), which might give
some benefit for visual data browsing in Godiva2, while ensuring that the
data were unlikely to be out of date.
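
A minimal sketch of such a short-expiry cache, assuming a plain in-memory map rather than Ehcache or the actual ncWMS code. The class name is hypothetical, and the current time is passed in so the behaviour can be checked without waiting.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical time-limited cache: entries are reloaded once they are
// older than the configured time-to-live.
class ExpiringCache<K, V> {

    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;

    ExpiringCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Returns the cached value, calling the loader when the entry is missing
    // or expired. Two threads may load the same key at once; that is
    // tolerated here to keep the sketch short.
    V get(K key, long nowMillis, Supplier<V> loader) {
        Entry<V> entry = entries.get(key);
        if (entry == null || nowMillis >= entry.expiresAtMillis) {
            entry = new Entry<>(loader.get(), nowMillis + ttlMillis);
            entries.put(key, entry);
        }
        return entry.value;
    }
}
```

With a time-to-live of 60000 ms, repeated requests within a minute reuse the extracted array, and stale data is served for at most one minute.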

With a bit more head-scratching this could probably be done - any more votes
from the user community...?

Cheers, Jon


------------------------------

Message: 2
Date: Thu, 12 May 2011 13:20:54 -0600
From: Ethan Davis<edavis@xxxxxxxxxxxxxxxx>
To: thredds@xxxxxxxxxxxxxxxx
Subject: Re: [thredds] WMS servers side image  cache
Message-ID:<4DCC3316.100@xxxxxxxxxxxxxxxx>
Content-Type: text/plain; charset=ISO-8859-1

Hi Jay,

The TDS does use ehcache in a number of ways, I believe mainly for tracking
collections of real-time updated files on disk. John can go into the details
if you're interested.

The TDS does not use the ncWMS data array caching code that Guan mentions
below. Most of the current TDS caching is at the dataset level rather than
at the level of the actual data.

Ethan

On 5/12/2011 12:41 PM, Jay Alder wrote:
There is an ehcache.xml spring config file in WEB-INF which looks
promising, but I don't see anything about WMS in the cache documentation.

http://www.unidata.ucar.edu/projects/THREDDS/tech/reference/ThreddsConfigXMLFile.html#Cache_Locations


My cache directories are essentially empty. Can someone confirm or
deny thredds WMS is using or could use ehcache? It would be great if I
could use ehcache to cache the data for my tiles (and have them
persist) so I can change the image styling on the fly (best of both
worlds).



On 05/12/2011 10:36 AM, Guan Wang wrote:
In the original ncWMS package, caching has been considered through
Ehcache.

http://www.resc.rdg.ac.uk/trac/ncWMS/browser/trunk/src/java/uk/ac/rdg/resc/ncwms/cache/TileCache.java

Not sure if this is still kept after the integration with THREDDS.

Guan
-----Original Message-----
From: thredds-bounces@xxxxxxxxxxxxxxxx
[mailto:thredds-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Jay Alder
Sent: Thursday, May 12, 2011 1:09 PM
To: THREDDS Users
Subject: [thredds] WMS servers side image cache

Hi, I'm playing around with using Thredds WMS using a tiling scheme
similar to google maps and I was wondering if there was a way (or
something that could be added in the future) to cache WMS maps from
the thredds servlet? Since I'm using tiles, the requests are
identical. I was thinking thredds would be able to hash (say md5) the
request URL and write the image to a temporary folder. On the second
request thredds would look in the cache based on the hash and return
the image if it's there, otherwise generate a new image.  Thredds would
likely need to wipe the cache on a restart in case the data changed.
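
The idea above could be sketched roughly as follows. This is an illustration, not existing thredds code: the class name, the Renderer interface and the ".img" suffix are assumptions, and wiping the directory on restart is left out.

```java
import java.io.IOException;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical file cache keyed by the MD5 hash of the request URL.
class TileFileCache {

    // Stand-in for whatever produces the image bytes for a request.
    interface Renderer {
        byte[] render(String requestUrl) throws IOException;
    }

    private final Path cacheDir;

    TileFileCache(Path cacheDir) {
        this.cacheDir = cacheDir;
    }

    // 32-character lowercase hex MD5 of the URL, used as the file name.
    static String md5Hex(String text) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(text.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Returns the cached image if present, otherwise renders and stores it.
    byte[] get(String requestUrl, Renderer renderer) throws IOException {
        Path file = cacheDir.resolve(md5Hex(requestUrl) + ".img");
        if (Files.exists(file)) {
            return Files.readAllBytes(file);
        }
        byte[] image = renderer.render(requestUrl);
        Files.write(file, image);
        return image;
    }
}
```

Two requests only share a cache entry when their URLs are identical character for character, so a tiling client needs to emit its parameters in a stable order.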

Using IDL I wrote some code that downloaded all the tiles at all
levels for a particular dataset, but it took about 12 hours to run
and generated 400,000+ tiles for only one variable. If thredds cached
images it would be a more "on demand" approach rather than trying to
pre-generate millions of tiles. In the tiled world, caching would save
a lot of server-side processing (especially for high-resolution
datasets) and speed up the user experience.

-Jay Alder


_______________________________________________
thredds mailing list
thredds@xxxxxxxxxxxxxxxx
For list information or to unsubscribe,  visit: 
http://www.unidata.ucar.edu/mailing_lists/

--
Dr. Heiko Klein                              Tel. + 47 22 96 32 58
Development Section / IT Department          Fax. + 47 22 69 63 55
Norwegian Meteorological Institute           http://www.met.no
P.O. Box 43 Blindern  0313 Oslo NORWAY


