data file compression / cache / bz2?

John Caron caron at unidata.ucar.edu
Fri Aug 11 16:44:48 MDT 2006



Rob Cermak wrote:
> On Fri, August 11, 2006 12:15 pm, John Caron wrote:
> 
>>Hi Rob:
>>
>>The files should be in content/thredds/cache/ and TDS allows up to
>>1 Gbyte before deleting oldest files.
> 
> 
> This is the version we have:
> 
> THREDDS Data Server Version 3.10.0   Build Date = 2006-06-05
> 17:47:42
> 
> Hmmm.  This is not the behavior we experienced here.
> 
> Excerpt from the catalog.xml:
> 
>   <service name="robDODS" serviceType="OpenDAP"
>     base="/thredds/dodsC/">
>     <datasetRoot path="data" location="/space/data/GFS/"/>
>   </service>
> 
>   <datasetScan name="GFS" ID="GFSscan"
>     path="GFS" location="/space/data/GFS">
> 
>     <metadata inherited="true">
>       <serviceName>robDODS</serviceName>
>     </metadata>
> 
>     <filter>
>       <include wildcard="*"/>
>     </filter>
> 
>   </datasetScan>
> 
> The tomcat installation is in a completely separate path to the
> data.  The data is in /space/data/GFS.
> 
> When we access the data via thredds it unpacks it in the directory
> where it resides.
> 
> [cermak at daved thredds]$ ls -lh /space/data/GFS
> total 1.3G
> -rw-r--r--  1 cermak users 306M Jul 20 20:56 gfs_20060720_00z.nc
> -rw-r--r--  1 cermak users 306M Jul 20 20:56 gfs_20060720_06z.nc
> -rw-r--r--  1 cermak users  22M Jul 20 20:56 gfs_20060720_12z.nc.bz2
> -rw-r--r--  1 cermak users 306M Aug 11 11:37 gfs_20060720_18z.nc
> -rw-r--r--  1 cermak users  48M Jul 20 20:57 gfs_20060720_18z.nc.gz

Yes, sorry, I misspoke (miswrote?). It unpacks in the same directory if it can. If it doesn't have write permission in the data directory, then it uses the cache directory as I described.
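
To make the rule above concrete, here is a minimal sketch in Python (not the actual TDS code, which is Java). The function names, the `is_writable` hook, and the use of the bare file name inside the cache directory are all assumptions for illustration; a real cache would need names that cannot collide across data directories.

```python
import os
from pathlib import Path

# Suffixes that are treated as compressed, as listed later in this thread.
COMPRESSED_SUFFIXES = (".Z", ".zip", ".gzip", ".gz", ".bz2")


def dir_is_writable(directory):
    """Default check: can the server process write into this directory?"""
    return os.access(directory, os.W_OK)


def uncompressed_target(data_file, cache_dir, is_writable=dir_is_writable):
    """Return the path where the uncompressed copy would be written.

    Hypothetical helper: unpack next to the data file when that directory
    is writable, otherwise fall back to the cache directory.
    """
    src = Path(data_file)
    if src.suffix not in COMPRESSED_SUFFIXES:
        return src  # not compressed, nothing to unpack
    target_name = src.name[: -len(src.suffix)]  # drop only the last suffix
    if is_writable(src.parent):
        return src.parent / target_name
    # Simplification: the bare name could collide between data directories.
    return Path(cache_dir) / target_name
```

With Rob's listing, a writable /space/data/GFS would give /space/data/GFS/gfs_20060720_12z.nc; a read-only one would give a file of the same name under the cache directory.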


> 
> 
>>It is possible to change the place
> 
> 
> I may not have discovered where this setting is.  By default, it
> does not seem to be: content/thredds/cache/
> 
> 
>>but not yet the maximum size of the cache. Another possible
>>cache strategy is to delete files after a certain age (which makes
>>more sense for a rolling archive like motherlode).
>>
>>I'm going to make some/all of this user settable. What would you
>>like to do?
> 
> 
> Unpacking in place rather than a cache directory will simplify
> things.  Thredds will have to keep an internal list of files it
> recently unpacked.  I'm not an expert - whatever makes logical sense
> to the developers is fine.  Being able to set an age or space
> utilization watermark, cache location (if desired), would be good.
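
The two policies discussed here (a size limit that deletes the oldest files first, and an age limit) could look roughly like the sketch below. This is only an illustration in Python, not the TDS implementation; the function names and the use of file modification time as the measure of "oldest" are assumptions.

```python
import time
from pathlib import Path


def evict_by_size(cache_dir, max_bytes):
    """Delete oldest files (by mtime) until the cache fits in max_bytes."""
    files = sorted(
        (p for p in Path(cache_dir).iterdir() if p.is_file()),
        key=lambda p: p.stat().st_mtime,
    )
    total = sum(p.stat().st_size for p in files)
    removed = []
    for p in files:
        if total <= max_bytes:
            break
        size = p.stat().st_size
        p.unlink()
        total -= size
        removed.append(p.name)
    return removed


def evict_by_age(cache_dir, max_age_seconds, now=None):
    """Delete every file whose mtime is older than max_age_seconds."""
    now = time.time() if now is None else now
    removed = []
    for p in Path(cache_dir).iterdir():
        if p.is_file() and now - p.stat().st_mtime > max_age_seconds:
            p.unlink()
            removed.append(p.name)
    return removed
```

The size policy matches the "up to 1 Gbyte before deleting oldest files" behavior described earlier, for example `evict_by_size(cache_dir, 10**9)`. Neither function should be pointed at a data directory used for in-place unpacking unless it is first restricted to files the server itself created, which is why the internal list Rob mentions would be needed.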
> 
> Understanding the caching mechanism will help us understand how to
> structure the data on the server; allow enough space for a partition
> for a cache directory or enough spare space for in place
> uncompression.  As much as I like to keep things unpacked, to put
> more data onto the server, we will have to compress things.  This is
> a nice feature of opendap server 3.
> 
> 
>>If file ends with ".Z", ".zip", ".gzip", ".gz", or ".bz2", it will
>>uncompress/unzip and write to new file without the suffix.
> 
> 
> It would be great to have the majority of compression methods!
> 
> I tried with .bz2 and it failed to recognize the file.
> 
> ]DODServlet ERROR: Cant read
> /space/data/GFS/gfs_20060720_12z.nc.bz2: not a valid NetCDF file.

You will need the new version, 3.12. It's on the web site, but we haven't announced it yet; we are still testing it.
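
For anyone who wants to check a file by hand in the meantime, the suffix rule quoted above (uncompress and write to a new file without the suffix) can be sketched with the Python standard library. This is an illustration only, not the TDS code; the function name and the `dest_dir` parameter are assumptions, ".zip" is assumed to hold a single member, and ".Z" is left out because the standard library has no reader for it.

```python
import bz2
import gzip
import shutil
import zipfile
from pathlib import Path


def uncompress(src, dest_dir=None):
    """Write an uncompressed copy of src, named without its last suffix."""
    src = Path(src)
    dest = Path(dest_dir or src.parent) / src.stem  # "a.nc.bz2" -> "a.nc"
    suffix = src.suffix

    if suffix == ".zip":
        # Assumption: the archive contains exactly one data file.
        with zipfile.ZipFile(src) as zf:
            member = zf.namelist()[0]
            with zf.open(member) as fin, open(dest, "wb") as fout:
                shutil.copyfileobj(fin, fout)
        return dest

    if suffix in (".gz", ".gzip"):
        opener = gzip.open
    elif suffix == ".bz2":
        opener = bz2.open
    else:
        # ".Z" (Unix compress) and anything else is not handled here.
        raise ValueError(f"unsupported suffix: {suffix!r}")

    # Stream the data so large files are not loaded into memory at once.
    with opener(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest
```

Calling `uncompress("/space/data/GFS/gfs_20060720_12z.nc.bz2")` would write gfs_20060720_12z.nc alongside it, provided the directory is writable.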

> 
> We can follow up next week.  I can bring up the test server so we
> can poke around.  Have a good weekend!

bon weekend!

> 
> ==============================================================================
> To unsubscribe thredds, visit:
> http://www.unidata.ucar.edu/mailing-list-delete-form.html
> ==============================================================================



