Re: data file compression / cache / bz2?



Rob Cermak wrote:
On Fri, August 11, 2006 12:15 pm, John Caron wrote:

Hi Rob:

The files should be in content/thredds/cache/ and TDS allows up to
1 Gbyte before deleting oldest files.


This is the version we have:

THREDDS Data Server Version 3.10.0   Build Date = 2006-06-05 17:47:42

Hmmm.  This is not the behavior we experienced here.

Excerpt from the catalog.xml:

  <service name="robDODS" serviceType="OpenDAP" base="/thredds/dodsC/">
    <datasetRoot path="data" location="/space/data/GFS/"/>
  </service>

  <datasetScan name="GFS" ID="GFSscan"
    path="GFS" location="/space/data/GFS">

    <metadata inherited="true">
      <serviceName>robDODS</serviceName>
    </metadata>

    <filter>
      <include wildcard="*"/>
    </filter>

  </datasetScan>

The tomcat installation is in a completely separate path to the
data.  The data is in /space/data/GFS.

When we access the data via thredds it unpacks it in the directory
where it resides.

[cermak@daved thredds]$ ls -lh /space/data/GFS
total 1.3G
-rw-r--r--  1 cermak users 306M Jul 20 20:56 gfs_20060720_00z.nc
-rw-r--r--  1 cermak users 306M Jul 20 20:56 gfs_20060720_06z.nc
-rw-r--r--  1 cermak users  22M Jul 20 20:56 gfs_20060720_12z.nc.bz2
-rw-r--r--  1 cermak users 306M Aug 11 11:37 gfs_20060720_18z.nc
-rw-r--r--  1 cermak users  48M Jul 20 20:57 gfs_20060720_18z.nc.gz

Yes, sorry, I misspoke (miswrote?). It unpacks in the same directory if it can.
If it doesn't have write permission in the data directory, then it uses the
cache directory as I described.
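
The fallback described above can be sketched roughly as follows. This is only an illustration of the rule as stated in this thread, not the TDS code itself: the function name and the CACHE_DIR constant are hypothetical, and the check for write permission is an assumption about how "if it can" might be decided.

```python
import os
from pathlib import Path

# Hypothetical default; the thread only says the cache is content/thredds/cache/.
CACHE_DIR = Path("content/thredds/cache")


def uncompressed_target(compressed, cache_dir=CACHE_DIR):
    """Return the path where the uncompressed copy would be written.

    Prefer the directory that holds the compressed file; fall back to the
    cache directory when that directory is not writable.
    """
    compressed = Path(compressed)
    # Drop only the last suffix: gfs_20060720_12z.nc.bz2 -> gfs_20060720_12z.nc
    name = compressed.with_suffix("").name
    data_dir = compressed.parent
    if os.access(data_dir, os.W_OK):
        return data_dir / name
    return Path(cache_dir) / name
```

With the listing above, a writable /space/data/GFS would give /space/data/GFS/gfs_20060720_12z.nc, which matches the unpacked files Rob sees next to the compressed ones.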




It is possible to change the place


I may not have discovered where this setting is.  By default, it
does not seem to be: content/thredds/cache/


but not yet the maximum size of the cache. Another possible
cache strategy is to delete files after a certain age (which makes
more sense for a rolling archive like motherlode).

I'm going to make some/all of this user settable. What would you
like to do?


Unpacking in place rather than a cache directory will simplify
things.  Thredds will have to keep an internal list of files it
recently unpacked.  I'm not an expert - whatever makes logical sense
to the developers is fine.  Being able to set an age or space
utilization watermark, cache location (if desired), would be good.
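
To make the two policies discussed here concrete (a space limit that drops the oldest files first, and an age limit), here is a minimal sketch. It is not how TDS implements its cache; the function name and parameters are hypothetical, and using file modification time as the measure of "oldest" is an assumption.

```python
import time
from pathlib import Path


def files_to_evict(cache_dir, max_bytes=None, max_age_seconds=None, now=None):
    """Return cached files to delete, oldest first.

    Files older than max_age_seconds are always selected. After that, the
    oldest remaining files are selected until the total size of what is
    kept is at or below max_bytes. Either limit may be None (disabled).
    """
    now = time.time() if now is None else now
    entries = []
    for path in Path(cache_dir).iterdir():
        if path.is_file():
            info = path.stat()
            entries.append((info.st_mtime, info.st_size, path))
    entries.sort(key=lambda entry: entry[0])  # oldest modification time first

    evict, keep = [], []
    for mtime, size, path in entries:
        if max_age_seconds is not None and now - mtime > max_age_seconds:
            evict.append(path)
        else:
            keep.append((size, path))

    if max_bytes is not None:
        total = sum(size for size, _ in keep)
        for size, path in keep:  # still ordered oldest first
            if total <= max_bytes:
                break
            evict.append(path)
            total -= size
    return evict
```

The 1 Gbyte behaviour mentioned earlier in the thread would correspond to max_bytes=1024**3 with no age limit; a rolling archive would more likely set only max_age_seconds.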

Understanding the caching mechanism will help us understand how to
structure the data on the server; allow enough space for a partition
for a cache directory or enough spare space for in place
uncompression.  As much as I like to keep things unpacked, to put
more data onto the server, we will have to compress things.  This is
a nice feature of opendap server 3.


If file ends with ".Z", ".zip", ".gzip", ".gz", or ".bz2", it will
uncompress/unzip and write to new file without the suffix.
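
As a rough illustration of the suffix rule above, the following sketch strips a recognized suffix and writes the uncompressed bytes to a new file. It only covers the suffixes that the Python standard library can stream directly (".gz", ".gzip", ".bz2"); ".Z" and ".zip" are left out. The names are hypothetical, and reusing an uncompressed file that already exists is an assumption, not something the thread confirms.

```python
import bz2
import gzip
import shutil
from pathlib import Path

# Subset of the suffixes listed in the thread; ".Z" and ".zip" are not handled here.
OPENERS = {".gz": gzip.open, ".gzip": gzip.open, ".bz2": bz2.open}


def uncompress(path, out_dir=None):
    """Uncompress path into out_dir (default: same directory), minus the suffix.

    Returns the path of the file to read. Files without a recognized
    suffix are returned unchanged.
    """
    path = Path(path)
    opener = OPENERS.get(path.suffix)
    if opener is None:
        return path
    target = Path(out_dir or path.parent) / path.with_suffix("").name
    if not target.exists():  # assumed: reuse a copy that was unpacked earlier
        with opener(path, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return target
```

For example, uncompress("/space/data/GFS/gfs_20060720_18z.nc.gz") would return /space/data/GFS/gfs_20060720_18z.nc, writing it only if it is not already there.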


It would be great to have the majority of compression methods!

I tried with .bz2 and it failed to recognize the file.

]DODServlet ERROR: Cant read
/space/data/GFS/gfs_20060720_12z.nc.bz2: not a valid NetCDF file.

You will need the new version, 3.12. It's on the web site, but we haven't
announced it yet; we're still testing it.


We can follow up next week.  I can bring up the test server so we
can poke around.  Have a good weekend!

bon weekend!


==============================================================================
To unsubscribe thredds, visit:
http://www.unidata.ucar.edu/mailing-list-delete-form.html
==============================================================================


