Re: [thredds] How are compressed netcdf4 files handled in TDS

To: thredds@xxxxxxxxxxxxxxxx
Subject: Re: [thredds] How are compressed netcdf4 files handled in TDS
From: John Caron <caron@xxxxxxxxxxxxxxxx>
Date: Mon, 25 Apr 2011 13:51:17 -0600

On 4/25/2011 1:46 PM, Peter Cornillon wrote:

On Apr 25, 2011, at 3:42 PM, John Caron wrote:
On 4/25/2011 1:37 PM, Roy Mendelssohn wrote:
Yes, internal compression. All the files were made from netcdf3 files using NCO with the options:
ncks -4 -L 1
The results so far show file sizes reduced to between 40% and 1/100th of the original file size. If requests for internally compressed data are cached differently than requests to netcdf3 files, we want to take that into account when we do the tests, so that we do not just see the effect of differential caching.
When we have done tests on just local files, the reads were about 8 times slower from a compressed file. But Rich Signell has found that the combination of serialization/bandwidth is the bottleneck, and you hardly notice the difference in a remote access situation. That is what we want to find out, because we run on very little money, and with compression as mentioned above our RAIDs would go a lot farther, as long as the hit to the access time is not too great.
Thanks,

-Roy
In netcdf4/hdf5, compression is tied to the chunking. Each chunk is individually compressed, and must be completely decompressed to retrieve even one value from that chunk. So the trick is to make your chunks correspond to your "common cases" of data access. If that's possible, you should find that compressed access is faster than non-compressed access, because IO is smaller. But it will be highly dependent on that.
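The per-chunk behaviour described above can be illustrated with a toy model. This is only a sketch using Python's standard zlib module, not the netcdf4/hdf5 library itself; the chunk length of 1000 values and the helper names write_chunks and read_value are hypothetical, chosen for illustration.

```python
import zlib
from array import array

CHUNK_LEN = 1000  # values per chunk (hypothetical choice for this sketch)

def write_chunks(values, chunk_len=CHUNK_LEN):
    """Compress each fixed-length run of float64 values separately."""
    chunks = []
    for start in range(0, len(values), chunk_len):
        raw = array("d", values[start:start + chunk_len]).tobytes()
        chunks.append(zlib.compress(raw))
    return chunks

def read_value(chunks, index, chunk_len=CHUNK_LEN):
    """Return (value, bytes_inflated) for a single element."""
    chunk_no, offset = divmod(index, chunk_len)
    # The whole chunk has to be inflated, even though one value is wanted.
    raw = zlib.decompress(chunks[chunk_no])
    vals = array("d")
    vals.frombytes(raw)
    return vals[offset], len(raw)

values = [float(i % 50) for i in range(10_000)]
chunks = write_chunks(values)
value, inflated = read_value(chunks, 4321)
print(value, inflated)
```

Reading one value inflates one full chunk and nothing else, so a request that lines up with the chunk layout touches few chunks, while one that cuts across it inflates many.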
John, is there a loss of efficiency when compressing chunks compared to compressing the entire file? I vaguely recall that for some compression algorithms, compression efficiency is a function of the volume of data compressed.
Peter


Hi Peter:

I think dictionary methods such as deflate get better as the file size goes up, but the tradeoff here is to try to decompress only the data you actually want. Decompressing very large files can be very costly.
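One rough way to see the size side of this tradeoff is to deflate the same buffer in independent pieces of different sizes and add up the results. The sketch below uses Python's standard zlib module at its default level on a synthetic, highly repetitive 64 KiB buffer; the data, the chunk sizes and the helper name total_compressed_size are assumptions for illustration, and real data will give different ratios.

```python
import zlib

def total_compressed_size(data, chunk_size):
    """Sum of deflate sizes when data is compressed in independent chunks."""
    return sum(
        len(zlib.compress(data[start:start + chunk_size]))
        for start in range(0, len(data), chunk_size)
    )

# Synthetic 64 KiB buffer: runs of 16 equal bytes, repeating every 4096 bytes.
data = bytes((i // 16) % 256 for i in range(65536))

# Compare small chunks, medium chunks, and the whole buffer as one stream.
totals = {size: total_compressed_size(data, size) for size in (64, 4096, len(data))}
for size, total in totals.items():
    print(f"chunk size {size:>6} bytes -> {total:>6} bytes compressed")
```

On this input the single whole-buffer stream comes out smallest, because every independent chunk starts with an empty dictionary and carries its own stream header. The cost is that nothing smaller than the whole buffer can be decompressed on its own.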


John

