Re: [netcdfgroup] netcdfgroup Digest, Vol 176, Issue 1

Hi Ted,

I've found (through very unscientific testing) that a good rule of thumb for selecting chunk sizes for compression is to pick a chunk whose size in bytes is a small multiple of your system's page size (e.g. 32K-128K for a 4K page). So you might try something like 119x119 if your dataset uses float or double data types.
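To check the arithmetic behind that suggestion, here is a minimal sketch that computes the uncompressed size of a 119x119 chunk for 4-byte floats and 8-byte doubles and compares it with the 32K-128K range. The 4K page size is the one assumed in the rule of thumb above, and the helper name `chunk_bytes` is made up for illustration.

```python
import math

PAGE_SIZE = 4 * 1024          # assumed 4K page, as in the rule of thumb
LOW, HIGH = 32 * 1024, 128 * 1024


def chunk_bytes(shape, itemsize):
    """Uncompressed size in bytes of one chunk with the given shape."""
    return math.prod(shape) * itemsize


for name, itemsize in (("float", 4), ("double", 8)):
    nbytes = chunk_bytes((119, 119), itemsize)
    in_range = LOW <= nbytes <= HIGH
    print(f"{name}: {nbytes} bytes, {nbytes / PAGE_SIZE:.1f} pages, in range: {in_range}")
```

Both cases land inside the range: 56644 bytes for float and 113288 bytes for double.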

BTW, if you use the default chunk sizes with compression, any attempt to read a time series at a specific lat/lon/level (assuming your data is organized as t/z/y/x) will require reading and decompressing the entire dataset, since every zyx block has to be decompressed to extract a single time point.
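A rough way to see this is to count how many chunks a point time series intersects. The sketch below assumes one chunk per time step covering the full z/y/x block, a hypothetical 50 time steps, and a hypothetical alternative chunk shape that spans the time axis; the helper names are made up for illustration and none of this calls the netCDF library.

```python
import math


def total_chunks(dims, chunks):
    """Number of chunks needed to tile an array of shape `dims`."""
    return math.prod(-(-d // c) for d, c in zip(dims, chunks))


def chunks_touched(start, count, chunks):
    """Number of chunks intersected by a hyperslab (per-dimension start and count)."""
    return math.prod(
        (s + n - 1) // c - s // c + 1 for s, n, c in zip(start, count, chunks)
    )


# Hypothetical variable: 50 time steps of a 120 x 119 x 119 (z, y, x) grid.
dims = (50, 120, 119, 119)

# Time series at one grid point: every time step, a single z/y/x index.
start, count = (0, 60, 59, 59), (50, 1, 1, 1)

per_record = (1, 120, 119, 119)    # assumed: one chunk per time step
time_spanning = (50, 10, 10, 10)   # hypothetical alternative spanning all times

for chunks in (per_record, time_spanning):
    touched = chunks_touched(start, count, chunks)
    print(chunks, "->", touched, "of", total_chunks(dims, chunks), "chunks read")
```

With one chunk per time step the read touches 50 of 50 chunks, i.e. everything; with the time-spanning shape it touches 1 of 1728. The trade-off is that the second layout makes writing one record at a time touch many chunks instead.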

Cheers, Joe

Howdy,

I started trying out netcdf 4.1.1-rc1 (with hdf5-1.8.4-patch1) and noticed that the write times from my application went up by a factor of 1.8 to 2 compared to the 20091122 snapshot. The problem showed up for larger files: say, 25MB of data (compressed) per record (in this case, about 40 3D arrays per time level with dimensions of 119x119x120). I did a test with less data per record, and rc1 doesn't show any slowdown there (it is perhaps even a little faster than 20091122).

I discovered that the problem is the default chunk sizes: ncdump -hs shows that the file created with 20091122 has _ChunkSizes = 1, 120, 119, 119 (the same as the actual dimensions), but rc1 is setting _ChunkSizes = 1, 102, 101, 101, which turns out to be much less efficient. When I altered my code to set the chunk sizes equal to the variable dimensions, the write time for rc1 went back down, too.
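One possible reason the second setting is less efficient can be sketched with simple arithmetic: a 120x119x119 record no longer fits in one chunk, so each variable is split across several chunks per record. The sketch below only counts chunks and chunk elements; it assumes every chunk, including edge chunks, is stored at the full chunk shape before compression, and the helper name `chunk_stats` is made up for illustration.

```python
import math


def chunk_stats(dims, chunks):
    """Return (number of chunks, total elements covered by those chunks)."""
    n_chunks = math.prod(-(-d // c) for d, c in zip(dims, chunks))
    return n_chunks, n_chunks * math.prod(chunks)


dims = (1, 120, 119, 119)            # one record of one 3D variable
snapshot_chunks = (1, 120, 119, 119)  # _ChunkSizes from the 20091122 snapshot
rc1_chunks = (1, 102, 101, 101)       # _ChunkSizes from 4.1.1-rc1

for label, chunks in (("20091122", snapshot_chunks), ("rc1", rc1_chunks)):
    n_chunks, covered = chunk_stats(dims, chunks)
    ratio = covered / math.prod(dims)
    print(f"{label}: {n_chunks} chunk(s) per record, "
          f"{covered} elements covered ({ratio:.2f}x the data)")
```

Under those assumptions the snapshot layout uses 1 chunk per variable per record, while the rc1 layout uses 8 chunks that together cover about 4.9 times as many elements as the record actually holds. Whether this fully accounts for the observed slowdown is not established here.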

Any reason for the change in how the default chunksizes are set? Is there some arbitrary maximum default chunk dimension that got reduced?

Best,

-- Ted

__________________________________________________________
| Edward Mansell <ted.mansell@xxxxxxxx>
| National Severe Storms Laboratory
| 120 David L. Boren Blvd.
| Room 4354
| Norman, OK 73072
| Phone: (405) 325-6177    http://www.cimms.ou.edu/~mansell
| FAX: (405) 325-2316
|
| ----------------------------------------------------------------------------
|
| "The contents of this message are mine personally and
| do not reflect any position of the U.S. Government or NOAA."
|
| ----------------------------------------------------------------------------



