Re: [netcdfgroup] default chunks in 4.1.1-rc1

For high-resolution grids, a default chunk size equal to the full grid dimension (in all dimensions) results in a data set that is essentially unreadable, because the chunk cache needs a ridiculous amount of memory to work properly. The optimal cache and chunk sizes really depend on the data set in question and the hardware in use, so it's difficult to set defaults that will please all the users all the time. The NetCDF API makes it pretty easy to give the user control of the caching/chunking options to tune the application to work best for the data set in use.
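
As a rough illustration of the memory cost, here is a minimal sketch that computes the uncompressed size of a single chunk, since the cache has to be at least that large to hold it. The grid shape (a hypothetical 50-level, 1800x3600 grid of 4-byte floats) and the function name chunk_bytes are assumptions made up for the example, not taken from any particular data set.

```python
from math import prod

def chunk_bytes(chunk_shape, bytes_per_value=4):
    """Uncompressed size in bytes of one chunk with the given shape."""
    return prod(chunk_shape) * bytes_per_value

# Hypothetical high-resolution variable: (time, level, lat, lon), 4-byte floats.
grid = (1, 50, 1800, 3600)

# Chunk equal to the full grid: the cache must hold the entire 3D field.
full_grid_chunk = chunk_bytes(grid)

# Chunk of one horizontal level: the cache only needs one 2D slice at a time.
one_level_chunk = chunk_bytes((1, 1, 1800, 3600))

print(f"full-grid chunk: {full_grid_chunk / 2**20:.0f} MiB")
print(f"one-level chunk: {one_level_chunk / 2**20:.1f} MiB")
```

In the C library, the chunk shape is set per variable with nc_def_var_chunking and the per-variable cache with nc_set_var_chunk_cache, so an application can expose both as user options.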
--Jennifer

--
Jennifer M. Adams
IGES/COLA
4041 Powder Mill Road, Suite 302
Calverton, MD 20705
jma@xxxxxxxxxxxxx


On Mar 11, 2010, at 4:00 PM, Ted Mansell wrote:


Howdy,

I started trying out netcdf 4.1.1-rc1 (with hdf5-1.8.4-patch1) and noticed that the write times from my application went up by a factor of 1.8 to 2 compared to the 20091122 snapshot. The problem showed up for larger files: say, 25 MB of data (compressed) per record (in this case, about 40 3D arrays per time level with dimensions of 119x119x120). I did a test with less data per record, and rc1 doesn't show any slowdown (perhaps it is even a little faster than 20091122).

I discovered that the problem is the default chunk sizes: ncdump -hs shows that the file created with 20091122 has _ChunkSizes = 1, 120, 119, 119 (the same as the actual dimensions), but rc1 is setting _ChunkSizes = 1, 102, 101, 101, which turns out to be much less efficient. When I altered my code to set the chunk sizes to the variable dimensions, the write time for rc1 went back down, too.
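
One possible contributor, offered as a sketch rather than a confirmed diagnosis: with chunks slightly smaller than the dimensions, each record is split into several chunks, most of them only partly filled with real data. The arithmetic below uses the two _ChunkSizes settings reported by ncdump; the function names are made up for the example, and the assumption that every chunk is handled at its full size (so edge chunks carry padding) is mine.

```python
from math import ceil, prod

def chunk_grid(dims, chunks):
    """Number of chunks needed along each dimension to cover dims."""
    return tuple(ceil(d / c) for d, c in zip(dims, chunks))

def padded_ratio(dims, chunks):
    """Elements spanned by all chunks divided by actual data elements."""
    n_chunks = prod(chunk_grid(dims, chunks))
    return n_chunks * prod(chunks) / prod(dims)

dims = (1, 120, 119, 119)        # one record of one 3D variable
snapshot = (1, 120, 119, 119)    # _ChunkSizes written by 20091122
rc1 = (1, 102, 101, 101)         # _ChunkSizes written by 4.1.1-rc1

for name, chunks in (("20091122", snapshot), ("rc1", rc1)):
    grid = chunk_grid(dims, chunks)
    print(name, "chunks per record:", prod(grid),
          "padded/actual elements: %.2f" % padded_ratio(dims, chunks))
```

Under that assumption, the rc1 setting needs 2 x 2 x 2 = 8 chunks per variable per record where the snapshot needed one, and the chunks together span nearly five times as many elements as the variable actually holds.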

Any reason for the change in how the default chunksizes are set? Is there some arbitrary maximum default chunk dimension that got reduced?

Best,

-- Ted

__________________________________________________________
| Edward Mansell <ted.mansell@xxxxxxxx>
| National Severe Storms Laboratory
| 120 David L. Boren Blvd.
| Room 4354
| Norman, OK 73072
| Phone: (405) 325-6177    http://www.cimms.ou.edu/~mansell
| FAX: (405) 325-2316
|
| ----------------------------------------------------------------------------
|
| "The contents of this message are mine personally and
| do not reflect any position of the U.S. Government or NOAA."
|
| ----------------------------------------------------------------------------
