
[netCDF #WWQ-664381]: netcdf 4.0 filesize for large arrays



> BTW What are the default cache and chunk sizes?

In netCDF version 4.0, the default chunk cache size is
inherited from the HDF5 default, which is small, 1 MB
per variable.  I believe the current snapshot version has
changed this default to be at least the chunk size or
1 MB, whichever is larger.
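
As a rough illustration of that snapshot rule (not the library's
actual code), the per-variable cache default could be sketched as
below.  The function name is hypothetical, and reading "1 MB" as
2**20 bytes is an assumption.

```python
# Assumption: "1 MB" is taken to mean 2**20 bytes.
ONE_MB = 1024 * 1024


def snapshot_default_cache_bytes(chunk_shape, elem_bytes):
    """Hypothetical helper: cache size = max(one chunk, 1 MB)."""
    chunk_bytes = elem_bytes
    for n in chunk_shape:
        chunk_bytes *= n  # bytes in one chunk of this shape
    return max(chunk_bytes, ONE_MB)
```

So a 1 x 100 x 200 chunk of 4-byte floats (80,000 bytes) would still
get a 1 MB cache, while a 1000 x 1000 x 1 chunk (4,000,000 bytes)
would get a cache just large enough to hold that one chunk.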

Starting in 4.0.1, netCDF allows users to specify cache size
for each variable. This is a new feature, still being tested.
We may try to tune the default chunk and chunk cache sizes
better than the current algorithm, partly based on examples
such as yours.

For example, currently the default chunk size is 1 for each
unlimited dimension and the whole dimension size for other
dimensions, except that variables larger than 4 GB use
smaller chunks by default, dividing each dimension into the
same number of chunks.  I think this often makes the chunks
too large.  The HDF5 group has advised us that no default
chunk sizes will work well in all cases, but we may try to
do better than in the current release.  For your example of
accessing two-dimensional slices along each axis of a 3D
variable, I think you have come up with a good rule of thumb,
but I'm not sure it's ideal for other kinds of access, for
example 1D slices from a 3D variable.
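
To make the default rule above concrete, here is a minimal sketch
in Python.  It is not the netCDF source: the function names are
hypothetical, "4 GB" is taken as 4 * 2**30 bytes, and for large
variables it assumes the library picks the smallest per-dimension
split that brings a chunk under that limit, which the description
above does not actually specify.

```python
from math import ceil, prod

FOUR_GB = 4 * 1024**3  # assumption: binary gigabytes


def default_chunk_shape(dim_sizes, unlimited, elem_bytes):
    """Sketch of the default described above (hypothetical helper).

    dim_sizes: current length of each dimension
    unlimited: one bool per dimension, True if it is unlimited
    elem_bytes: size of one element in bytes
    """
    # Chunk length 1 along unlimited dimensions, whole length otherwise.
    shape = [1 if unl else size for size, unl in zip(dim_sizes, unlimited)]
    if prod(dim_sizes) * elem_bytes <= FOUR_GB:
        return shape
    # Variable larger than 4 GB: divide every fixed dimension into the
    # same number of chunks n.  Stopping at the smallest n whose chunk
    # fits under 4 GB is an assumption made for this sketch.
    n = 1
    while prod(shape) * elem_bytes > FOUR_GB:
        n += 1
        shape = [1 if unl else ceil(size / n)
                 for size, unl in zip(dim_sizes, unlimited)]
    return shape


def chunks_touched(dim_sizes, chunk_shape, axis):
    """Chunks read for a 2D slice one index thick along `axis`."""
    return prod(1 if i == axis else ceil(size / chunk)
                for i, (size, chunk) in enumerate(zip(dim_sizes, chunk_shape)))
```

Under these assumptions, a 2000 x 2000 x 2000 variable of 4-byte
floats gets 1000 x 1000 x 1000 chunks of about 4 GB each, and
chunks_touched((2000, 2000, 2000), (1000, 1000, 1000), 0) is 4, so
reading a single 16 MB slice touches four such chunks.  That is
the sense in which the default chunks seem too large.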

We need to provide better guidance about this in the
documentation.  Please let us know if you have thoughts about
defaults for the general case or if you have seen any good
papers on how to determine a good algorithm based on access
patterns.  I'm Cc:ing Ed in case he has any corrections to
the above ...

--Russ



Thanks for pointing out the -s switch to ncdump, that would have made life a lot
easier if I'd seen that earlier, and thanks for putting the hdf5 stuff
into netcdf, clearly looks like it has some potential :-)

Russ Rew                                         UCAR Unidata Program
address@hidden                     http://www.unidata.ucar.edu



Ticket Details
===================
Ticket ID: WWQ-664381
Department: Support netCDF
Priority: High
Status: Closed