
[netCDF #UNP-839629]: deflating in NetCDF-4 Classic mode via ncgen



Howdy Don!

You surprise me! The only difference between a netCDF-4 and a netCDF-4 classic 
file is that the classic file has one extra global attribute in the HDF5 file, 
which is not exposed in the netCDF API (except as the format information). So 
an h5dump of the two files should be identical, except for this one attribute.

Therefore I would expect them to compress to the same size if the compression 
level is the same in each case.
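
Here's a minimal sketch in C of what I mean (the file and variable names are 
just placeholders); the only difference between the two formats is the 
NC_CLASSIC_MODEL flag, and the deflate settings work the same way in both:

    #include <netcdf.h>

    int
    main(void)
    {
        int ncid, dimid, varid;

        /* Error checking omitted for brevity; each call returns
         * NC_NOERR on success. */

        /* NetCDF-4 classic model: the same HDF5 file underneath, plus
         * one hidden attribute. Drop NC_CLASSIC_MODEL for a plain
         * netCDF-4 file. */
        nc_create("classic.nc", NC_CLOBBER | NC_NETCDF4 | NC_CLASSIC_MODEL,
                  &ncid);
        nc_def_dim(ncid, "x", 540, &dimid);
        nc_def_var(ncid, "data", NC_FLOAT, 1, &dimid, &varid);

        /* shuffle off, deflate on, compression level 1; the same level
         * should compress to the same size in either format. */
        nc_def_var_deflate(ncid, varid, 0, 1, 1);
        nc_close(ncid);
        return 0;
    }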

With respect to your chunking, you are optimizing for writing (if you are 
writing in slices of 540 x 361 x 1). If someone reads the data along the other 
dimensions (e.g., a slice of 1 x 1 x 1460), then every chunk must be read and 
uncompressed. A more reader-friendly approach is to size chunks as a 
proportion of your data space, for example a tenth of each dimension 
(54 x 37 x 146). For floating-point data, this yields a chunk size of a little 
over 1 MB, which is very reasonable.

You must decide whether to optimize for reading, writing, or something in 
between. ;-)
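
As a concrete sketch of that compromise (assuming a 540 x 361 x 1460 float 
variable; the dimension and variable names here are my own placeholders), the 
C API calls would look something like this:

    #include <netcdf.h>

    int
    main(void)
    {
        int ncid, dimids[3], varid;
        /* Roughly a tenth of each dimension: 54 x 37 x 146 floats is a
         * bit over 1 MB per chunk. */
        size_t chunks[3] = {54, 37, 146};

        nc_create("compromise.nc", NC_CLOBBER | NC_NETCDF4, &ncid);
        nc_def_dim(ncid, "y", 540, &dimids[0]);
        nc_def_dim(ncid, "x", 361, &dimids[1]);
        nc_def_dim(ncid, "time", 1460, &dimids[2]);
        nc_def_var(ncid, "data", NC_FLOAT, 3, dimids, &varid);

        /* Chunking (and deflation) must be set before any data is
         * written to the variable. */
        nc_def_var_chunking(ncid, varid, NC_CHUNKED, chunks);
        nc_def_var_deflate(ncid, varid, 0, 1, 1);
        nc_close(ncid);
        return 0;
    }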

The netCDF library must make some default choices if you don't specify chunk 
sizes, and it will pick a value of 1 for any unlimited dimension, because it 
just can't tell whether you are going to write thousands of values or only one 
along that dimension. But that default is not the best choice for performance. 
If you know that the unlimited dimension is going to grow larger than about 
10, it's worthwhile to set a larger chunk size along that dimension to improve 
overall performance.
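
For instance, if the 1460-long dimension above were unlimited, a sketch of 
overriding that default (again with hypothetical names) might be:

    #include <netcdf.h>

    int
    main(void)
    {
        int ncid, dimids[3], varid;
        /* Override the default chunk size of 1 along the unlimited
         * dimension; here we expect many more than 10 records. */
        size_t chunks[3] = {54, 37, 100};

        nc_create("unlim.nc", NC_CLOBBER | NC_NETCDF4, &ncid);
        nc_def_dim(ncid, "y", 540, &dimids[0]);
        nc_def_dim(ncid, "x", 361, &dimids[1]);
        /* In a netCDF-4 file any dimension may be unlimited. */
        nc_def_dim(ncid, "time", NC_UNLIMITED, &dimids[2]);
        nc_def_var(ncid, "data", NC_FLOAT, 3, dimids, &varid);
        nc_def_var_chunking(ncid, varid, NC_CHUNKED, chunks);
        nc_close(ncid);
        return 0;
    }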

Thanks,

Ed

Ticket Details
===================
Ticket ID: UNP-839629
Department: Support netCDF
Priority: Normal
Status: Closed