> Hi Russ,
> We have been looking at our netcdf read performance again, particularly with hdf4/hdf5 files.
> We do not have a clear story for the most part, but there seems to be a clear problem with compression in hdf5-based
> netcdf files.
> We would appreciate any insight.

You are seeing artifacts of:

- chunking with a chunk cache that's too small for the chunk shapes
  used for compression
- poor default chunk shapes in an early netCDF-4 version (4.1.2)
- measuring performance with ncdump, which is not optimized for
  fast reads
A chunk (or tile) is the smallest unit for HDF5 data compression and
access. The ncdump utility just uses the default chunk cache size,
which in netCDF version 4.1.2 was small (4194304 bytes). The
temperature variable in your test file has 9 chunks, each of size 1 x
1196 x 1196 shorts, so each chunk is 2860832 bytes. That means only 1
uncompressed chunk will fit in the default chunk cache. Reading each
row of 2500 values reads and uncompresses 3 chunks, and since the
chunk cache holds only one of those chunks at a time, the same chunks
are re-read and uncompressed repeatedly until all the data has been
read!
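The arithmetic above can be checked with a few lines (a sketch; the
chunk shape and the 4 MiB default cache are the netCDF 4.1.2 values
described above, and the variable names are mine):

```python
import math

# Chunk shape from the old netCDF 4.1.2 defaults; values are 2-byte shorts.
chunk_shape = (1, 1196, 1196)
bytes_per_value = 2
chunk_bytes = chunk_shape[0] * chunk_shape[1] * chunk_shape[2] * bytes_per_value
print(chunk_bytes)        # 2860832 bytes per uncompressed chunk

# The 4 MiB default chunk cache in netCDF 4.1.2:
default_cache_bytes = 4194304
chunks_that_fit = default_cache_bytes // chunk_bytes
print(chunks_that_fit)    # only 1 chunk fits in the cache

# One row of the 2500-value-wide variable spans 3 chunks,
# so every row evicts and re-reads chunks it just used.
chunks_per_row = math.ceil(2500 / 1196)
print(chunks_per_row)     # 3
```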
I don't think ncdump is a very good program for testing read
performance. It was not designed to be high-performance: it spends
much of its time comparing each value with the fill value before
converting it to ASCII for formatted output, a row at a time. The
ncdump utility also has no option for specifying the size of the chunk
cache to use for compressed files.
The nccopy utility is more appropriate for timing I/O with compression
and chunking, as it's designed to be efficient. It uses only the netCDF
library to read and write, so it's testing the efficiency of the netCDF
software. However, nccopy was not available for early versions of
netCDF-4, such as 4.0.1. Here's the current man page:
Later versions of netCDF, such as 4.2.x and 4.3.x, have better default
chunking strategies, so they perform better on your file. For example,
netCDF 4.3.0 uses better chunk sizes (1 x 1250 x 1250), so there are
only 4 chunks rather than 9, and compression works better, even with
the same level of deflation:
$ nccopy -d1 spv.nc spv-d1.nc
$ ls -l spv-d1.nc
-rw-rw-r-- 1 russ ustaff 2832831 Nov 26 14:44 spv-d1.nc
which is better than the 3538143 bytes of the compressed file you sent.
And time for the above compression was about
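To make the chunk-count comparison concrete, here is a small sketch (a
rough comparison, assuming the 1 x 2500 x 2500 grid of shorts from the
file above; the helper and its names are mine):

```python
import math

def chunk_stats(dim, chunk, bytes_per_value=2):
    """Number of chunks tiling a dim x dim slice, and bytes per chunk."""
    per_axis = math.ceil(dim / chunk)            # chunks along each axis
    chunk_bytes = chunk * chunk * bytes_per_value
    return per_axis * per_axis, chunk_bytes

old_chunks, old_bytes = chunk_stats(2500, 1196)  # netCDF 4.1.2 defaults
new_chunks, new_bytes = chunk_stats(2500, 1250)  # netCDF 4.3.0 defaults
print(old_chunks, old_bytes)  # 9 chunks of 2860832 bytes each
print(new_chunks, new_bytes)  # 4 chunks of 3125000 bytes each
```

Note that the 1250-wide chunks tile the 2500 x 2500 grid exactly, with
no ragged partial chunks along the edges, which is part of why the
deflated file comes out smaller.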
A pretty good timing test for reading is to read, uncompress, and copy
the compressed file, using nccopy. Before running any such test, you
should make sure you aren't just reading a cached copy of the input file
in system memory. See "A note about timings" at the end of my blog
"Chunking Data: Why it Matters" for how to do this:
That blog also has some advice about choosing chunk shapes and sizes for
good performance. My follow-up blog, "Chunking Data: Choosing Shapes",
has more specific advice:
Anyway, here's how much time it takes to copy and uncompress the two
versions of your compressed file, writing a netCDF-3 classic file as
output. The first uses the 1 x 1196 x 1196 chunks from the old
defaults in netCDF 4.1.2, and the second uses the 1 x 1250 x 1250
chunks that are the default in the current netCDF release:
$ clear_cache.sh; time nccopy -d0 -k1 spv-199901011900_compressed.nc tmp.nc
$ clear_cache.sh; time nccopy -d0 -k1 spv-d1.nc tmp.nc
The tmp.nc uncompressed file is the same as your original uncompressed
file in each case.
And just FYI, here are the times for running ncdump on the two
versions of the compressed data:
$ clear_cache.sh; time ncdump spv-199901011900_compressed.nc > /dev/null
$ clear_cache.sh; time ncdump spv-d1.nc > /dev/null
Both of those would be much faster if ncdump reserved enough chunk cache
in memory to hold all the chunks in a row of a variable when dumping it.
I could add that optimization option, if you really need ncdump to be
faster, but it would use a lot more memory than it does now.
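For what it's worth, the cache size such an option would need is easy
to estimate: enough to hold one row of chunks (a sketch using this
file's 4.1.2 chunk shape; variable names are mine):

```python
import math

chunk_bytes = 1 * 1196 * 1196 * 2        # one uncompressed chunk of shorts
chunks_per_row = math.ceil(2500 / 1196)  # chunks touched by one 2500-value row
needed = chunks_per_row * chunk_bytes
print(needed)  # 8582496 bytes, roughly double the 4 MiB default cache
```

(A program reading the file through the netCDF C library can already
set a per-variable cache of that size itself, with
nc_set_var_chunk_cache().)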
> ---------- Forwarded message ----------
> From: Igor Khomyakov <address@hidden>
> Date: Thu, Nov 14, 2013 at 4:53 PM
> Subject: netcdf 4.1.2+ issue
> To: Benno Blumenthal <address@hidden>
> Cc: John del Corral <address@hidden>
> Benno, here's the test case for netcdf developers. Please let me know if you need more information. Attached, please
> find the sample data files and the strace log.
> THE DATA FILES: The compressed version of netcdf file was produced using nccopy (option -d). The uncompressed file is
> 12.5MB, the compressed file is 3.5MB. Attached, you may find datafiles.tgz that contains both data files.
> THE PROBLEM: ncdump 4.1.2+ of the compressed file takes 50 times more time than ncdump of the original netcdf file.
> Ncdump 4.0.1 doesn't appear to have this issue.
> $ time ncdump spv-199901011900.nc >/dev/null
> real 0m1.652s
> user 0m1.605s
> sys 0m0.017s
> $ time ncdump spv-199901011900_compressed.nc >/dev/null
> real 1m28.273s
> user 1m11.460s
> sys 0m16.681s
> THE STRACE LOG: we straced ncdump 4.1.2 of compressed file and found that it calls 'read' function 7,526 times, and
> reads 3,384,680,557 bytes! This is 1000 times more than the size of the file. Attached, please find the strace log.
> Dr. M. Benno Blumenthal address@hidden
> International Research Institute for climate and society
> The Earth Institute at Columbia University
> Lamont Campus, Palisades NY 10964-8000 (845) 680-4450
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata inquiry tracking system and then made publicly available through the web. If you do not want to have your interactions made available in this way, you must let us know in each email you send to us.