Hi Mark,

At your convenience, build the current snapshot of netCDF from the source at:

  ftp://ftp.unidata.ucar.edu/pub/netcdf/snapshot/netcdf-4-daily.tar.gz

Then see if the new nccopy does the job for you. I've done some timings, which appear below.

As input, I used a netCDF-3 classic file of about 17.5 GB that contains the 1698x1617x1596 variable last in the file (the classic format requires that a variable larger than 4 GB be the last one).

Using nccopy to convert the file to a netCDF-4 contiguous (unchunked) file took about 15 minutes on my desktop Linux system:

  $ /usr/bin/time ./nccopy -k4 cvx3.nc cvx4.nc; ls -l cvx[34].nc
  13.40user 52.60system 14:54.69elapsed 7%CPU (0avgtext+0avgdata 11120maxresident)k
  34284824inputs+68470264outputs (60major+6283minor)pagefaults 0swaps
  -rw-rw-r-- 1 russ ustaff 17528362416 Jul 13 13:43 cvx3.nc
  -rw-rw-r-- 1 russ ustaff 17528380036 Jul 13 14:10 cvx4.nc

Copying and compressing at deflate level 1 with the default 100x96x94 chunks took about 14 minutes, and the resulting file compressed to about 6.1 GB:

  $ /usr/bin/time ./nccopy -k4 -d1 cvx3.nc cvx4-d1.nc; ls -l cvx4-d1.nc
  506.78user 20.34system 14:05.81elapsed 62%CPU (0avgtext+0avgdata 1359236maxresident)k
  34252976inputs+11967392outputs (74major+1043484minor)pagefaults 0swaps
  -rw-rw-r-- 1 russ ustaff 6126519401 Jul 13 14:29 cvx4-d1.nc

Doing what you need instead, reshaping to chunks of shape 1698x25x24 oriented along the time dimension, took about the same amount of time once I specified an 18 GB chunk cache and a large number of chunk cache elements. You can also see that with these chunks the data did not compress quite as well, but your mileage may vary:

  $ /usr/bin/time ./nccopy -k4 -d1 -h 18G -e10001 -m 1G -c time/1698,latitude/25,longitude/24 cvx3.nc cvx4-time.nc; ls -l cvx4-time.nc
  567.68user 44.88system 14:28.54elapsed 70%CPU (0avgtext+0avgdata 18348304maxresident)k
  32663816inputs+12791808outputs (57major+10639640minor)pagefaults 0swaps
  -rw-rw-r-- 1 russ ustaff 6543504177 Jul 14 06:40 cvx4-time.nc

If I left off the new -h and -e arguments and just used the nccopy defaults of a 4M chunk cache and 1009 cache elements, the same nccopy ran for more than 12 hours and still hadn't completed half of the output.

An alternative for reshaping the chunks of large data files is the HDF5 tool h5repack, but I can't recommend it currently: in a test similar to your case but with a smaller number of times, nccopy rechunked a 3 GB file in 2.5 minutes, while h5repack took 6.25 hours. h5repack has no options for controlling the chunk cache, which may explain why it took about 150 times longer than nccopy in this case.

--Russ

Russ Rew                          UCAR Unidata Program
address@hidden                    http://www.unidata.ucar.edu

Ticket Details
===================
Ticket ID: CUV-251255
Department: Support netCDF
Priority: Normal
Status: Closed
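[A rough sketch of the arithmetic behind the -h 18G and -e10001 settings above, assuming the large variable holds 4-byte values, which matches the 17.5 GB file size for 1698x1617x1596 elements. The idea is that the chunk cache is large enough to hold every output chunk at once, so compressed chunks are not repeatedly evicted and rewritten while the input is read in its natural order:]

  $ echo $((1698 * 25 * 24 * 4))                        # one 1698x25x24 chunk: 4075200 bytes (~3.9 MiB)
  $ echo $(( (1617 + 24) / 25 * ((1596 + 23) / 24) ))   # 65 x 67 = 4355 chunks cover the variable
  $ echo $((4355 * 4075200))                            # all chunks at once: ~17.7 GB, hence -h 18G
  # and -e10001 comfortably exceeds the 4355 cache slots those chunks need.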
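[One way to confirm the chunking and compression of the copied file is ncdump's -s option, which prints the special virtual attributes alongside the header; "var" below is a placeholder for the large 3-D variable in the file:]

  $ ncdump -s -h cvx4-time.nc
  # among the attributes of the large variable, expect something like:
  #   var:_Storage = "chunked" ;
  #   var:_ChunkSizes = 1698, 25, 24 ;
  #   var:_DeflateLevel = 1 ;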