
[netCDF #PBW-682100]: nccopy deflation vs API deflation



Hi Nick,

> I've put some sample data files at:
> http://www2.epcc.ed.ac.uk/~njohnso1/netcdf/
> 
> validation2_nocomp.nc is from my (serial) code with no deflation.
> validation2_apicomp.nc is from the same code with deflation enabled.
> validation2_nccopyd9.nc is validation2_nocomp.nc after it's been processed
> with nccopy -d9 -s
> validation2_mycopyd9.nc is validation2_nocomp.nc after it's been processed
> with my own test compressor which copies the data to a new file where the
> variable has deflation enabled.

Thanks for those samples.  I tried to reproduce the problem here and
discovered that the compression works fine with the current 4.2-rc1
release but fails with the 4.1.3 netCDF release, so it was evidently a
bug fixed in the interim.  Then I found that *I* had fixed the bug but
forgotten about it:

  http://www.unidata.ucar.edu/support/help/MailArchives/netcdf/msg10259.html

so sorry for not noticing this earlier.  Anyway, compression with
nccopy -d9 now works fine on your file.  If you don't want to get and
build the 4.2-rc1 release candidate just for this bug fix, you should be
able to take the nccopy.c mentioned in the above support email, replace
your current nccopy.c with it, and rebuild.
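
If you want to confirm which deflation settings a copied file actually
ended up with, one option is to look at the special attributes that
ncdump prints with -s.  The following is only a sketch: deflate_levels is
a hypothetical helper name, and it assumes ncdump prints lines of the
form "var:_DeflateLevel = N ;", so adjust the pattern if your output
differs.

```shell
# Hypothetical helper: read `ncdump -h -s` output on stdin and print
# "<variable> <deflate level>" for every variable that has deflation set.
# Assumes lines of the form "<var>:_DeflateLevel = <n> ;".
deflate_levels() {
    sed -n 's/^[[:space:]]*\([A-Za-z0-9_.+@-]*\):_DeflateLevel = \([0-9]*\) ;.*$/\1 \2/p'
}

# Usage (needs ncdump from a netCDF-4 build):
#   ncdump -h -s validation2_nccopyd9.nc | deflate_levels
```

Variables stored without deflation have no _DeflateLevel attribute, so
they simply do not appear in the output.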

Just out of curiosity, I also tried other compression levels with nccopy
and found that -d 5 compressed your sample file better than any other
level, though that may not be the case with other data:

support$ for i in 1 2 3 4 5 6 7 8 9; do
    nccopy -d $i validation2_nocomp.nc validation2_nccopy_d$i.nc
    ls -l validation2_nccopy_d$i.nc
done
-rw-rw-r-- 1 russ ustaff 1226706 Feb 26 10:06 validation2_nccopy_d1.nc
-rw-rw-r-- 1 russ ustaff 1217933 Feb 26 10:06 validation2_nccopy_d2.nc
-rw-rw-r-- 1 russ ustaff 1208676 Feb 26 10:06 validation2_nccopy_d3.nc
-rw-rw-r-- 1 russ ustaff 1184699 Feb 26 10:06 validation2_nccopy_d4.nc
-rw-rw-r-- 1 russ ustaff 1013395 Feb 26 10:06 validation2_nccopy_d5.nc
-rw-rw-r-- 1 russ ustaff 1017296 Feb 26 10:06 validation2_nccopy_d6.nc
-rw-rw-r-- 1 russ ustaff 1026669 Feb 26 10:06 validation2_nccopy_d7.nc
-rw-rw-r-- 1 russ ustaff 1046570 Feb 26 10:06 validation2_nccopy_d8.nc
-rw-rw-r-- 1 russ ustaff 1049421 Feb 26 10:06 validation2_nccopy_d9.nc
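
To pick out the winner from a listing like the one above without reading
it by eye, something along these lines should work.  It is a rough
sketch: smallest_file is a hypothetical helper name, and it assumes the
usual nine-column `ls -l` layout (size in the fifth field) and file names
without spaces.

```shell
# Hypothetical helper: read `ls -l` lines on stdin and print the name and
# size of the smallest file. Assumes size is field 5 and the file name is
# the last field (no spaces in names).
smallest_file() {
    awk 'NF >= 9 && (min == "" || $5 + 0 < min) { min = $5 + 0; name = $NF }
         END { if (name != "") print name, min }'
}

# Usage:
#   ls -l validation2_nccopy_d?.nc | smallest_file
```

Run on the listing above, this reports validation2_nccopy_d5.nc at
1013395 bytes.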

--Russ

> On 22/02/12 05:51, Unidata netCDF Support wrote:
> > Hi Nick,
> >
> >> I am running some tests with a code I am converting from using a flat
> >> file to netcdf/hdf5. I am using the parallel MPIIO access mode so unable
> >> to use the deflation calls via the API. I thought I would use nccopy
> >> -d9 as a post process on my files to compress them and therefore get
> >> some space saving whilst still retaining the ability to do a parallel
> >> read in other related codes.
> >>
> >> However, I find that I get quite poor compression using nccopy, much
> >> worse than I get if I use the API call. In some cases, nccopy -d9 gives
> >> little or no compression whilst using the API gives me 4-5x compression.
> >>
> >> Is this something you would expect or am I missing something critical
> >> in this case?
> >
> > No, you should expect exactly the same compression using nccopy as with
> > the API calls.  nccopy calls the API for each variable in the file with
> > whatever compression level you specify.  The API calls are somewhat more
> > flexible, in that you can specify a different level of compression (or no
> > compression) for each variable separately, but if you use the same
> > compression for every variable, there should be no difference.
> >
> > If you are seeing something different, it sounds like a bug.  Can you
> > provide a sample file that we could use to reproduce the problem and
> > diagnose the cause?
> >
> > --Russ
> >
> > Russ Rew                                         UCAR Unidata Program
> > address@hidden                      http://www.unidata.ucar.edu
> >
> >
> >
> > Ticket Details
> > ===================
> > Ticket ID: PBW-682100
> > Department: Support netCDF
> > Priority: Normal
> > Status: Closed
> >
> >
> 
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
> 
> 
Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu


