
[netCDF #UBZ-355729]: NetCDF 64-bit offset bug with large records



Hi Chuck,

> We have found what we believe is a bug in NetCDF support for large
> record variables in 64-bit offset files, and we're hoping for
> confirmation and perhaps a workaround. :) By "large" records we mean
> each record is larger than 4 GB, knowing that only one record variable
> can have such records and it must be the last variable, per this
> documented limitation: "No record variable can require more than 2^32 - 4
> bytes of storage for each record's worth of data, unless it is the last
> record variable."
> 
> The problem can be demonstrated with a slightly altered large_files.c
> test included with NetCDF, which I have attached. There are three
> alterations: 1) switch the order of definition of the 'x' and 'var1'
> variables (to make var1 the last variable, allowing its records to be
> larger than 4 GB); 2) change I_LEN to 5104, making var1's records ~5 GB
> each; 3) comment out the deletion of the test file at the end.
> 
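
For anyone following along, here is a minimal sketch of the layout being
described -- not the actual large_files.c, and the "j"/"k" dimensions, their
lengths, and the shape of 'x' are guesses for illustration.  The point is
just that var1, defined last, needs roughly 5 GB per record:

    #include <netcdf.h>

    int main(void)
    {
        int ncid, recdim, idim, jdim, kdim, xid, var1id, dims[4];

        nc_create("big.nc", NC_CLOBBER | NC_64BIT_OFFSET, &ncid);
        nc_def_dim(ncid, "rec", NC_UNLIMITED, &recdim);
        nc_def_dim(ncid, "i", 5104, &idim);        /* I_LEN from the test */
        nc_def_dim(ncid, "j", 2048, &jdim);        /* illustrative length */
        nc_def_dim(ncid, "k", 128,  &kdim);        /* illustrative length */

        dims[0] = recdim;
        nc_def_var(ncid, "x", NC_INT, 1, dims, &xid);     /* 'x' first ... */

        dims[1] = idim;  dims[2] = jdim;  dims[3] = kdim;
        nc_def_var(ncid, "var1", NC_FLOAT, 4, dims, &var1id); /* var1 last */
        /* per record: 5104 * 2048 * 128 * 4 bytes ~= 5 GB, which is legal
         * only because var1 is the last record variable in the file */

        nc_enddef(ncid);
        /* ... write with nc_put_vara_float(), then ... */
        nc_close(ncid);
        return 0;
    }
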
> If the number of records (NUMRECS) is 1, everything is fine. But with 2
> records, the test fails: "Error on read, var1[0, 4104, 12, 3] = 42
> wrong, should be -1 !" You can tell something is wrong without even
> looking at the file's contents because the two-record file is too small.
> With two records the file should be just under 10 GB in size. In fact
> its size is about 9 GB. In tests with our own software, we find that the
> second record always adds just under 4 GB to the file, even if the
> record is much larger (~12 GB in our case).
> 
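
Spelling out the size arithmetic being pointed at here (using the ~5 GB
per-record figure above and ignoring header overhead):

    expected, 2 records:   ~5 GB + ~5 GB           = just under 10 GB
    observed:              ~5 GB + just under 4 GB = ~9 GB

That "just under 4 GB" is about the largest value a 4-byte size field can
hold, which fits the 32-bit size handling described in the next paragraph.
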
> While investigating with our own code, one of our developers followed
> the code of a reader in a debugger and observed the following:
> 
> When a netcdf file is opened for reading via nc_open, it eventually
> reads the size of each variable via ncx_get_size_t, which assumes
> size_t is 4 bytes (even though size_t is actually 8 bytes on this
> machine). There is a corresponding write routine ncx_put_size_t,
> which also assumes size_t is 4 bytes, and presumably is used when
> the file is written. Later while opening the file, this size is used
> in NC_computeshapes. Finally when reading data from the file, this
> size leads to an error in NC_varoffset, which is called deep inside
> nc_get_vara_float.
> 
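
To make that failure mode concrete, here is a toy 4-byte size codec
(illustrative only -- put_size4/get_size4 are made-up stand-ins, not the
real ncx_put_size_t/ncx_get_size_t) showing how a per-record size of 4 GB
or more loses its high bits on the round trip, so every offset later
computed from it comes out too small:

    #include <stdio.h>
    #include <stdint.h>

    /* Serialize a size into exactly 4 bytes, big-endian: only the low
     * 32 bits are ever written. */
    static void put_size4(unsigned char buf[4], uint64_t size)
    {
        buf[0] = (unsigned char)(size >> 24);
        buf[1] = (unsigned char)(size >> 16);
        buf[2] = (unsigned char)(size >> 8);
        buf[3] = (unsigned char)(size);
    }

    /* Read the 4 bytes back; anything above 2^32 - 1 is gone. */
    static uint64_t get_size4(const unsigned char buf[4])
    {
        return ((uint64_t)buf[0] << 24) | ((uint64_t)buf[1] << 16)
             | ((uint64_t)buf[2] << 8)  |  (uint64_t)buf[3];
    }

    int main(void)
    {
        unsigned char buf[4];
        uint64_t per_record = 5000000000ULL;   /* a ~5 GB record */

        put_size4(buf, per_record);
        /* prints "5000000000 -> 705032704": the size read back is the
         * low 32 bits only */
        printf("%llu -> %llu\n", (unsigned long long)per_record,
               (unsigned long long)get_size4(buf));
        return 0;
    }
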
> We see this problem with 3.6.2 and 4.1.1. The known issues that have
> been published do not mention this problem, so we assume this is either
> a problem with how we are building NetCDF, or a bug that has not been
> corrected in newer releases. For a variety of reasons, switching to the
> HDF5 format isn't really an option for us. Can you confirm/reproduce
> this issue? We've been using NetCDF for quite a while, so we're pretty
> confident our builds are correct, but perhaps this is something which
> only makes itself known when working with very large records.

I've verified that your test fails the same way in the new 4.2 release, so
it's a bug.  Thanks much for providing the test case.  I won't be able to
look at the fix until next week, but I'll let you know when we have a patch
to test.

--Russ

Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu



Ticket Details
===================
Ticket ID: UBZ-355729
Department: Support netCDF
Priority: High
Status: Closed