
[netCDF #UBZ-355729]: NetCDF 64-bit offset bug with large records



Hi Chuck,

> We have found what we believe is a bug in NetCDF support for large
> record variables in 64-bit offset files, and we're hoping for
> confirmation and perhaps a workaround. :) By "large" records we mean
> each record is larger than 4 GB, knowing that only one record variable
> can have such records and it must be the last variable, per this
> documented limitation: "No record variable can require more than 2^32 - 4
> bytes of storage for each record's worth of data, unless it is the last
> record variable."
> 
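(For reference, that per-record limit is easy to check by hand: the
per-record storage of a record variable is just the product of its fixed
dimension lengths times the size of its external type.  A rough sketch,
using hypothetical dimension sizes rather than anything from your file:)

    /* Rough sketch of the per-record size check (hypothetical values,
     * not netCDF library code). */
    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        uint64_t i_len = 5104, j_len = 1023, k_len = 256;    /* illustrative */
        uint64_t per_record = i_len * j_len * k_len * sizeof(float);
        uint64_t limit = 4294967292ULL;                      /* 2^32 - 4 */
        printf("per-record size: %llu bytes (%s the limit)\n",
               (unsigned long long)per_record,
               per_record > limit ? "over" : "under");
        return 0;
    }
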
> The problem can be demonstrated with a slightly altered large_files.c
> test included with NetCDF, which I have attached. There are three
> alterations: 1) switch the order of definition of the 'x' and 'var1'
> variables (to make var1 the last variable, allowing its records to be
> larger than 4 GB); 2) change I_LEN to 5104, making var1's records ~5 GB
> each; 3) comment out the deletion of the test file at the end.
> 
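(For the record, since the attachment isn't reproduced here: the gist of
those alterations is something like the sketch below.  The dimension names
and sizes are illustrative rather than the actual large_files.c values,
chosen so that each var1 record comes to roughly 5 GB.)

    #include <stdio.h>
    #include <stdlib.h>
    #include <netcdf.h>

    #define I_LEN   5104
    #define J_LEN   1023
    #define K_LEN   256

    static void check(int status, int line) {
        if (status != NC_NOERR) {
            fprintf(stderr, "line %d: %s\n", line, nc_strerror(status));
            exit(1);
        }
    }

    int main(void) {
        int ncid, recdim, idim, jdim, kdim, xid, var1id;
        int dims[4];

        check(nc_create("big.nc", NC_CLOBBER | NC_64BIT_OFFSET, &ncid), __LINE__);
        check(nc_def_dim(ncid, "rec", NC_UNLIMITED, &recdim), __LINE__);
        check(nc_def_dim(ncid, "i", I_LEN, &idim), __LINE__);
        check(nc_def_dim(ncid, "j", J_LEN, &jdim), __LINE__);
        check(nc_def_dim(ncid, "k", K_LEN, &kdim), __LINE__);

        /* alteration 1: define 'x' first so that 'var1' is the last record
         * variable and is therefore allowed records larger than 4 GiB */
        dims[0] = recdim;
        check(nc_def_var(ncid, "x", NC_FLOAT, 1, dims, &xid), __LINE__);

        /* alteration 2: I_LEN bumped so each var1 record is ~5 GB */
        dims[1] = idim; dims[2] = jdim; dims[3] = kdim;
        check(nc_def_var(ncid, "var1", NC_FLOAT, 4, dims, &var1id), __LINE__);
        check(nc_enddef(ncid), __LINE__);

        /* ... write two records of var1 and x, close, reopen, read back ... */
        /* alteration 3: leave the file in place instead of deleting it */
        check(nc_close(ncid), __LINE__);
        return 0;
    }
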
> If the number of records (NUMRECS) is 1, everything is fine. But with 2
> records, the test fails: "Error on read, var1[0, 4104, 12, 3] = 42
> wrong, should be -1 !" You can tell something is wrong without even
> looking at the file's contents because the two-record file is too small.
> With two records the file should be just under 10 GB in size. In fact
> its size is about 9 GB. In tests with our own software, we find that the
> second record always adds just under 4 GB to the file, even if the
> record is much larger (~12 GB in our case).
> 
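(That size gap lines up with a 32-bit limit in the per-record size handling:
two ~5 GB records should come to roughly 2 x 5 GB = 10 GB, whereas ~5 GB plus
"just under 4 GB", i.e. about 2^32 bytes, gives the ~9 GB you actually see.)
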
> While investigating with our own code, one of our developers followed
> the code of a reader in a debugger and observed the following:
> 
> When a netcdf file is opened for reading via nc_open, it eventually
> reads the size of each variable via ncx_get_size_t, which assumes
> size_t is 4 bytes (even though size_t is actually 8 bytes on this
> machine). There is a corresponding write routine ncx_put_size_t,
> which also assumes size_t is 4 bytes, and presumably is used when
> the file is written. Later while opening the file, this size is used
> in NC_computeshapes. Finally when reading data from the file, this
> size leads to an error in NC_varoffset, which is called deep inside
> nc_get_vara_float.
> 
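(That matches the on-disk format: in both the classic and 64-bit offset
formats, each variable's per-record size is stored in a 4-byte header field,
so a value of 4 GiB or more cannot survive a round trip through it.  A tiny
illustration of the effect; this is not the actual library code:)

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        uint64_t vsize   = 5346705408ULL;     /* ~5 GB per-record size */
        uint32_t on_disk = (uint32_t)vsize;   /* squeezed through a 4-byte field */
        /* prints 1051738112, i.e. vsize modulo 2^32: the high bits are gone */
        printf("in memory: %llu bytes\n", (unsigned long long)vsize);
        printf("on disk:   %lu bytes\n", (unsigned long)on_disk);
        return 0;
    }
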
> We see this problem with 3.6.2 and 4.1.1. The known issues that have
> been published do not mention this problem, so we assume this is either
> a problem with how we are building NetCDF, or it is a bug which is not
> corrected in newer releases. For a variety of reasons, switching to the
> HDF5 format isn't really an option for us. Can you confirm/reproduce
> this issue? We've been using NetCDF for quite a while, so we're pretty
> confident our builds are correct, but perhaps this is something which
> only makes itself known when working with very large records.

I've verified that your test fails the same way in the new 4.2 release, so
it's a bug.  Thanks much for providing the test case.  I won't be able to
look into a fix until next week, but I'll let you know when we have a patch
to test.

--Russ

Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu



Ticket Details
===================
Ticket ID: UBZ-355729
Department: Support netCDF
Priority: High
Status: Closed


NOTE: All email exchanges with Unidata User Support are recorded in the Unidata inquiry tracking system and then made publicly available through the web. If you do not want to have your interactions made available in this way, you must let us know in each email you send to us.