
Re: 20030915:File Offset questions related to 2GB dataset sizes



Greg Sjaardema wrote:

Attached is a patch file and a tar file of the patched source for my initial version of the modifications to netcdf-3.5.1-beta13 to support a 64-bit offset field. The modifications were made so that the patched library can read and write files with either a 32-bit offset (compatible with current netcdf) or a 64-bit offset (the "new" format).

The "new" format starts with the magic string "CDF2" instead of "CDF1". On file create, it is selected by passing the flag NC_64BIT_OFFSET in the mode field of the nc_create call. On read, the library queries the magic string and determines whether format '1' or '2' is present in the file. This seems to work pretty well in our environment, but perhaps there is a better method.

This seems fine to me.
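
The read-side check described above can be sketched in plain C without the netCDF library. This is only an illustration, not the code from Greg's patch: the function name `detect_cdf_version` is hypothetical, and the sketch assumes, following the published netCDF file format specification, that the fourth header byte is the binary value 1 or 2 rather than the ASCII digit.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the netCDF API.
 * Returns 1 for the 32-bit offset format, 2 for the 64-bit offset format,
 * 0 if the first four bytes are not a recognized header,
 * and -1 if the file cannot be opened. */
int detect_cdf_version(const char *path)
{
    unsigned char magic[4];
    size_t nread;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;

    nread = fread(magic, 1, sizeof magic, fp);
    fclose(fp);

    /* A file shorter than four bytes cannot carry the header. */
    if (nread != sizeof magic)
        return 0;

    /* The first three bytes are the ASCII characters 'C', 'D', 'F'. */
    if (memcmp(magic, "CDF", 3) != 0)
        return 0;

    /* Assumption: the version is stored as a raw byte (1 or 2). */
    if (magic[3] == 1 || magic[3] == 2)
        return (int)magic[3];

    return 0;
}
```

A caller would branch on the return value to decide whether offset fields in the rest of the header are 4 or 8 bytes wide.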



I assume that the library will be compiled with -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE, and there are some asserts that check this, but I may need some better error checking eventually. I think it is also possible to fix the code so that "new" files could be read/written on systems with 32-bit offsets if the offset was less than 2^31, but I haven't looked into that.

Probably not necessary; we don't need to save a few bytes in files this size.
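
The two checks Greg mentions, that `off_t` is really 64 bits wide and that a given offset is below 2^31, could look roughly like this. It is a sketch under assumptions, not the asserts from the patch: the typedef trick and the helper name `offset_fits_32bit` are illustrative only, and the macros are defined in the source here only so the fragment stands alone (normally they come from the compiler command line).

```c
/* Normally supplied as -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE. */
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif
#ifndef _LARGEFILE_SOURCE
#define _LARGEFILE_SOURCE
#endif

#include <stdint.h>
#include <sys/types.h>

/* Compile-time check: the array size is -1, and the build fails,
 * if off_t is narrower than 8 bytes. */
typedef char off_t_is_at_least_64_bits[(sizeof(off_t) >= 8) ? 1 : -1];

/* Hypothetical helper: returns 1 if the offset is non-negative and
 * below 2^31, so it could be handled with a 32-bit signed off_t. */
int offset_fits_32bit(int64_t offset)
{
    return offset >= 0 && offset <= INT32_MAX;
}
```

A run-time check like `offset_fits_32bit` is what a build without large-file support would need before attempting to seek within a "new" format file.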



If you've got any suggestions or criticisms of the code, let me know. I've been using it for a couple weeks and haven't noticed any problems, but my tests are exclusively targeted at the way we use netcdf. We were limited to finite element meshes of approximately 44 million elements with the standard netcdf. I have created meshes of 150 million elements with the "new" version and can go even larger with some modifications to the way we use netcdf.

Thanks for the advice and steering me in this direction. It looks like it will have minimal impact on our software that uses netcdf; most of the time we will be able to just relink with new libraries (netcdf and our exodusII).

--Greg Sjaardema

Hi Greg: this is very cool!

Referring to the "Netcdf File Format Specification" document:

http://www.unidata.ucar.edu/packages/netcdf/guidec/guidec-18.html#HEADING18-0

  var     := name  nelems  [dimid ...]  vatt_array  nc_type  vsize  begin

I assume that you just changed "vsize" and "begin" to 64-bit integers? Was there anything else you needed to do?
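
To make the question concrete, here is a minimal sketch of how an offset field such as "begin" might be serialized at either width. It assumes the big-endian (XDR-style) byte order that netCDF headers use; the function name `put_offset_be` is hypothetical, and which fields actually widen in the patched format is exactly what is being asked, so nothing here settles that.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the netCDF source.
 * Writes `value` into `buf` as a big-endian integer of `width` bytes.
 * Returns the number of bytes written, or 0 if `width` is not 4 or 8,
 * or if `value` does not fit in a signed 32-bit field when width is 4. */
size_t put_offset_be(unsigned char *buf, uint64_t value, size_t width)
{
    size_t i;

    if (width != 4 && width != 8)
        return 0;
    if (width == 4 && value > 0x7FFFFFFFu)
        return 0;

    /* Most significant byte first. */
    for (i = 0; i < width; i++)
        buf[i] = (unsigned char)(value >> (8 * (width - 1 - i)));

    return width;
}
```

With a 4-byte field, any offset of 2^31 or more is rejected, which corresponds to the 2GB limit in the subject line; the 8-byte form accepts it.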

(I'm working on the Java library, and I'm rather more interested in the resulting file format than the C library.)

Could you send me a sample file as your code writes it? If possible, perhaps less than 2 GB ;^}

thanks,

John