Large File Support (LFS) refers to operating system and C library facilities to support files larger than 2 GiB. On many 32-bit platforms the default size of a file offset is still a 4-byte signed integer, which limits the maximum size of a file to 2 GiB. Using LFS interfaces and the 64-bit file offset type, the maximum size of a file may be as large as 2^63 bytes, or 8 EiB. For many current platforms, large file macros or appropriate compiler flags have to be set to build a library with support for large files. This is handled automatically in netCDF 3.6.
When the netCDF format was created in 1988, 4-byte fields were reserved for file offsets, specifying where the data for each variable started relative to the beginning of the file or the start of a record boundary.
This first netCDF format variant, the only format supported in versions 3.5.1 and earlier, is referred to as the netCDF classic format. The 32-bit file offset in the classic format limits the total sizes of all but the last non-record variables in a file to less than 2 GiB, with a similar limitation for the data within each record for record variables. The netCDF classic format is also identified as version 1 or CDF1 in reference to the format label at the start of a file.
With netCDF 3.6, a second variant of netCDF format is now supported in addition to the classic format. The new variant is referred to as the 64-bit offset format, version 2, or CDF2. The primary difference from the classic format is the use of 64-bit file offsets instead of 32-bit offsets, but it also supports larger variable and record sizes.
No, version 3.6 of the netCDF library detects which variant of the format is used for each file when it is opened for reading or writing, so it is not necessary to know which variant of the format is used. The version of the format will be preserved by the library on writing. If you want to modify a classic format file to use the 64-bit offset format so you can make it much larger, you will have to create a new file and copy the data to it.
Yes, the 3.6 library and all planned future versions of the library will continue to support reading and writing files using the classic (32-bit offset) format as well as the new 64-bit offset format. There is no need to convert existing archives from the classic to the 64-bit offset format. Even netCDF-4, which will introduce a third variant of the netCDF format based on HDF5, will continue to support accessing classic format netCDF files as well as 64-bit offset netCDF files.
No, we discourage users from making use of the new format unless they need it for very large files. It may be some time until third-party software that uses the netCDF library is upgraded to 3.6 or later versions that support the new large file facilities, so we advise continuing to use the classic netCDF format for data that doesn't require huge file offsets. The library makes this recommendation easy to follow, since the default for file creation is the classic format.
The short answer is that under most circumstances, you should not care, if you use version 3.6.0 or later of the netCDF library. But the difference is indicated in the first four bytes of the file, which are 'C', 'D', 'F', '\001' for the classic netCDF format and 'C', 'D', 'F', '\002' for the new 64-bit offset format. On a Unix system, one way to display the first four bytes of a file, say foo.nc, is to run the following command:
od -An -c -N4 foo.nc

which will output

   C   D   F 001

or

   C   D   F 002

depending on whether foo.nc is a classic or 64-bit offset netCDF file, respectively.
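The same magic-number check can be sketched in a few lines of C using only the standard library; the function name and return convention here are illustrative, not part of the netCDF API:

```c
#include <stdio.h>

/* Classify a file by its netCDF magic number (illustrative helper,
   not part of the netCDF API).  Returns 1 for classic format (CDF1),
   2 for 64-bit offset format (CDF2), and 0 for anything else. */
int netcdf_format_variant(const char *path) {
    unsigned char magic[4];
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return 0;
    size_t n = fread(magic, 1, 4, f);
    fclose(f);
    if (n != 4 || magic[0] != 'C' || magic[1] != 'D' || magic[2] != 'F')
        return 0;
    /* The fourth byte is the format version: 1 = classic, 2 = 64-bit offset. */
    return (magic[3] == 1 || magic[3] == 2) ? magic[3] : 0;
}
```

This reads only the first four bytes, so it is cheap to run even on very large files.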
The application will indicate an error trying to open the file and present an error message equivalent to "not a netCDF file". This is why it's a good idea not to create 64-bit offset netCDF files until you actually need them.
Yes, by specifying the appropriate file creation flag you can create 64-bit offset netCDF files the same way on 32-bit platforms as on 64-bit platforms.
With netCDF version 3.6.0 or later, use the NC_64BIT_OFFSET flag when you call nc_create(), as in:
err = nc_create("foo.nc", NC_NOCLOBBER | NC_64BIT_OFFSET, &ncid);
In Fortran-77, use the NF_64BIT_OFFSET flag when you call nf_create(), as in:
iret = nf_create('foo.nc', IOR(NF_NOCLOBBER,NF_64BIT_OFFSET), ncid)
In Fortran-90, use the NF90_64BIT_OFFSET flag when you call nf90_create(), as in:

iret = nf90_create(path="foo.nc", cmode=ior(nf90_clobber, nf90_64bit_offset), ncid=ncFileID)
In C++, use the Offset64Bits enum in the NcFile constructor, as in:
NcFile nc("foo.nc", NcFile::New, NULL, 0, NcFile::Offset64Bits);
A new flag, '-v', has been added to ncgen to specify the file format variant. By default or if '-v 1' or '-v classic' is specified, the generated file will be in netCDF classic format. If '-v 2' or '-v 64-bit-offset' is specified, the generated file will use the new 64-bit offset format. To permit creating very large files quickly, another new ncgen flag, '-x', has been added to specify use of nofill mode when generating the netCDF file.
No, there are still some limits on sizes of netCDF objects, even with the new 64-bit offset format. Each fixed-size variable and the data for one record's worth of a record variable are limited in size to a little less than 4 GiB, which is twice the size limit in versions earlier than netCDF 3.6.
The maximum number of records remains 2^32 - 1 (4,294,967,295).
While most platforms support a 64-bit file offset, many platforms only support a 32-bit size for allocated memory blocks, array sizes, and memory pointers. In C developers' jargon, these platforms have a 64-bit off_t type for file offsets, but a 32-bit size_t type for the sizes of arrays. Changing netCDF to assume a 64-bit size_t would make it suitable only for 64-bit platforms.
We expect to be able to remove remaining variable size constraints with netCDF-4 using the HDF5 format, but that won't be released until mid-2005.
There are several possible reasons why creating a large file can fail that are not related to the netCDF library:
dd if=/dev/zero bs=1000000 count=3000 of=./test

which should write a 3 GB file named "test" in the current directory.
If you get the netCDF library error "One or more variable sizes violate format constraints", you are trying to define a variable larger than permitted for the file format variant. This error typically occurs when leaving "define mode" rather than when defining a variable. The error cannot necessarily be determined when a variable is first defined, because the last fixed-size variable is permitted to be larger than other fixed-size variables when there are no record variables. Similarly, the last record variable may be larger than other record variables. This means that subsequently adding a small variable to an existing file may be invalid, because it makes what was previously the last variable now in violation of the format size constraints. For details on the format size constraints, see the Users Guide sections NetCDF Classic Format Limitations and NetCDF 64-bit Offset Format Limitations.
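The rule described above can be sketched as a small validity check. This is a simplified illustration, not the library's actual code, and the 2 GiB bound is illustrative of the classic-format case:

```c
#include <stddef.h>

/* Illustrative bound: a little under 2 GiB for classic-format files. */
#define CLASSIC_VAR_LIMIT 2147483647LL

/* Sketch of the format size rule: every fixed-size variable must stay
   under the limit, except that the LAST fixed-size variable is exempt,
   and only when the file has no record variables. */
int fixed_var_sizes_ok(const long long *sizes, size_t nvars,
                       int has_record_vars) {
    for (size_t i = 0; i < nvars; i++) {
        int exempt = (i == nvars - 1) && !has_record_vars;
        if (!exempt && sizes[i] > CLASSIC_VAR_LIMIT)
            return 0;  /* "variable sizes violate format constraints" */
    }
    return 1;
}
```

This makes the surprising case concrete: a file whose last fixed-size variable is 5 GB passes the check, but adding a small variable after it means the 5 GB variable is no longer last, so the very same sizes now fail.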
If you get the netCDF library error "Invalid dimension size", you are exceeding the size limit of netCDF dimensions, which must be less than 2,147,483,644 for classic files with no large file support and otherwise less than 4,294,967,292.
No, except that 32-bit applications should link with a 32-bit version of the library and 64-bit applications should link with a 64-bit library, similarly to use of other libraries that can support either a 32-bit or 64-bit model of computation.
No, classic files created with the new library should be compatible with all older applications, both for reading and writing, with one minor exception. The exception is due to a correction of a netCDF bug that prevented creating records larger than 4 GiB in classic netCDF files with software linked against versions 3.5.1 and earlier. This limitation in total record size was not a limitation of the classic format, but an unnecessary restriction due to the use of too small a type in an internal data structure in the library. If you want to always make sure your classic netCDF files are readable by older applications, make sure you don't exceed 4 GiB for the total size of a record's worth of data. (All records are the same size, computed by adding the size for a record's worth of each record variable, with suitable padding to make sure each record begins on a byte boundary divisible by 4.)
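The record-size computation described in the parenthetical can be sketched as follows; this is an illustration of the stated rule, not the library's code:

```c
#include <stddef.h>

/* Round a byte count up to the next multiple of 4, the padding rule
   described above for aligning each variable's data within a record. */
long long pad4(long long nbytes) {
    return (nbytes + 3) / 4 * 4;
}

/* Total size of one record: one record's worth of each record
   variable, each padded to a 4-byte boundary. */
long long record_size(const long long *bytes_per_record, size_t nvars) {
    long long total = 0;
    for (size_t i = 0; i < nvars; i++)
        total += pad4(bytes_per_record[i]);
    return total;
}
```

To keep a classic file readable by applications linked against netCDF 3.5.1 or earlier, keep this total at or below 4 GiB.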