Re: On Largefile support in netCDF



>To: address@hidden
>From: Siddhartha S Ghosh <address@hidden>
>Subject: Re: 20041001: On Largefile support in netCDF
>Organization: NCAR SCD
>Keywords: LFS, _LARGE_FILE, _FILE_OFFSET_BITS, off_t

Hi Sid,

> ... [consider]
> The large-file support (file>2GB) both in 32 and 64
> bit version by default i.e. without users having to
> specify any MACRO.
> 
> Up till now, I do not know what is the advantage of
> restricting it to files of size less than 2GB as every
> platform I know supports 64-bit integers. And the small
> increase in size of metadata anyway is negligible in
> comparison to any application requirements.
> 
> In case there is some advantage in some small set of
> platforms I would suggest that for those a MACRO option
> be given for restricting the filesize to 2GB.

You're right, Large File Support is now the norm (although as far as
I know it's only enabled by default on FreeBSD and Mac OS X), and we
have made it the default in netCDF 3.6.0 (available in the latest beta
release).  It's not just a question of having 64-bit integers, but of
having a 64-bit off_t type for C programs even when ints, pointers and
size_t are all still 32 bits.  That combination, often referred to as
the ILP32_OFFBIG model, is the most common programming model, for
example on Linux platforms on 32-bit processors.
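
To make the distinction concrete, here is a rough sketch (not code
from netCDF) that names the programming model from the widths of int,
long, pointers and off_t.  The helper names model_name() and
current_model() are made up for illustration, and defining
_FILE_OFFSET_BITS before the system headers is only assumed to have an
effect on platforms that honor it:

```c
#define _POSIX_C_SOURCE 200809L   /* expose off_t under strict modes */
#define _FILE_OFFSET_BITS 64      /* request 64-bit off_t where needed */

#include <assert.h>
#include <limits.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: name the POSIX programming model that matches
   the given type widths (all in bits). */
const char *model_name(unsigned int_bits, unsigned long_bits,
                       unsigned ptr_bits, unsigned off_bits)
{
    /* Every POSIX programming model has an off_t of at least 32 bits. */
    assert(off_bits >= 32);

    if (int_bits == 32 && long_bits == 32 && ptr_bits == 32)
        return off_bits >= 64 ? "ILP32_OFFBIG" : "ILP32_OFF32";
    if (int_bits == 32 && long_bits == 64 && ptr_bits == 64 &&
        off_bits == 64)
        return "LP64_OFF64";
    return "other";
}

/* Hypothetical helper: report the model this file was compiled for. */
const char *current_model(void)
{
    return model_name((unsigned)(sizeof(int) * CHAR_BIT),
                      (unsigned)(sizeof(long) * CHAR_BIT),
                      (unsigned)(sizeof(void *) * CHAR_BIT),
                      (unsigned)(sizeof(off_t) * CHAR_BIT));
}
```

Calling current_model() from a small main() and printing the result
shows which model a given set of compiler flags actually produced.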

It's also necessary to be compatible with other libraries, some of
which are only available for the ILP32_OFFBIG model.  Chaos happens if
you link two libraries into an application, one of which assumes
integers and pointers are 32 bits and the other assumes they are 64
bits.  A consistent model has to be used for compiling and linking all
the pieces of an application:

  Conforming applications shall not attempt to link together object
  files compiled for different programming models. Applications shall
  also be aware that binary data placed in shared memory or in files
  might not be recognized by applications built for other programming
  models. 
  http://www.opengroup.org/onlinepubs/000095399/utilities/c99.html

Finally, there are generally two different choices of programming
models that have 64-bit file offsets, and although we use one as the
default we have to allow easy configuration to use the other model.
These correspond to the ILP32_OFFBIG model and the LP64_OFF64 model.
For example on Solaris, to get the first, you have to compile with
-D_FILE_OFFSET_BITS=64 whereas you get the second by instead compiling
with -xarch=v9.  Similarly on AIX you get the first by unsetting the
environment variable OBJECT_MODE or setting it to 32 and compiling
with -D_LARGE_FILES, whereas you get the second by setting
OBJECT_MODE=64 and compiling with -D_LARGE_FILES.  (By the way, it
seems like a bug to me that if on AIX you just set OBJECT_MODE=64 and
forget to use -D_LARGE_FILES, off_t is 64 bits but the wrong lseek()
function gets linked in so you can't actually write large files.)
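
Since a 64-bit off_t alone does not guarantee that large files work, a
runtime check can be more telling than sizeof.  The following is a
minimal sketch, assuming a POSIX system with a writable /tmp; the
function name can_seek_past_2gib() and the scratch file name are made
up for illustration and are not part of netCDF:

```c
#define _POSIX_C_SOURCE 200809L   /* expose mkstemp(), lseek(), off_t */
#define _FILE_OFFSET_BITS 64      /* request 64-bit off_t where needed */

#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a seek past 2 GiB works, 0 if it
   does not, and -1 if the scratch file could not be created. */
int can_seek_past_2gib(void)
{
    char path[] = "/tmp/lfs_check_XXXXXX";   /* assumed scratch location */
    off_t target, got;
    int fd;

    if (sizeof(off_t) * CHAR_BIT < 64)
        return 0;                 /* off_t itself is too narrow */

    target = (off_t)3 << 30;      /* 3 GiB, beyond the old 2 GiB limit */
    assert(target > (off_t)INT_MAX);

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                 /* file goes away once fd is closed */

    /* Seeking past end-of-file writes nothing, so no disk space is used. */
    got = lseek(fd, target, SEEK_SET);
    close(fd);
    return got == target;
}
```

A result of 0 on a build where off_t is 64 bits would point to the
kind of mismatch described above, where the offset type is wide
enough but the seek that gets linked in is not.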

We are choosing the ILP32_OFFBIG model as a default but documenting
how to compile for the LP64_OFF64 model, which I hope is consistent
with what you suggest.

Thanks for the feedback!

--Russ