
Re: 20020208: netCDF question



>To: address@hidden
>From: Hugh Ellis <address@hidden>
>Subject: netCDF question
>Organization: UCAR/Unidata
>Keywords: 200202081954.g18Jsgx22705

Hi Hugh,

> We recently installed the latest version of the EPA air pollution modeling 
> system Models3 (Models3, Version 4.1).
> 
> One of the supposed benefits of this version is elimination of the 2Gb 
> filesize limit. I was wondering, however, whether that limit is still 
> present in netCDF. If not, in which netCDF version was the limit removed?

The limit is not completely eliminated, because we have not changed
the file format, only the software that reads and writes it.

Version 3.4 removed most of the obstacles to supporting large files,
and version 3.5.0 fixed a few remaining bugs in large file support.
Here's a short section on LFS that I wrote for a new revision of
the User's Guide.  I need to get this into the netCDF FAQ also:

  Large File Support 

  It is possible to write netCDF files that exceed 2 GB on platforms
  that have "Large File Support" (LFS). Such files are portable to
  other LFS platforms, but if you call nc_open on such a file on an
  older platform without LFS, you should expect a "file too large"
  error.
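
  For instance, a minimal C check for this case might look like the
  following sketch (the file name is just a placeholder, and the exact
  error status returned on a non-LFS platform may vary):

    #include <stdio.h>
    #include <netcdf.h>

    int
    main(void)
    {
        int ncid;
        /* "bigfile1.nc" is a placeholder name for a file over 2 GB */
        int status = nc_open("bigfile1.nc", NC_NOWRITE, &ncid);
        if (status != NC_NOERR) {
            /* on a platform without LFS, expect the open to fail
               with a "file too large"-style error */
            fprintf(stderr, "nc_open: %s\n", nc_strerror(status));
            return 1;
        }
        /* ... access the data as usual ... */
        nc_close(ncid);
        return 0;
    }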

  There are important constraints on the structure of large netCDF files
  that result from the 32-bit relative offsets that are part of the
  netCDF file format:

  * If you don't use the unlimited dimension, only one variable can
    exceed 2 Gbytes in size, but it can be as large as the underlying
    file system permits. It must be the last variable in the dataset,
    and the offset to the beginning of this variable must be less than
    about 2 Gbytes. For example, the structure of the data might be
    something like:

    netcdf bigfile1 { 
      dimensions: 
        x=2000; 
        y=5000; 
        z=10000; 
      variables:
        double x(x); // coordinate variables 
        double y(y); 
        double z(z); 
        double var(x, y, z); // 800 Gbytes 
    }

  * If you use the unlimited dimension, any number of record variables
    may exceed 2 Gbytes in size, as long as the offset of the start of
    each record variable within a record is less than about 2
    Gbytes. For example, the structure of the data in a 2.4 Tbyte file
    might be something like this (a C sketch for creating such a file
    follows these examples):

    netcdf bigfile2 { 
      dimensions: 
        x=2000; 
        y=5000; 
        z=10; 
        t=UNLIMITED; // 1000 records, for example 
      variables: 
        double x(x); // coordinate variables 
        double y(y); 
        double z(z); 
        double t(t); 
                     // 3 record variables, 0.8 Gbytes each per record
                     // (2.4 Gbytes per record in all)
        double var1(t, x, y, z); 
        double var2(t, x, y, z); 
        double var3(t, x, y, z); 
    }
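
  In C, defining a file with the structure of bigfile2 and writing it
  one record at a time might look roughly like the sketch below. The
  file name is a placeholder, and writing a whole 0.8-Gbyte record slab
  in a single nc_put_vara_double call is just to keep the example
  short:

    #include <stdio.h>
    #include <stdlib.h>
    #include <netcdf.h>

    #define NX 2000
    #define NY 5000
    #define NZ 10

    /* exit with a message if a netCDF call fails */
    static void
    check(int status)
    {
        if (status != NC_NOERR) {
            fprintf(stderr, "netCDF error: %s\n", nc_strerror(status));
            exit(1);
        }
    }

    int
    main(void)
    {
        int ncid, xdim, ydim, zdim, tdim;
        int xvar, yvar, zvar, tvar, var1, var2, var3;
        int dims[4];

        check(nc_create("bigfile2.nc", NC_CLOBBER, &ncid));

        check(nc_def_dim(ncid, "x", NX, &xdim));
        check(nc_def_dim(ncid, "y", NY, &ydim));
        check(nc_def_dim(ncid, "z", NZ, &zdim));
        check(nc_def_dim(ncid, "t", NC_UNLIMITED, &tdim));

        check(nc_def_var(ncid, "x", NC_DOUBLE, 1, &xdim, &xvar));
        check(nc_def_var(ncid, "y", NC_DOUBLE, 1, &ydim, &yvar));
        check(nc_def_var(ncid, "z", NC_DOUBLE, 1, &zdim, &zvar));
        check(nc_def_var(ncid, "t", NC_DOUBLE, 1, &tdim, &tvar));

        dims[0] = tdim; dims[1] = xdim; dims[2] = ydim; dims[3] = zdim;
        check(nc_def_var(ncid, "var1", NC_DOUBLE, 4, dims, &var1));
        check(nc_def_var(ncid, "var2", NC_DOUBLE, 4, dims, &var2));
        check(nc_def_var(ncid, "var3", NC_DOUBLE, 4, dims, &var3));

        check(nc_enddef(ncid));

        /* write one record of var1; a full record slab is NX*NY*NZ
           doubles (0.8 Gbytes), so real code would usually fill and
           write smaller hyperslabs instead */
        {
            size_t start[4] = {0, 0, 0, 0};     /* record 0 */
            size_t count[4] = {1, NX, NY, NZ};  /* one whole record */
            double *slab = malloc(sizeof(double) * NX * NY * NZ);
            if (slab == NULL)
                exit(1);
            /* ... fill slab with data ... */
            check(nc_put_vara_double(ncid, var1, start, count, slab));
            free(slab);
        }

        check(nc_close(ncid));
        return 0;
    }

  Defining the variables in this order fixes their offsets within each
  record; with three 0.8-Gbyte variables per record, each offset stays
  safely under the 2-Gbyte limit.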

--Russ

_____________________________________________________________________

Russ Rew                                         UCAR Unidata Program
address@hidden                     http://www.unidata.ucar.edu