
[netCDF #NPP-336771]: Netcdf 4 question, parallel IO



Hi Annette,
 
> ... I have a question about large files (really huge files, actually,
> being created using parallel IO). I tried to write a big file yesterday,
> and bumped into a problem. My code was trying to create this field:
> 
> pressure(time, cells, layers)
> where:        cells dimension = 41943042
> layers dimension = 25
> The call to nf90_put_var() for this field generated a netcdf error.
> Ultimately, I need to be generating considerably larger fields than this
> example with netcdf4.

So assuming pressure is double rather than float, each record's worth
of pressure data would be about 8.4 GB.  That should not be a problem
if you are using one of the two netCDF-4 HDF5-based formats (either
netCDF-4 or netCDF-4 classic model).  However, in the netCDF classic
or 64-bit-offset formats, record variables are restricted to 4 GiB per
record but may have up to 2**32 records.
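
As a quick check of the arithmetic above, here is a small Python sketch
(not netCDF code) that computes the size of one record of pressure from
the dimensions quoted in the question.  The 8-byte and 4-byte element
widths are the assumption already stated above, and the limit used is
just the 4 GiB per-record figure for the classic and 64-bit-offset
formats; the names are made up for illustration.

```python
# Size of one record of pressure(time, cells, layers), using the
# dimension sizes quoted in the question.
CELLS = 41943042
LAYERS = 25

# Per-record limit cited above for classic / 64-bit-offset record variables.
CLASSIC_RECORD_LIMIT = 4 * 2**30  # 4 GiB


def record_bytes(bytes_per_value):
    """Bytes needed for one record, given the width of one value."""
    return CELLS * LAYERS * bytes_per_value


# Assumed element widths: 8 bytes for double, 4 bytes for float.
for type_name, width in (("double", 8), ("float", 4)):
    size = record_bytes(width)
    verdict = "exceeds" if size > CLASSIC_RECORD_LIMIT else "is within"
    print(f"{type_name}: {size} bytes per record, {verdict} 4 GiB")
```

With doubles, one record is well over the 4 GiB figure, which is why
one of the HDF5-based formats is needed for this variable.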

To use one of the two HDF5-based formats, you should create the file
in one of the following two ways:

  nf90_create(path, nf90_hdf5, ncid) 
or
  nf90_create(path, ior(nf90_hdf5, nf90_classic_model), ncid)

For more information about remaining size limits of netCDF files, see
this FAQ:

  "Have all size limits been eliminated?"
  http://www.unidata.ucar.edu/netcdf/docs/faq.html#Large%20File%20Support10

> My fortran 90 code uses integers to specify the start/count arguments to
> the nf90_put_var() function. Don't I need 8 byte integers for this?
> (But, my code won't compile correctly if I specify 8 byte integers for
> the start/count parameters.)

No, the start and count arguments for the Fortran functions can only
be 32-bit ints, because the Fortran functions call the C functions,
which also only accept 32-bit ints for dimension indices.  Note that
this doesn't restrict the total size of a multidimensional variable:
with one 32-bit index along each of several dimensions, you can
address far more data than a single 32-bit index could.
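
To make that point concrete, here is a small Python sketch (plain
arithmetic, not netCDF code).  It assumes a default Fortran integer is a
signed 4-byte value, uses the cells and layers sizes from the question,
and adds a hypothetical time dimension of 10 records purely for
illustration.

```python
# Largest value of a signed 4-byte integer (assumed default Fortran integer).
INT32_MAX = 2**31 - 1

# cells and layers come from the question; the time length of 10 is a
# hypothetical value chosen only to illustrate the point.
dims = {"time": 10, "cells": 41943042, "layers": 25}


def fits_int32(n):
    """True if n can be held in a signed 4-byte integer."""
    return 0 < n <= INT32_MAX


total_values = 1
for dim_name, dim_size in dims.items():
    total_values *= dim_size
    print(f"{dim_name}: size {dim_size}, fits in 4 bytes: {fits_int32(dim_size)}")

print(f"total values: {total_values}, fits in 4 bytes: {fits_int32(total_values)}")
```

Each start/count entry stays within 4 bytes even though the total number
of values in the variable does not.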

It's partly a portability issue, because data in a portable netCDF file
should be accessible on a 32-bit platform as well as a 64-bit platform.
So netCDF 64-bit-offset format files can be written or read on 32-bit
platforms through 32-bit interfaces to 32-bit libraries.  Here's more
information:

  "Why are variables still limited in size?"
  http://www.unidata.ucar.edu/netcdf/docs/faq.html#Large%20File%20Support11

The parallel-netCDF group at ANL has nearly finished work on a new
variant of the 64-bit-offset format that will allow 64-bit indices and
dimension sizes, making a fifth netCDF format available, but it will
take us some time to provide support for that when it's released.
Here is information from that group on the latest developments and
status:

  http://trac.mcs.anl.gov/projects/parallel-netcdf/wiki/NewFileFormat
 
> We had compiled netcdf4 incorrectly, and discovered that we needed to
> specify this option:
> 
> -D_FILE_OFFSET_BITS=64
> 
> So, we've rebuilt netcdf4 with this option.
> 
> My application is written in Fortran 90. In my code, the variables that
> hold the start and count (arguments to the nf90_put_var() function) are
> integers (4 bytes by default). This is because the arguments to
> nf90_put_var() are integers. Is this going to work for very large files?
> (I don't think so.)

The size of each dimension of a netCDF variable in a classic or
64-bit-offset file must fit in 32 bits, but you can have a large
number of dimensions.

> Should I compile my code with an option that requires integers to be 8
> bytes?  If I do that, will netcdf4 need to be compiled differently?

No, it won't really help to require Fortran integers to be 8 bytes,
because of the underlying C library assumptions, unless you are using
one of the two HDF5-based formats.

--Russ

Russ Rew                                         UCAR Unidata Program
address@hidden                     http://www.unidata.ucar.edu



Ticket Details
===================
Ticket ID: NPP-336771
Department: Support netCDF
Priority: Normal
Status: Closed