[netCDF #FFY-177157]: NetCDF-4 and 64 bit dimensions



Hi Rob,

> Over at ANL we've been testing our proposed "CDF-5" file format for
> a while and now it's time for me to get serious about porting that
> work to NetCDF.
> 
> One thing I've noticed is that NetCDF4 has relaxed variable size
> limitations, but still addresses those variables with a size_t type:
> 
> int nc_put_vara_float (int ncid, int varid,
> const size_t start[], const size_t count[],
> const float *fp);
> 
> What happens in NetCDF-4 if someone wants to create a 1D variable of 5
> GB ?

They would have to do that on a platform on which size_t can hold
values larger than 5 GB, i.e. a 64-bit platform.  In that case there's
no problem, as size_t is typically a 64-bit unsigned quantity.
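
As a rough illustration of that platform dependence, here is a minimal
sketch in plain C (no netCDF calls) that asks whether a given 64-bit
length is representable in the local size_t.  The helper name
fits_in_size_t is hypothetical and not part of any netCDF API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of netCDF: returns 1 if the 64-bit
 * length n can be represented in this platform's size_t, else 0. */
int fits_in_size_t(uint64_t n)
{
    /* SIZE_MAX is the largest value a size_t can hold on this platform. */
    return n <= (uint64_t)SIZE_MAX;
}

/* A length of 5 GB expressed as an element or byte count. */
uint64_t five_gb(void)
{
    return 5ULL * 1024 * 1024 * 1024;
}
```

On a platform with a 64-bit size_t, fits_in_size_t(five_gb()) returns 1;
with a 32-bit size_t it returns 0, which is the case where such a
variable could not be addressed.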

The resulting data would not be readable on a 32-bit platform, which
means the data would only be portable to other 64-bit platforms.  This
is an unfortunate disadvantage of supporting larger dimension sizes.
The 64-bit offset format of netCDF-3 still restricts each dimension
size to fit in 32 bits, but the netCDF-4 API and format have no such
restriction, as far as I know.

> In Parallel-NetCDF, we use an MPI_Offset type (a 64 bit type on all
> but the most ancient of platforms) in our API. The prototype for
> ncmpi_put_vara_float, for example looks like
> 
> int ncmpi_put_vara_float(int ncid, int varid,
> const MPI_Offset start[], const MPI_Offset count[],
> const float *op);
> 
> Note that I'm not speaking about the 'count' parameter here -- that's
> an entire other kettle of fish. I just mean how to describe the start
> of an access to a variable with one or more very large dimensions.
> Heck, for this example, count[] can be all 1s.
> 
> I think this example is not contrived, as the FLASH group has a
> workload where they track 4+ billion "things", and so would naturally
> use a variable with a 4+ billion dimension.

I don't understand the advantage of MPI_Offset over size_t for the
type of the count array.  For 32-bit platforms, you wouldn't want a
64-bit type for count, because no object such as an array can be
indexed by anything bigger than a size_t, which is defined as the type
used for the size of an object.

For 64-bit platforms, size_t seems ample as a type for the count[]
array.
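
To make the argument about count[] concrete, here is a minimal sketch,
assuming the count arrives as a 64-bit value and describes floats that
must land in one in-memory buffer.  The helper name
buffer_bytes_for_count is hypothetical; it is not a netCDF or
Parallel-NetCDF function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: computes the byte size of a buffer holding
 * `count` floats.  Returns 0 on success and stores the size in
 * *bytes_out; returns -1 if the size is not representable in size_t,
 * i.e. no such in-memory object can exist on this platform. */
int buffer_bytes_for_count(uint64_t count, size_t *bytes_out)
{
    /* Compare before multiplying so the check itself cannot overflow. */
    if (count > (uint64_t)(SIZE_MAX / sizeof(float)))
        return -1;
    *bytes_out = (size_t)count * sizeof(float);
    return 0;
}
```

However wide the API type for count is, a request fails this check as
soon as it exceeds what size_t can describe, so the wider type buys
nothing for the in-memory side of the transfer.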

Am I misinterpreting your question?

--Russ

Russ Rew                                         UCAR Unidata Program
address@hidden                     http://www.unidata.ucar.edu



Ticket Details
===================
Ticket ID: FFY-177157
Department: Support netCDF
Priority: Normal
Status: Closed