RE: 4GigB variable size limit

> Are there important use cases for needing a single dimension length
> greater than 2^32?  An example of this much data would be a time
> series of a single measurement taken every 0.01 seconds for more than
> 500 days.

Or a digital scope sampling at 500 MHz for a little under 9 seconds!
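
As a rough check on both examples, the sketch below computes how long continuous sampling takes to reach 2^32 values along one dimension. The helper name `seconds_to_exceed` and the constant `DIM_LIMIT` are made up for illustration; only the 2^32 figure and the two sampling rates come from this thread.

```python
# Dimension length discussed in the thread: 2^32 values along one axis.
DIM_LIMIT = 2**32

def seconds_to_exceed(sample_rate_hz: float, limit: int = DIM_LIMIT) -> float:
    """Seconds of continuous sampling needed to collect `limit` samples."""
    return limit / sample_rate_hz

scope_seconds = seconds_to_exceed(500e6)       # digital scope at 500 MHz
series_days = seconds_to_exceed(100) / 86400   # one sample every 0.01 s

print(f"500 MHz scope: {scope_seconds:.2f} s")
print(f"0.01 s time series: {series_days:.1f} days")
```

Under these assumptions the scope reaches 2^32 samples after roughly 8.6 seconds, and the 0.01-second time series after roughly 497 days.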

In reality, for my needs (tomography data sets), supporting >4GB in a
single multidimensional variable would be enough.  Each individual
dimension can stay below 2^32.
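
To make the distinction concrete, here is a minimal sketch that separates the two limits: the length of each dimension versus the total size of the variable. The function `describe` and both constants are illustrative names, not part of any netCDF API; the 2^32 value is the nominal 4 GiB figure from this thread, and the format's exact bound may differ slightly. The 2048^3 shape is a hypothetical tomography volume, not a measured data set.

```python
from math import prod

MAX_DIM_LEN = 2**32 - 1   # dimension size limit cited in the quoted message
MAX_VAR_BYTES = 2**32     # nominal 4 GiB variable limit (assumed exact bound)

def describe(shape, bytes_per_value):
    """Report total size and whether each limit is respected for one variable."""
    total = prod(shape) * bytes_per_value
    return {
        "total_bytes": total,
        "dims_ok": all(n <= MAX_DIM_LEN for n in shape),
        "under_4gib": total < MAX_VAR_BYTES,
    }

# 4-byte values (float or int) on the grids mentioned in the quoted message
print(describe((1000, 1000, 1000), 4))
print(describe((1023, 1023, 1023), 4))
# Hypothetical tomography volume: every dimension is small, the variable is not
print(describe((2048, 2048, 2048), 4))
```

In the last case every dimension is far below 2^32 while the variable as a whole is 32 GiB, which is the situation described above.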

Mark


> -----Original Message-----
> From: owner-netcdfgroup@xxxxxxxxxxxxxxxx 
> [mailto:owner-netcdfgroup@xxxxxxxxxxxxxxxx] On Behalf Of Russ Rew
> Sent: Wednesday, May 02, 2007 12:00 PM
> To: Katie Antypas
> Cc: netcdfgroup@xxxxxxxxxxxxxxxx
> Subject: Re: 4GigB variable size limit 
> 
> Hi,
> 
> Katie Antypas <kantypas@xxxxxxx> wrote:
> > I'm jumping into the discussion late here, but coming from a
> > perspective of trying to find and develop an IO strategy which will
> > work at the petascale level, the 4 GigB variable size limitation is
> > a major barrier.  Already a 1000^3 grid variable cannot fit into a
> > single netcdf variable.  Users at NERSC and other supercomputing
> > centers regularly run problems of this size or greater and IO
> > demands are only going to get bigger.  We don't believe chopping up
> > data structures into pieces is a good long term solution or
> > strategy.  There isn't a natural way to break up the data and
> > chunking eliminates the elegance, ease and purpose of a parallel IO
> > library.  Besides the direct code changes, analytics and
> > visualization tools become more complicated as datafiles from the
> > same simulation but of different sizes would not have the same
> > number of variables.  Restarting a simulation from a checkpoint file
> > on a different number of processors would also become more
> > convoluted.
> >
> > The view from NERSC is that if Parallel-NetCDF is to be a viable
> > option for users running large parallel simulations, this is a
> > limitation that must be lifted...
> 
> First a minor correction: a 1000^3 grid variable *can* fit into a
> single netCDF variable if it's of type float, int, or a smaller type.
> In fact a 1023^3 grid variable is still within the limits of 4GiB for
> a single variable size.  For a record variable, the size could be up
> to numrecs*1023^3, since the limit of 4 GiB is only on each record's
> worth of data for a record variable, and you could have a large number
> of variables of this size in the same netCDF file.
> 
> However, we're very sympathetic with the intent of the above request,
> to remove current 4GiB variable size limitations in the netCDF format,
> and we're discussing the possibility of this with the parallel netCDF
> developers.  It may be possible to remove such restrictions without
> changing the CDF2 format.  To do this without changing the format
> would also require that variables larger than 4GiB have more than one
> dimension, since removing the current dimension size restriction of
> 2^32-1 *would* require a format change.  Also, netCDF files created
> with large variables (> 4GiB) might not be portable to 32-bit
> platforms.
> 
> Of course another option is using netCDF-4 when it's out of beta,
> because it has no 4GiB limit on variable size.
> 
> Are there important use cases for needing a single dimension length
> greater than 2^32?  An example of this much data would be a time
> series of a single measurement taken every 0.01 seconds for more than
> 500 days.  As I mentioned above and as John Caron pointed out,
> supporting dimensions larger than 2^32 would be a much bigger deal,
> requiring a new format, making data inaccessible on 32-bit platforms,
> and even causing problems in other language interfaces such as the
> Java interface.
> 
> --Russ
> 
> _____________________________________________________________________
> 
> Russ Rew                                         UCAR Unidata Program
> russ@xxxxxxxxxxxxxxxx                     http://www.unidata.ucar.edu
> 
> ==============================================================================
> To unsubscribe netcdfgroup, visit:
> http://www.unidata.ucar.edu/mailing-list-delete-form.html
> ==============================================================================
> 
> 
