Re: timing HDF5 1.6.1 code...

Ed,

> In netcdf the task is the reading (and, less importantly, writing) of
> large 2/3/4 dimensional arrays of floats/doubles/longs. By large we
> mean thousands of records of size range on the order of 100 for each
> other dimension. 

Small clarification: there is no netCDF type corresponding to long
(64-bit) integers.  The type "NC_LONG" is a deprecated synonym for
"NC_INT".

> For example, a 4D file with 2000 records, each of size 100x100x100,
> would have a total size of about 2GB, the maximum now available under
> 32-bit netcdf-3.x.

Just to clarify something not related to your benchmarking, if you are
discussing float or int data, you need another factor of 4, so the
above would be a total size of about 8GB.  That's OK, because the
netCDF-3.x 32-bit limit on offset size within a record does not limit
the number of records, so you can write an 80 GB netCDF file, for
example, using 20000 records, each with 100x100x100 floats.  This
assumes you are writing to a file system that supports large files.
The C and Fortran Users Guides currently incorrectly state that
netCDF files are limited to 2 Gbytes in size, but that is corrected in
the Fortran-90 Users Guide and in the FAQ "Is it possible to create
netCDF files larger than 2 Gbytes?":

  http://www.unidata.ucar.edu/packages/netcdf/faq.html#lfs
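
The arithmetic above can be checked with a short sketch. The helper names are hypothetical, and the 2^31-byte per-record cap is only an illustrative assumption standing in for the "32-bit limit on offset size within a record", not a statement of the exact netCDF-3.x rule.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Bytes in one record: product of the non-record dimension lengths
 * times the element size (4 for float or int, 8 for double). */
uint64_t record_bytes(uint64_t nx, uint64_t ny, uint64_t nz,
                      uint64_t elem_size)
{
    return nx * ny * nz * elem_size;
}

/* Data bytes for all records, ignoring the file header. */
uint64_t file_bytes(uint64_t nrecords, uint64_t rec_bytes)
{
    return nrecords * rec_bytes;
}

/* ASSUMPTION (illustration only): a record is acceptable if it is
 * smaller than 2^31 bytes. The number of records is not checked,
 * which is the point made in the text. */
int record_fits_32bit(uint64_t rec_bytes)
{
    return rec_bytes < (UINT64_C(1) << 31);
}

/* Print the two cases discussed in the message. */
void print_sizes(void)
{
    uint64_t rec = record_bytes(100, 100, 100, sizeof(float));
    assert(record_fits_32bit(rec));
    printf("record:        %llu bytes\n", (unsigned long long)rec);
    printf("2000 records:  %llu bytes\n",
           (unsigned long long)file_bytes(2000, rec));
    printf("20000 records: %llu bytes\n",
           (unsigned long long)file_bytes(20000, rec));
}
```

Calling print_sizes() from a main() reports 4,000,000 bytes per record, 8,000,000,000 bytes for 2000 records, and 80,000,000,000 bytes for 20000 records, on a platform where float is 4 bytes.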

Related to this, we will soon make available a version of netCDF that
replaces the 32-bit offsets with 64-bit offsets in a backward
compatible way, since it is already being used in the parallel netCDF
effort.  The 64-bit version was developed and tested by Greg Sjaardema
of Sandia and adopted by the pnetcdf group.  We have incorporated the
mods for this into netCDF 3.6.0-alpha.  This is not meant to compete
with netCDF-4, but merely to provide an alternative for large files
until we develop and release netCDF-4.

--Russ
From owner-netcdf-hdf@xxxxxxxxxxxxxxxx 16 Dec 2003 08:38:57 -0700
Message-ID: <wrxwu8w3h7y.fsf@xxxxxxxxxxxxxxxxxxxxxxx>
Date: 16 Dec 2003 08:38:57 -0700
From: Ed Hartnett <ed@xxxxxxxxxxxxxxxx>
To: netcdf-hdf@xxxxxxxxxxxxxxxx
Subject: And now for something completely different...interface for the new
        netcdf-4 feature of multiple dimensions.
Organization: UCAR/Unidata

At the risk of overwhelming everyone with email, here's the first cut
of the netcdf-4 interface for handling multiple unlimited dimensions.
Hopefully MTV did not disable *everyone's* attention span.

Current Interface (relevant prototypes)
---------------------------------------

EXTERNL int
nc_inq(int ncid, int *ndimsp, int *nvarsp, int *nattsp, int *unlimdimidp);

EXTERNL int 
nc_inq_unlimdim(int ncid, int *unlimdimidp);

EXTERNL int
nc_def_dim(int ncid, const char *name, size_t len, int *idp);

EXTERNL int
nc_inq_dim(int ncid, int dimid, char *name, size_t *lenp);

EXTERNL int 
nc_inq_dimname(int ncid, int dimid, char *name);

EXTERNL int 
nc_inq_dimlen(int ncid, int dimid, size_t *lenp);

EXTERNL int
nc_rename_dim(int ncid, int dimid, const char *name);

Changes for NETCDF-4
--------------------

I suggest we add one function:

EXTERNL int 
nc_inq_unlimdims(int ncid, int *nunlimdimsp, int *unlimdimidsp);

This function will return the number of unlimited dimensions in
nunlimdimsp, and an array of their ids in unlimdimidsp.

Meanwhile, nc_inq_unlimdim will return the first defined unlimited
dimension, for backward compatibility.
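
The proposed semantics can be sketched against a mock dimension table. Everything prefixed sketch_ is hypothetical and is not part of any netCDF release. The sketch also assumes that either output pointer may be NULL (so a caller can ask for the count first) and that -1 is reported when no unlimited dimension exists; neither detail is specified in the proposal above.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NOERR 0
#define SKETCH_UNLIMITED 0  /* length 0 marks an unlimited dimension,
                               mirroring NC_UNLIMITED */

/* Hypothetical in-memory stand-in for a file's dimension table. */
typedef struct {
    const char *name;
    size_t len;
} sketch_dim;

typedef struct {
    const sketch_dim *dims;
    int ndims;
} sketch_file;

/* Mock of the proposed nc_inq_unlimdims: count the unlimited
 * dimensions and, if an array is supplied, store their ids in
 * definition order. ASSUMPTION: NULL output pointers are skipped. */
int sketch_inq_unlimdims(const sketch_file *f, int *nunlimdimsp,
                         int *unlimdimidsp)
{
    int n = 0;
    for (int i = 0; i < f->ndims; i++) {
        if (f->dims[i].len == SKETCH_UNLIMITED) {
            if (unlimdimidsp)
                unlimdimidsp[n] = i;
            n++;
        }
    }
    if (nunlimdimsp)
        *nunlimdimsp = n;
    return SKETCH_NOERR;
}

/* Mock of nc_inq_unlimdim under the proposal: report only the first
 * defined unlimited dimension. ASSUMPTION: -1 means "none". */
int sketch_inq_unlimdim(const sketch_file *f, int *unlimdimidp)
{
    int id = -1;
    for (int i = 0; i < f->ndims; i++) {
        if (f->dims[i].len == SKETCH_UNLIMITED) {
            id = i;
            break;
        }
    }
    if (unlimdimidp)
        *unlimdimidp = id;
    return SKETCH_NOERR;
}
```

Under these assumptions a caller would first pass NULL for the id array to learn the count, allocate an array of that size, and then call again to fill it. Old code that calls only the single-dimension function keeps working and sees the first unlimited dimension.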

Any suggestions, corrections, rejections, projections, dejections,
directions, restrictions, or digressions are welcome.

Thanks!

Ed

From owner-netcdf-hdf@xxxxxxxxxxxxxxxx 16 Dec 2003 08:42:33 -0700
Message-ID: <wrxr7z43h1y.fsf@xxxxxxxxxxxxxxxxxxxxxxx>
Date: 16 Dec 2003 08:42:33 -0700
From: Ed Hartnett <ed@xxxxxxxxxxxxxxxx>
In-Reply-To: <200312161535.hBGFZbp2002281@xxxxxxxxxxxxxxxx>
To: Russ Rew <russ@xxxxxxxxxxxxxxxx>
Subject: Re: timing HDF5 1.6.1 code...
Organization: UCAR/Unidata
Cc: netcdf-hdf@xxxxxxxxxxxxxxxx
References: <200312161535.hBGFZbp2002281@xxxxxxxxxxxxxxxx>

Russ Rew <russ@xxxxxxxxxxxxxxxx> writes:

> Ed,
> 
> > In netcdf the task is the reading (and, less importantly, writing) of
> > large 2/3/4 dimensional arrays of floats/doubles/longs. By large we
> > mean thousands of records of size range on the order of 100 for each
> > other dimension. 
> 
> Small clarification: there is no netCDF type corresponding to long
> (64-bit) integers.  The type "NC_LONG" is a deprecated synonym for
> "NC_INT".

Yes, that's what I mean by long too, just like the C language
(usually). A long is a 4 byte int, just like an int.

A 64-bit int I would call a long long, as in the following declaration
I'm dying to use someday:

long long ago; /* in a galaxy far far away. */
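
Since the "(usually)" above depends on the platform, a small sketch can report the actual widths. The function names are hypothetical. C itself guarantees only that long has at least 32 bits and long long at least 64, so long may well be 8 bytes on 64-bit Unix systems.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Width in bits of each integer type on the compiling platform. */
int bits_of_int(void)       { return (int)(sizeof(int) * CHAR_BIT); }
int bits_of_long(void)      { return (int)(sizeof(long) * CHAR_BIT); }
int bits_of_long_long(void) { return (int)(sizeof(long long) * CHAR_BIT); }

/* Print the widths; int32_t and int64_t are the portable way to ask
 * for exactly 32 or 64 bits where the platform provides them. */
void print_integer_widths(void)
{
    assert(bits_of_long() >= 32);
    assert(bits_of_long_long() >= 64);
    printf("int:       %d bits\n", bits_of_int());
    printf("long:      %d bits\n", bits_of_long());
    printf("long long: %d bits\n", bits_of_long_long());
    printf("int32_t:   %d bits\n", (int)(sizeof(int32_t) * CHAR_BIT));
    printf("int64_t:   %d bits\n", (int)(sizeof(int64_t) * CHAR_BIT));
}
```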

> 
> > For example, a 4D file with 2000 records, each of size 100x100x100,
> > would have a total size of about 2GB, the maximum now available under
> > 32-bit netcdf-3.x.
> 
> Just to clarify something not related to your benchmarking, if you are
> discussing float or int data, you need another factor of 4, so the
> above would be a total size of about 8GB.  That's OK, because the

Duh! I knew I was making a stupid mistake there somewhere!

> netCDF-3.x 32-bit limit on offset size within a record does not limit
> the number of records, so you can write an 80 GB netCDF file, for
> example, using 20000 records, each with 100x100x100 floats.  This
> assumes you are writing to a file system that supports large files.
> The C and Fortran Users Guides currently incorrectly state that
> netCDF files are limited to 2 Gbytes in size, but that is corrected in
> the Fortran-90 Users Guide and in the FAQ "Is it possible to create
> netCDF files larger than 2 Gbytes?":
> 
>   http://www.unidata.ucar.edu/packages/netcdf/faq.html#lfs

Interesting! I didn't actually know that.

Ed