Re: Preliminary HDF5 Dimension documents

Quincey Koziol wrote:

Hi all,
   I've tried to condense my design information for extending HDF5 to have
dimension scales into something that is reasonably understandable.

   This document: 
ftp://hdf.ncsa.uiuc.edu/pub/outgoing/koziol/Shareable/Shareable.html
describes the concept of shareable objects in HDF5 and is important background
for the dimension scale design, which heavily relies on it.

   This document: 
ftp://hdf.ncsa.uiuc.edu/pub/outgoing/koziol/DimScale/DimScale.html
is a very rough description of where I'm at on the dimension scale design at
the moment.  The UML class diagrams describe things as I see them currently
and the large example at the end shows how many of the objects could be
stored in an actual file.

   I've also made available a presentation that I created last year for an
earlier version of the design: 
ftp://hdf.ncsa.uiuc.edu/pub/outgoing/koziol/DimScalePresent/index.html
The terms used in the example diagrams are out of date and I've eliminated
the "tracking dimensions", but they are still useful for generally seeing how
the use cases would be stored in a file.  Also, the derived requirements are
still a good statement of what the dimension scales in HDF5 should support.

   Sorry that this is so late and so rough; I've been torn in several
directions during the last few weeks and haven't had enough time to completely
pull things together. :-/  I'll bring along printouts to share at the
HDF-EOS workshop in case Russ and Ed don't have time to get copies before
flying out.

   Quincey

Hi Quincey, some thoughts on your proposal:

1. A few notes on naming differences between the netCDF and HDF5 data models:
A netCDF *Variable* is a multidimensional array of primitive values, roughly corresponding to an HDF5 *Dataset*. A netCDF *Dimension* is a named array index. Dimensions are globally scoped, so they can be shared. A Variable specifies its dimensionality by referencing a set of Dimensions; this set corresponds to an HDF5 *Dataspace*. There is no exact HDF5 equivalent to a Dimension, as I understand it. The fact that Variables can share Dimensions adds important meaning to netCDF files. A netCDF *Coordinate Variable* is a 1D Variable whose name matches its dimension's name, and whose values are monotonic. This corresponds to your proposed *Dimension Scale*. Note that a netCDF Dimension describes array indices, whereas a Coordinate Variable / Dimension Scale describes the coordinate values assigned to each index of the corresponding Dimension.
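
To make the terminology above concrete, here is a minimal pure-Python sketch of the netCDF side of the model. It uses neither the netCDF nor the HDF5 library; the class names, the `is_coordinate_variable` helper, and the choice to treat "monotonic" as strictly increasing or strictly decreasing are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Dimension:
    """A named array index with a fixed length (hypothetical model class)."""
    name: str
    length: int


@dataclass
class Variable:
    """A multidimensional array whose shape is given by its Dimensions."""
    name: str
    dimensions: List[Dimension]
    values: List[float]  # flattened data, for illustration only

    @property
    def shape(self) -> Tuple[int, ...]:
        return tuple(d.length for d in self.dimensions)


def is_strictly_monotonic(values: List[float]) -> bool:
    # Assumption: "monotonic" is read as strictly increasing or decreasing.
    pairs = list(zip(values, values[1:]))
    return all(a < b for a, b in pairs) or all(a > b for a, b in pairs)


def is_coordinate_variable(var: Variable) -> bool:
    """1D, named after its own Dimension, with monotonic values."""
    return (
        len(var.dimensions) == 1
        and var.name == var.dimensions[0].name
        and is_strictly_monotonic(var.values)
    )


# Two Dimensions, defined once and referenced by several Variables.
time = Dimension("time", 3)
lat = Dimension("lat", 2)

time_var = Variable("time", [time], [0.0, 6.0, 12.0])        # coordinate variable
temperature = Variable("temperature", [time, lat], [0.0] * 6)
pressure = Variable("pressure", [time, lat], [0.0] * 6)
```

Here `temperature` and `pressure` reference the same `time` object rather than merely having the same extent, which is the sharing that the paragraph above says carries meaning.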

2. So, generally I like your Dimension Scale proposal. The main things we need are: 1) shared Dimensions even when there is no coordinate variable (perhaps a Dimension Scale without the values?); 2) each Dimension Scale must have a name; and 3) a Variable/Dataset can specify its dimensionality/Dataspace by listing the Dimensions (or their names).
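
The three requirements above could be sketched roughly as follows. This is plain Python, not the HDF5 API; `ScaleRegistry`, `define_scale`, `define_dataset`, and `datasets_using` are hypothetical names invented for this illustration, and nothing here reflects how the proposal would actually store these objects in a file.

```python
from typing import Dict, List, Optional, Tuple


class DimensionScale:
    """A named dimension; coordinate values are optional (requirements 1 and 2)."""

    def __init__(self, name: str, length: int,
                 values: Optional[List[float]] = None):
        if not name:
            raise ValueError("every dimension scale must have a name")
        if values is not None and len(values) != length:
            raise ValueError("values must match the dimension length")
        self.name = name
        self.length = length
        self.values = values  # None means "a scale without the values"


class ScaleRegistry:
    """Hypothetical file-level container of shared scales and datasets."""

    def __init__(self) -> None:
        self.scales: Dict[str, DimensionScale] = {}
        self.datasets: Dict[str, Tuple[str, ...]] = {}

    def define_scale(self, name: str, length: int,
                     values: Optional[List[float]] = None) -> None:
        self.scales[name] = DimensionScale(name, length, values)

    def define_dataset(self, name: str, dim_names: List[str]) -> None:
        # Requirement 3: dimensionality is given by listing dimension names.
        for dim in dim_names:
            if dim not in self.scales:
                raise KeyError(f"unknown dimension: {dim}")
        self.datasets[name] = tuple(dim_names)

    def shape(self, dataset: str) -> Tuple[int, ...]:
        return tuple(self.scales[d].length for d in self.datasets[dataset])

    def datasets_using(self, scale_name: str) -> List[str]:
        return sorted(n for n, dims in self.datasets.items()
                      if scale_name in dims)


registry = ScaleRegistry()
registry.define_scale("time", 3, [0.0, 6.0, 12.0])  # has coordinate values
registry.define_scale("station", 4)                 # shared, but no values
registry.define_dataset("temperature", ["time", "station"])
registry.define_dataset("humidity", ["time", "station"])
```

In this sketch the `station` scale carries no coordinate values, yet both datasets still name it, so a reader of the file can tell that they are indexed by the same dimension.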

3. While 1D Coordinate Variables / Dimension Scales are the common case, there are also datasets that need different kinds of coordinate systems, including multidimensional coordinate variables. I am eager for netCDF / HDF5 to support these, but I think they can be built on top of the current functionality, so we can leave them out of this discussion to keep things from getting too complicated. (For more details on those ideas, see chapter 3.1 of the java-netcdf user manual.)



From owner-netcdf-hdf@xxxxxxxxxxxxxxxx 29 2003 Sep -0600 06:57:11
Message-ID: <wrxvfrblplk.fsf@xxxxxxxxxxxxxxxxxxxxxxx>
Date: 29 Sep 2003 06:57:11 -0600
From: Ed Hartnett <ed@xxxxxxxxxxxxxxxx>
In-Reply-To: <3F760C7F.6060403@xxxxxxxxxxxxxxxx>
To: John Caron <caron@xxxxxxxxxxxxxxxx>
Subject: Re: Preliminary HDF5 Dimension documents
Organization: UCAR/Unidata
Keywords: 200309291257.h8TCvCk1025478
Cc: netcdf-hdf@xxxxxxxxxxxxxxxx
References: <200309201930.h8KJUPbe045013@xxxxxxxxxxxxxxxxxxxxxx>
        <3F760C7F.6060403@xxxxxxxxxxxxxxxx>

John Caron <caron@xxxxxxxxxxxxxxxx> writes:

2. So, generally I like your Dimension Scale proposal. The main things
we need are 1) shared Dimensions even when there's not a coordinate
variable (perhaps a Dimension Scale without the values?), 2) each

I wonder, though, why we need this in HDF.

If we have a shared dimension without coordinate values, how is that
different from an unshared dimension without coordinate values (which
is what HDF currently provides)?


Ed