Re: HDF5 dimension scales proposal

Robert E. McGrath wrote:

> On 2004.11.10 10:29 John Caron wrote:
>
>>>> 4. I think the main place where your proposal may fail to cover
>>>> the general case is that you seem to require that a dimension
>>>> scale is associated with a single dimension. But the general case
>>>> is that it can be associated with several dimensions, eg
>>>> lat(x,y). For that case, it makes more sense to associate a
>>>> dimension scale with a dataspace. But then you still have to
>>>> associate the dimensions of the data dataset with the dimensions
>>>> of the dimension scale dataset. Giving the dimensions names and
>>>> requiring their lengths to be the same would work, and would be
>>>> an implementation of shared dimensions for the case of shared
>>>> dimension scales.
>>>
>>> This is not provided in the proposed design.  Applications must
>>> implement this semantics through their own conventions, I think.
>>
>> It seems like you are missing a very important use case, namely
>> lat(x,y), lon(x,y), where "lat" and "lon" are dimension scales that
>> are each associated with the two shared dimensions "x" and "y". How
>> would you propose to handle this in HDF5 with dimension scales?
>
> This semantics is similar to coordinate systems.  This relationship
> is up to an application or profile to define, and to define a storage
> convention.
>
> I'll turn the question around:  how would you implement this without
> native library support?  I assume there would be a convention for:
>
>   * marking the dimension as being indexed by whatever x and y are
>   * attributes to associate DS with x and y, and vice versa
>   * code that understands this convention.

Yes, you're right that it's similar to coordinate systems.

We will probably implement this with an attribute convention, e.g.:

      float data(time, x, y, x);
        data:_coordinates = "lat, lon";

      float lat(x,y);
      float lon(x,y);

where x,y will be shared dimensions.

So we don't need any new storage mechanisms.
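
As a rough sketch of what "code that understands this convention" might look like, here is a small Python example. It works on plain in-memory objects and does not use any netCDF or HDF5 API. The Variable class and the coordinate_variables function are hypothetical names introduced for illustration, and data is given the dimensions (time, x, y) as an assumption. It reads the _coordinates attribute, looks up each named variable, and checks that the coordinate variable's dimensions are shared (by name) with the data variable.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    """A variable with named (shared) dimensions and text attributes."""
    name: str
    dims: tuple
    attrs: dict = field(default_factory=dict)


def coordinate_variables(var, variables):
    """Resolve var's "_coordinates" attribute to coordinate variables.

    A coordinate variable is accepted only if every one of its
    dimensions is also a dimension of var (dimensions match by name).
    """
    listed = var.attrs.get("_coordinates", "")
    names = [n.strip() for n in listed.split(",") if n.strip()]
    resolved = []
    for n in names:
        if n not in variables:
            raise KeyError(f"{var.name}: unknown coordinate variable {n!r}")
        coord = variables[n]
        missing = [d for d in coord.dims if d not in var.dims]
        if missing:
            raise ValueError(
                f"{coord.name} uses dimensions {missing} not in {var.name}"
            )
        resolved.append(coord)
    return resolved


# Assumed layout: data(time, x, y) with lat(x, y) and lon(x, y).
variables = {
    "lat": Variable("lat", ("x", "y")),
    "lon": Variable("lon", ("x", "y")),
    "data": Variable("data", ("time", "x", "y"), {"_coordinates": "lat, lon"}),
}

coords = coordinate_variables(variables["data"], variables)
print([c.name for c in coords])  # prints ['lat', 'lon']
```

Matching dimensions by name is what makes them "shared" in this sketch; a real implementation would presumably also compare dimension lengths.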

IMO, coordinates are worth standardizing in full generality, so we stop having files that can't be properly displayed by standard software. If nothing else, there should be a recommended way to do it, with API support where useful. I guess I would advocate that our "common data model" effort eventually agree on some recommendations about how to do it.



Date: 10 Nov 2004 12:21:21 -0700
From: Ed Hartnett <ed@xxxxxxxxxxxxxxxx>
To: netcdf-hdf@xxxxxxxxxxxxxxxx
Subject: Re: HDF5 dimension scales proposal
Organization: UCAR/Unidata

John,

Let's put some support for this in netcdf-4.

We need to figure out what the function API would look like.