Re: coordinate systems

John Sheldon (jps@GFDL.GOV)
Tue, 22 Jul 1997 12:33:22 -0400 (EDT)

Hi Stephen-

> I have attached a postscript plot to which the following discussion refers.
> Please let me know if you cannot view it or print it.

I was able to view it, no sweat.  And it illustrates the problem quite
well.  Is there some way to do this in ASCII so that others could see
it as part of some mail?  Hmmm, something like...

    -             * 3,4
    |                    * 2,4
  3 -           * 3,3         * 1,4
    |                 * 2,3
  2 -        * 3,2          * 1,3 <=== (j,i) (C-wise)
    |              * 2,2
  1 -     * 3,1          * 1,2
    |           * 2,1
    |                  1,1
    +-----|-----|-----*-----|--- X
          1     2     3     4


    -              * 
    |                    * 
  3 -           *              *  <=== (j,i)
    |                 * 
  2 -        *              *               j = 3
    |                                       i = 4
    |              *
  1 -     *              *
    |           * 
  0 +-----|-----|-----*-----|--- X
    0     1     2     3     4

> Take a netCDF fragment as follows:
>     dimensions:
>         i = 10;
>         j = 5;
>     variables:
>         float temperature(j,i);
>             temperature:coordinates = "x,y";
>         float x(j,i);
>         float y(j,i);
> Note that I have used our suggested "coordinates" attribute, rather than
> Jonathan's "associate" one, but this is unimportant to what follows.
> Looking at my plot, the temperature values are on a grid containing 5 rows
> and 10 columns, and are located at the black dots. Each black dot has a location
> in (i,j) space, and also a location in (x,y) space. For example, the lowest
> dot on the page is located at (i=0, j=0), or alternatively (x=5, y=2).
> What you seem to want to do is to associate x with i and y with j.
> In this case, can you actually do it? Can you associate x with either i or j?
> Can you associate y with either i or j?  I would argue that you cannot, because
> x values vary when you vary either i or j, and so do y values.

Precisely!  (Thank you!)

Now, maybe such an example seems too obvious to everyone because the
coordinate variables are named "x" and "y", from which everyone
immediately infers a Cartesian location:

     temperature (2,3) has coordinates X=x(2,3) and Y=y(2,3)

"Sure, I know where that goes!" they would say.  But if I just change
the strings to something less "familiar", what would they say? :

         a = 10;
         b = 5;
         float A(a,b);
             A:coordinates = "abc,def";
         float abc(a,b);
         float def(a,b);

I haven't changed the data or "coordinates" a bit, but there's no
"obvious" inferences this time.  This is what basic application
software would be faced with.

> Your main point appears to me to be that an application must somehow be able
> to gain enough information to be able to produce useful outputs (plots, for
> example). I totally agree with this. I would suggest that to do this, the
> netCDF fragment above should be extended to give the x and y variables
> attributes which provide this data. For example:
>         float x(j,i);
>             x:coordinate_type = "Cartesian X";
>         float y(j,i);
>             y:coordinate_type = "Cartesian Y";
> The strings "Cartesian X" and "Cartesian Y" would be special strings recognised
> by the application. However, please note that these are only suggestions on my
> part. Others might want to use different attributes, or existing ones (the units,
> or long_name, for example). I suspect that substantial effort is needed to generate
> a proposal which formalises such a mechanism. I tend to like some of the suggestions
> which identify groupings of coordinate variables into coordinate systems - we have
> ourselves been thinking about how best to do this. I also think that these are
> somewhat separate issues from the more fundamental identification of coordinate
> variables. Of course, by itself, a proposal which only allows you to identify
> coordinate variables but not interpret them is not much use.

Several useful concepts have been mentioned throughout these discussions,
and some of them are applicable here:

 1. we need to know what the "correct" (ie, intended) coordinate
      system is for the variable (eg, Cartesian, cylindrical, etc.)
      (Not that you could "force" a different interpretation, but
       one "system" ought to "natural".)

 2. to which axis does each "coordinate variable" (however they are
      identified) apply?

 3. if there are several "coordinate variables", which ones make sense 
      to use together as a "system"?

There seems to have been many objections to the use of "referential
attributes" raised to date, but I'm not sure any of them are
legitimate.  You recently (in a recent mail) also raised an objection,
namely, that such an approach could not handle the case where the
dimensionality of the space in which that data resides exceeds the
dimensionality of the array containing the data because "you can't have
more coordinate variables than you have dimensions". I may be wrong,
but I don't think the original idea of referential attributes mandates
such a restriction.  Take, for example, temperature along a particle

       point = 100 ;
       float temp(point) ;
             temp:point="lat,lon" ;
       float lat(point) ;
       float lon(point) ;

Now, that pretty clearly says that "lat" and "lon" contain some sort of
coordinate information for "temp", without the need for a new
attribute.  This is not really *sufficient*, mind you, and it's not as
clear an association as this:

       i = 10 ;
       j = 10 ;
       float temp(j,i) ;
             temp:j="lat" ;
             temp:i="lon" ;
       float lat(j,i) ;
       float lon(j,i) ;

but it does get you part of the way there.  In any case, we still need
to answers to questions 1-3 above.

We might want to exploit the approach used by the Iris Explorer
visualization package.  Each "chunk" of data is contained within a
"lattice".   The data portion of the lattice is an array described by
the number of axes and their lengths.  In the the "coordinate" section
of the lattice, one describes the dimensionality of the space in which
the data resides.  In the most common case, one might have, say, a
3-dimensional array of temperatures which represent a 3-dimensional
spatial distribution with a regular grid.  In this case, all one might
need is 3 vectors containing the coordinates along the three axes.
These are called "perimeter coordinates".

However, there is a great deal of flexibility.  If our temperature data
is 3-D (eg, temp(i,j,k)), but the grid is not regular, you can use
"curvilinear coordinates", where the X, Y, and Z coordinates can vary
from point to point.  This means you are basically storing about 4
times the amount of information, since you now have X(i,j,k), Y(i,j,k),
and Z(i,j,k).

Or, go back to the trajectory example, where your data is stored as a
1-D array, "temp(i*j*k)".  Again, you can use curvilinear coordinates
to store the locations of each point.

Now, Explorer's view is basically Cartesian, so that simplifies
things.  It is up to the user to, for instance, project the data onto a
sphere.  But the separation of the dimensionality of the data from the
dimensionality of the coordinate space is potentially a very useful

Whew, I didn't mean to go on so long about that, but maybe it will
initiate some new line of thinking.  (That's what most of these
exchanges are all about anyhow, right?)

If I have time, I'll try to come up with some mapping of these concepts
including those from GDT, Gary Granger, and questions 1-3 above, to 
some examples, including (time permitting) John Caron's set.


Geophysical Fluid Dynamics Laboratory/NOAA 
Princeton University/Forrestal Campus/Rte. 1 
P.O. Box 308 
Princeton, NJ, USA  08542