Re: coordinate systems in netcdf (again)

John Caron (caron@ucar.edu)
Wed, 11 Jun 1997 15:47:07 -0600

Russ Rew wrote:

> 
> I think you've got rectilinear coordinate systems specified clearly, but
> there may be a problem in trying to use vector spaces and linear algebra
> terminology to define coordinate systems that aren't vector spaces.
> What are the basis vectors for a coordinate system based on (lat, lon,
> height)?  They can't be (1, 0, 0), (0, 1, 0), and (0, 0, 1), in
> (radians, radians, meters) because in a vector space, every vector has a
> unique representation as a linear combination of the basis vectors, but
> (lat, lon, height) and (lat, lon+2*pi, height) represent the same
> element.

I believe that suitable modulo functions will make this into a
vector space. A similar question is whether spherical coords are indeed
a basis set for R3, which I believe is true; perhaps someone else can
answer definitively? If not, the whole description of coord systems as
basis vectors in a vector space is pretty specious.
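
A minimal sketch of the modulo idea, in Python. It only shows that reducing longitude modulo 2*pi gives (lat, lon, height) and (lat, lon+2*pi, height) the same representation; it does not settle whether the result is a vector space. The function name `canonical` and the sample values are assumptions made for illustration.

```python
import math

def canonical(lat, lon, height):
    """Return (lat, lon, height) with lon reduced to the range [0, 2*pi)."""
    return (lat, lon % (2.0 * math.pi), height)

# Two tuples that Russ points out represent the same element.
p = canonical(0.5, 1.0, 100.0)
q = canonical(0.5, 1.0 + 2.0 * math.pi, 100.0)

# After reduction the longitudes agree up to floating-point rounding.
same_point = (p[0] == q[0]) and math.isclose(p[1], q[1]) and (p[2] == q[2])
```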

> 
> I would also like to consider the possibility of a more general notion
> of coordinates, for example treating climatology data so that `month'
> could be a dimension with a corresponding coordinate variable in a
> dataset such as:
> 
>      ...
>     dimensions:
>             lat   = 19;
>             lon   = 36;
>             month = 12;
>     variables:
>             float average_temperature(month, lat, lon);
>             // coordinate variables
>             float lat(lat);
>             float lon(lon);
>             // `month' doesn't currently qualify as a coordinate variable,
>             char month(3,month) = "jan","feb","mar",...,"dec";
>      ...
> 
> Here `month' might be considered a _nominal_ coordinate variable, from a
> useful categorization of value types that Harvey Davies once pointed out:

Yes, I see that my use of vector spaces has some problems:

   	1) "Nominal" coordinates do not have addition or scalar
multiplication defined on them, so are not vectors.  Basically I am
confusing "vector" with "tuple".
      	The most general way I can think of describing coordinates is
that they assign physical meaning to indices of a dimension. It might be
that there are no algebraic operations that can be performed on them.  So
the general definition of coordinate system would simply be a set of
coordinate functions from the index set to the "physical meaning" set,
both represented as tuples. The requirement of a one-to-one mapping would
still hold.
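
As a rough illustration of coordinates as functions from index tuples to "physical meaning" tuples, here is a Python sketch. The `month` list is shortened to three entries, the `lat` values are made up, and the names `is_one_to_one` and `coordinate_system` are hypothetical.

```python
def is_one_to_one(values):
    """True if no two indices map to the same coordinate value."""
    return len(set(values)) == len(values)

# A nominal coordinate: no addition or scalar multiplication is defined.
month = ["jan", "feb", "mar"]
# A numeric coordinate (hypothetical values, in degrees north).
lat = [-10.0, 0.0, 10.0]

def coordinate_system(index):
    """Map an index tuple (i, j) to its physical-meaning tuple."""
    i, j = index
    return (month[i], lat[j])

meaning = coordinate_system((1, 2))
```

Only lookup and the one-to-one check are needed here; no arithmetic is ever performed on the month values.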

   	2) I am very interested in a special kind of coordinate system,
namely a georeferencing one that maps to physical 3D space. We know
intuitively (?!) that that system is a vector space, and also a metric
space with the usual distance norm.  So there are lots of algebraic
operations defined.

	I want to single out this coordinate system for all kinds of special
treatment. So I would propose a convention to identify these coordinate
functions, e.g.:

	        dimensions:
	           npoints = 541;
	        variables:
	           lon(npoints);
	           lat(npoints);
	           geopotential(npoints);
	                geopotential:geo_coordinates = "lon lat lev";

	This example uses the coordinate reference attribute convention.
There's no obvious exact analogue when using coordinate variables, to
distinguish the georeferencing coordinates from other coordinates.  The
COORDS conventions have you look at the units of the coordinate
variables. The CSM group proposes to use the same method, with some
additional enumerations of possible vertical coordinates.  The idea is
that if the units of the coordinate variable are "udunits convertible"
to "deg_east" or "deg_north" or "km above msl" or whatever, then you can
figure out which coord var is which.
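
The units-based identification could be sketched as below. This is not the udunits library: a fixed lookup table of unit strings stands in for a real "udunits convertible" test, and the table, the role names, and the function `classify` are all assumptions.

```python
# Stand-in for a udunits convertibility check: exact string matches only.
ROLE_BY_UNITS = {
    "deg_east": "lon",
    "deg_north": "lat",
    "km above msl": "vertical",
}

def classify(units_by_variable):
    """Map each georeferencing role to the variable whose units match it."""
    roles = {}
    for name, units in units_by_variable.items():
        role = ROLE_BY_UNITS.get(units)
        if role is not None:
            roles[role] = name
    return roles

# Hypothetical coordinate variables and their units attributes.
roles = classify({
    "x": "deg_east",
    "y": "deg_north",
    "z": "km above msl",
    "month": "",
})
```

A variable whose units match nothing in the table, like `month` here, is simply left out of the georeferencing system.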

	I myself would prefer an unambiguous convention that just lists the
coordinate functions, even if they are coordinate variables. Thus, a
global attribute
	:geo_coordinates = "lon, lat, lev";
would unambiguously state what the georeferencing coordinate system is.
A field attribute could also be used, and would override a global
attribute if one existed.
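
The override rule might look like this in Python, with plain dictionaries standing in for netCDF attributes. The function name `resolve_geo_coordinates` is hypothetical, and accepting both comma-separated and space-separated lists is an assumption made because both spellings appear above.

```python
def resolve_geo_coordinates(global_attrs, field_attrs):
    """Return the list of georeferencing coordinate names for one field.

    A field-level geo_coordinates attribute wins over the global one.
    Returns None when neither is present.
    """
    value = field_attrs.get("geo_coordinates",
                            global_attrs.get("geo_coordinates"))
    if value is None:
        return None
    # Accept "lon, lat, lev" as well as "lon lat lev".
    return value.replace(",", " ").split()

global_attrs = {"geo_coordinates": "lon, lat, lev"}
from_global = resolve_geo_coordinates(global_attrs, {})
from_field = resolve_geo_coordinates(global_attrs,
                                     {"geo_coordinates": "x y z"})
```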

	3) Time is also a coordinate that requires special treatment, and as
far as I'm concerned should just be named:
	:time_coordinate = "month";


It's interesting to note the similarity here with
http://www.unidata.ucar.edu/software/netcdf/coords/0016.html
from Richard P. Signell (rsignell@crusty.er.usgs.gov) (Tue, 20 Oct 92
13:41:38 EDT) in that he also insists on distinguishing the spatial and
time coordinates.

> As a small step toward moving closer to resolution of extending the
> netCDF conventions for coordinates, I have put together and will
> maintain a Web page linking to netcdfgroup postings relevant to this
> subject:
> 
>    http://www.unidata.ucar.edu/software/netcdf/coords/

It was very useful to have access to these past threads, thanks.

> Current candidates for convention extensions include multidimensional
> coordinate variables and referential attributes.  If neither of these
> turns out to be adequate for solving most problems of interest, I'm not
> sure we would be better off adopting both of them.  It might be better
> to just document them more clearly so that datasets can use them,
> applications can support them, and future data users have a common
> understanding of what these sorts of conventions mean and when they
> are useful.

Right now I'd say that coordinate variables satisfy the need to "keep
simple things simple" and coordinate reference attributes should be able
to describe any coordinate system (that can be described).

To elucidate that last remark, let me say something about hybrid
coordinates. In this case,
	Pressure(x,y,z) = a(z) * pref + b(z) * SurfacePressure(x,y)
Here, (x,y,z) are the indices, pref is a constant, SurfacePressure is a
2D field, and a and b are 1D fields.  I use a(z) + b(z) as a perfectly
good coordinate variable for z, which I call the hybrid coordinate. 
However, if I want the actual pressure (or equivalent height) at a point
(x,y,z), I need to calculate it using the above formula. 

The point is, until we can embed functions (methods) in our netcdf
files, we can't really represent the above formula in the way it is
written.  What we can do now, however, is to compute the field
Pressure(x,y,z) and store it in the netcdf file, and it becomes a
perfectly good coordinate function for the "altitude" coordinate of a
georeferencing coordinate system. So the cost is that we have to store a
3D field, when all the info is really available in two 1D fields (a and
b) and one 2D field (SurfacePressure).
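
For concreteness, a small Python sketch of expanding the hybrid formula into the stored 3D field, and of the storage cost. All numeric values are made up, the level count of 18 is hypothetical, and the names `hybrid_pressure` and `storage_counts` are assumptions.

```python
def hybrid_pressure(a, b, pref, surface_pressure):
    """Return p[x][y][z] = a[z] * pref + b[z] * surface_pressure[x][y]."""
    return [[[a[z] * pref + b[z] * ps for z in range(len(a))]
             for ps in row]
            for row in surface_pressure]

def storage_counts(nx, ny, nz):
    """Values stored for the full 3D field versus a, b and SurfacePressure."""
    full = nx * ny * nz
    compact = 2 * nz + nx * ny
    return full, compact

# Made-up inputs: two levels, a 1 x 2 horizontal grid.
a = [0.0, 0.5]
b = [1.0, 0.0]
pref = 1000.0
surface_pressure = [[1000.0, 900.0]]

p = hybrid_pressure(a, b, pref, surface_pressure)
full, compact = storage_counts(36, 19, 18)
```

With a 36 x 19 horizontal grid and 18 levels, the stored 3D field holds 12312 values, against 720 for a, b and SurfacePressure together.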

Which is just a long example to say that we currently have only arrays
to represent functions.