# Re: coordinate systems in netcdf (again)

John Caron (caron@ucar.edu)
Fri, 06 Jun 1997 17:26:52 -0600

```

Attached is a long attempt at defining coordinate systems in a
formalized way, along with proposals for (what else?) netcdf conventions
on coordinate variables, and generalized coordinate systems.

I'm a bit rusty at this sort of thing, so I'm hoping others might have a
look at it and give me some feedback.  Perhaps someone somewhere else
has made a formalized specification in a more succinct way.  If so,
I'd appreciate a pointer to it.

Anyway, I'm muddling around trying to capture what a coordinate system
is in a precise way, trying to make it as general as possible.  I might
be wrong on some fundamental level, and I'd appreciate understanding
that if you can explain it.  Thanks!

(I couldn't read that attachment, so I'll just resend it here again.
Sorry for the duplication.)

--------
Dimension
A _dimension_ is a named range of integers = {0,1,..size-1}. A dimension
is completely specified by the pair (name, size). You can substitute {1..size}
in what follows if you prefer 1-based indexing.

--------
Variable

A _variable_ is a function whose domain is D0 x D1 x D2 x .. x Dn = D,
where the Di are the dimensions of the variable, and n is its _rank_.
To include scalar variables of rank 0, we define D0 = {0}.
We can thus write a variable v in functional form as v = f(D) -> R,
where f denotes the function, and R is the range. We will use v as
identical to f in what follows.

In the context of netcdf files, we represent functions as scalar arrays,
and so are limited to directly representing only scalar functions; some further
convention is needed for vector functions.

------------------
Coordinate Variable

A _coordinate variable_ is a variable that assigns physical values to a dimension.
It must be a strictly monotonic function (strictly increasing or strictly decreasing),
and its domain consists of a single dimension:  CVi(Di) -> Ri, so that CVi is said to
be a coordinate variable for dimension Di.

-----------------
Coordinate System

If V is a vector space, a _coordinate system_ for V is a set of basis vectors for V,
along with units to give each coordinate physical meaning. A _coordinate_ here is a synonym
for basis vector.

Let D be a domain, D = D1 x D2 x .. Dn, and define a set of scalar _coordinate functions_
fi(D) -> Ri.  Let V be the vector space (R1, R2,.. , Rn).  Then the vector function
Fcs = (f1, f2, ..., fn) is said to be a coordinate system for D, Fcs(D) -> V, if Fcs is
invertible. Given the discrete nature of D, Fcs is invertible if it is one-to-one, meaning
Fcs maps each point in D to a unique point in V.

Given a coordinate system Fcs for domain Dc, a variable v with domain Dv, and Dc a
subset of Dv, then Fcs is a coordinate system for v. If Dc = Dv, then Fcs is a _complete_
coordinate system for v.  The value Fcs(di) = vi for a particular value di in the domain
is the _position vector_ for di, and the variable is said to be located at vi for point di,
with respect to the coordinate system Fcs.  (I think "Dc is a subset of Dv" is not quite
right; I probably want to restrict Dc = D1 x D2 x .. Dk to be equal to Dv = D1 x D2 x .. Dn,
with just some dimension Di missing).

A special case of a coordinate system is one where the coordinate functions are
coordinate variables, and so each depends on a single dimension Di.  Then
Fcs(D1 x D2 x .. x Dn) = (f1(D1), f2(D2), ... fn(Dn)), and Fcs is said to be an
_independent_ coordinate system.

---------------------------
Coordinate Transformations

A coordinate transformation is an invertible mapping M between two coordinate systems,
Fcs1 and Fcs2:
Fcs1 = M * Fcs2,  M-1 * Fcs1 = Fcs2.
Here * is functional composition, and M-1 indicates the inverse of M.

-------------------------------
Georeferencing Coordinate System

In a georeferencing coordinate system, or GCS for short, there are 3 spatial
dimensions x,y,z, which correspond as much as possible to the directions "east/west",
"north/south" and "up/down", respectively.  A GCS is therefore a function
Fgcs(D) -> (x,y,z)
where x,y,z describe the variable's position or spatial extent in each of the directions.
Note that if describing spatial extent, two values are needed for each direction, eg
x = (xleft,xright) or z = (zhigh,zlow).

============================================
Specifying Coordinate Systems in netcdf files.

We have seen that a general coordinate system is specified by a domain
D = D1 x D2 x .. Dn, a vector space V (and associated physical units for the basis
functions), and an invertible function Fcs(D) -> V.  Netcdf semantics map domains to
named dimensions, and units for coordinates are also very well done.  Variable arrays are
fine for describing single-valued functions.  All that's really missing are vector valued
functions.

Here is a proposal for a netcdf convention for specifying coordinate systems.
The goal is to
1) build from existing practices.
2) keep simple things simple
3) make it flexible enough to handle any coordinate system.

So the proposal is:

1) coordinate variables remain an elegant way to define the coordinate system when
possible.

2) allow the natural extension of coordinate variables to higher dimensions.
Formally:
"A variable with the same name as a dimension is the coordinate variable for that
dimension. If V is a variable with domain D1 x D2 .. Dn = D, let Dc be the subset
of D with coordinate variables defined. Then a coordinate system is defined on Dc
with the function
Fcs(Dc) = (cv1(D1), cv2(D2) ...)
where the cvi's are the defined coordinate variables, and the Di's are each subsets
of D. For any such Dc, Fcs must be invertible."

You notice that coordinate variables are restricted to mapping D (in index space)
to D (in physical coordinate space).  This is a Good thing, and we try hard to
define our dimensions so that we can do exactly that.

3) more generally, allow the specification of coordinate systems using attributes:

"A coordinate system can be defined by an attribute whose name starts with the
string 'coordinates' (case insensitive, optional trailing description) and whose
value is a (comma or blank delimited) list of variable names in the same file that
define the coordinate functions.  The domain Dc of the coordinate system is found
by forming the product of the set of any Di that is contained within the domains of
the coordinate functions. The coordinate system is defined by the function
Fcs(Dc) = (cv1(D1), cv2(D2) ...)
where the cvi's are the named coordinate functions"

This is meant to cover William Weibel's case of:
dimensions:
npoints = 541;
variables:
lon(npoints);
lat(npoints);
geopotential(npoints);
geopotential:coordinates = "lon lat";

and presumably any other coordinate system (?). It seems likely that the case
var(dim, dim) would have to be excluded, ie using the same dimension twice
in a variable declaration (?).

4) allow vector valued coordinates, to cover the famous (gen_time, valid_time)
from NUWG:

"A vector valued coordinate function can be specified by enclosing in
parentheses a list of variables in the same file that define each component of
the coordinate function."  Eg:
geopotential:coordinates = "lon lat (gen_time, valid_time)";

I still want to:
5) allow the specification of extents, as well as point positions for a
coordinate function.
6) clarify a number of special things about georeferencing coordinate systems

but I'm running out of gas, and I'm not totally sure this whole thing is solid.
So I'll stop and see if anyone can give me feedback one way or the other.

```
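
As a rough illustration of the definitions in the message above, here is a minimal Python sketch. It is not part of the original proposal. The function names `is_coordinate_variable`, `is_coordinate_system`, and `parse_coordinates` are hypothetical, and plain lists and callables stand in for netcdf variables (no netcdf library is used). The sketch assumes the proposed `coordinates` attribute is a blank- or comma-delimited list of names, with parentheses grouping the components of a vector-valued coordinate.

```python
import re
from itertools import product


def is_coordinate_variable(values):
    """True if a 1-D sequence is strictly increasing or strictly decreasing."""
    pairs = list(zip(values, values[1:]))
    return all(a < b for a, b in pairs) or all(a > b for a, b in pairs)


def is_coordinate_system(dim_sizes, coord_funcs):
    """Check that Fcs = (f1, ..., fn) is one-to-one on D = D1 x ... x Dn.

    dim_sizes   -- sizes of the dimensions D1..Dn (0-based index ranges)
    coord_funcs -- callables taking an index tuple and returning a value
    """
    seen = set()
    for index in product(*(range(n) for n in dim_sizes)):
        position = tuple(f(index) for f in coord_funcs)  # position vector
        if position in seen:
            return False  # two points in D map to the same point in V
        seen.add(position)
    return True


def parse_coordinates(value):
    """Split a 'coordinates' attribute value into coordinate functions.

    Each result is a tuple of variable names: a 1-tuple for a scalar
    coordinate, a longer tuple for a parenthesized vector-valued one.
    (Assumed syntax, following the examples in the message.)
    """
    result = []
    for group, name in re.findall(r"\(([^)]*)\)|([^\s,()]+)", value):
        if name:
            result.append((name,))
        else:
            result.append(tuple(re.split(r"[\s,]+", group.strip())))
    return result


# Made-up data in the shape of the lon(npoints), lat(npoints) example.
lon = [10.0, 10.0, 20.0]
lat = [0.0, 5.0, 0.0]
both = [lambda i: lon[i[0]], lambda i: lat[i[0]]]
lon_only = [lambda i: lon[i[0]]]

print(is_coordinate_system([3], both))      # (lon, lat) pairs are distinct
print(is_coordinate_system([3], lon_only))  # lon alone repeats a value
print(parse_coordinates("lon lat (gen_time, valid_time)"))
```

With this sample data, `lon` by itself is not a coordinate variable in the sense defined above (it repeats a value), yet the pair (lon, lat) is still one-to-one on `npoints`, which is the situation the attribute-based proposal is meant to handle.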