Re: Coordinate Systems

John Caron (caron@ucar.edu)
Fri, 01 Aug 1997 11:37:57 -0600

Hi Jonathan,

Thanks for your reply. Here are a few quick reactions to your email:


> Multiple values:
> 
> I feel that the various multiply valued coordinates you have referred to are of
> several distinct types. I would argue that the "representative" or "midpoint"
> value is the principal coordinate value. This one should always exist, and it
> is this which must be monotonic (or at least ordered) if it is one-dimensional.
> 
> GDT distinguish three kinds of subsidiary coordinates: (1) Boundary (section
> 21). We group the upper and lower boundaries into one variable for tidiness and
> ease of access. (2) Component (section 18). These are for cases where the
> coordinate values are tuples, such as for the hybrid pressure-sigma vertical
> coordinate. However, an ordinary principal coordinate value must still be
> provided, for ordering the axis. (3) Associate (section 19). Associate values
> are additional information, or extra ways of labelling the points, such as your
> "lev_label".
> 
> I think these are all truly different. Moreover, component and associated
> coordinate values can have boundaries, and associated coordinates can have
> components. For this reason, rather than
> 
>   :coordinates = "lon lat (lev_upper lev_lower lev_midpoint lev_label)";
> 
> I think that it would be better to specify
> 
>   :coordinates="lon lat lev";
> 
> and provide the boundary, component and associate coordinates by attaching them
> to lev, the representative or midpoint value. That makes for a simpler and
> clearer definition of the coordinate system, and it shows that the other
> information really is subsidiary to lev.

I think we are converging on a list of the various "types" (meanings)
we want to have for coordinates:
	1) "point", "representative point", or "principal value"
	2) "boundary" or "range"
	3) "label", "nominal" or "associate"

Independent of these types is the possibility that a coordinate value
is specified by a tuple, e.g. (year, month, day, daysec); in this case
we can still represent such a value along a single axis.
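
A minimal sketch of that idea, in Python (not part of either proposal):
a (year, month, day, daysec) tuple can be collapsed to one number on a
time axis by counting seconds from a reference date. The function name
and the 1970-01-01 reference date are assumptions made for
illustration, and the calendar is whatever Python's datetime module
uses (proleptic Gregorian).

```python
from datetime import date

def tuple_to_axis_value(year, month, day, daysec, ref=date(1970, 1, 1)):
    """Map a (year, month, day, daysec) tuple to seconds since `ref`.

    `ref` is an arbitrary reference date chosen for this sketch.
    """
    days = (date(year, month, day) - ref).days
    return days * 86400 + daysec

# Tuples in time order map to increasing values on a single axis.
stamps = [(1997, 8, 1, 0), (1997, 8, 1, 41877), (1997, 8, 2, 0)]
values = [tuple_to_axis_value(*s) for s in stamps]
```

The resulting values are ordered, so they could serve as the
"principal value" while the tuple components remain available as
subsidiary information.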

More difficult examples are 1) hybrid coords and 2) (generate_time,
reference_time): are these "values on a single axis"? The hybrid coords
are really parameters to a function that maps level indices to a single
axis, and (generate_time, reference_time) are, I think, really two
different axes. Of course both can be considered "labels", with the
ordering provided by the index or a "principal value", although this
doesn't capture everything we want.
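
To illustrate the "parameters to a function" view, here is a hedged
Python sketch using one common form of the hybrid pressure-sigma
coordinate, p(k) = a(k) + b(k) * ps, where ps is the surface pressure.
The formula variant, the function name and the coefficient values are
all assumptions chosen for illustration; they are not taken from either
proposal or from any particular model.

```python
def hybrid_pressure(a, b, surface_pressure):
    """Pressure at each level index k: p(k) = a(k) + b(k) * ps."""
    return [ak + bk * surface_pressure for ak, bk in zip(a, b)]

# Made-up coefficients for four levels (a in Pa, b dimensionless).
a = [5000.0, 10000.0, 5000.0, 0.0]
b = [0.0, 0.25, 0.5, 1.0]

# The same (a, b) pairs give different pressures for different columns.
p_low_ground = hybrid_pressure(a, b, 100000.0)
p_high_ground = hybrid_pressure(a, b, 80000.0)
```

The (a, b) pairs are fixed per level, but the position on the pressure
axis depends on the surface pressure of each column, which is why they
behave more like function parameters than like coordinate values.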

I see the value of requiring a "principal value"; then the embedding
into Rn is straightforward, and everything else hangs off of it.
If we removed the requirement that it be 1-D and clarified the meaning
and names of the various coordinates, it could probably be a workable
implementation.

I need to think about some of the unresolved issues. How would you
implement example 11?

> 
> Implicit coordinate system:
> 
> For the case where all the coordinate variables are one-dimensional, I am not
> convinced that it is worth the complexity of declaring the coordinate system
> explicitly. If we have a variable
> 
>   float temperature(lev,lat,lon);
> 
> I think it is fine to rely on the implicit convention. In fact there are two
> implicit coordinate systems the application could use, either plain indices for
> each of the dimensions, or the one-dimensional coordinate variables lev, lat,
> lon. The application must be *able* to handle these implicit conventions, and
> all old data will necessarily rely on them. So there is not much to be gained,
> perhaps, by explicitly declaring the coordinate system in such a case.

Yes, there's no requirement to declare the implicit coord system,
although my proposal recommends that you do so when there is more than
one.

> 
> Multidimensional coordinates:
> 
> By this I mean coordinate variables depending on more than one dimension - I do
> not regard boundaries for one-dimensional coordinates as examples of
> multidimensional coordinates. It is in this case that some kind of explicit
> declaration is needed. 

Yes, dimensionality refers to the domain, not the range. However,
explicit declaration is not really related to the dimensionality of the
coord functions. It's necessary when you aren't using coordinate
variables, and arguably useful even when you are.
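
A hedged Python sketch of how a reading application might combine the
two cases. The dictionary layout, the function name and the simple
space-separated form of the "coordinates" attribute are assumptions for
illustration only; this is not the netCDF library API, and it ignores
the parenthesized grouping discussed earlier.

```python
def resolve_coordinates(var, variables):
    """Return the coordinate names an application would use for `var`.

    An explicit "coordinates" attribute wins. Otherwise fall back to
    the implicit convention: a 1-D variable named after its dimension.
    None means "no coordinate variable, use the plain index".
    """
    declared = var["attrs"].get("coordinates")
    if declared is not None:
        return declared.split()
    names = []
    for dim in var["dims"]:
        cv = variables.get(dim)
        is_coord_var = cv is not None and cv["dims"] == (dim,)
        names.append(dim if is_coord_var else None)
    return names

# A toy in-memory description of a file (hypothetical layout).
variables = {
    "lev": {"dims": ("lev",), "attrs": {}},
    "lat": {"dims": ("lat",), "attrs": {}},
    "lon": {"dims": ("lon",), "attrs": {}},
    "lat2d": {"dims": ("y", "x"), "attrs": {}},
    "lon2d": {"dims": ("y", "x"), "attrs": {}},
    "temperature": {"dims": ("lev", "lat", "lon"), "attrs": {}},
    "rain": {"dims": ("y", "x"), "attrs": {"coordinates": "lon2d lat2d"}},
}

implicit = resolve_coordinates(variables["temperature"], variables)
explicit = resolve_coordinates(variables["rain"], variables)
```

Here "temperature" needs no declaration, while "rain" has no coordinate
variables for its y and x dimensions and would fall back to plain
indices without the explicit attribute.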

> You suggest that coordinate systems could be declared
> globally, with individual data variables able to override them. I would suggest
> that it would be more convenient for individual data variables always to have
> the declaration, for two reasons:
> 
> (1) If a variable declares a particular coordinate system, it can be assumed
> that the coordinates listed are appropriate for this variable (although you
> might want to check this). It is more work to search through all the global
> declarations to work out which ones apply to a particular variable.

True.

> 
> (2) Various coordinate systems (groups of coordinate variables) might be felt
> to be equivalent, even if they involved different coordinate variables. For
> example, it is common to use a B-grid or C-grid in climate models. In this
> case, the temperatures and velocities will have different lat and lon
> coordinate vectors. Or you might have fields of different spatial resolution in
> the same file. All the data variables in the file would have something you
> would feel to be a "latlon" system, but these systems are different in terms of
> coordinate variables.
> 
> Although I appreciate what is being suggested here, I am not clear what we gain
> from these declarations. Suppose you have declared a "latlon" and a
> "stereo_projection" coordinate system. Presumably you will still have to tell
> the application that reads the file which system you want to use. This means
> that the keywords "latlon" and "stereo_projection" will have to be be
> standardised if the files are going to be portable. The application which
> *generates* the file will have to know what a latlon coordinate system is
> i.e. it consists of a latitude and a longitude coordinate variable, so that it
> can encode the coordinate system in the file. Why is that better than
> programming the application which *uses* the file to know that if asked for a
> "latlon" system it must find latitude and longitude coordinate variables? If
> it knows this, the coordinate system does not have to be declared in the file.

In my proposal, "latlon" is not a keyword, though "geodetic" is. I
assume that a "good" application would allow the user to choose to see
the data in any of the coordinate systems that the maker of the file so
kindly specified. 

My intention is certainly to allow one or more coordinate systems to be
specified, although section one doesn't offer anything more than
generalized coordinate variables. In order to get earth referencing, for
example so the application could add a map underlay, we need more
semantics, which my section 2 on geodetic systems makes a beginning at
specifying. I am not leaning towards a keyword listing of "latlon",
"stereo-projection", etc., but rather some way of adding methods that
actually do the transformation to lat, lon and also to altitude or
pressure.

With regard to global/variable scope, my intention is to let the user
decide how to make their files as readable as possible.

Regards,
John.