Hi again- It seems that Brian and I are the sole participants in this exchange, so perhaps we can summarize this issue and get out of everyone's mailbox :-) In fact, there is, in the end, not that much difference in the proposed mechanisms for specifying what I will call "curvilinear coordinates" (a term stolen from the Iris Explorer visualization application to indicate that the position of data points is variable and so must be specified for each point). The original proposal from John Caron looked like this: > dimensions: > lat = 4 ; > lon = 3 ; > variables: > float lat(lat,lon) ; > float lon(lat,lon) ; > float mydata(lat,lon) ; This is actually a fairly elegant way to specify a 4x3 grid of lat-lon locations for the gridpoints, assuming that the downstream application is smart enough to figure out that the 3 lon's at each lat (and the 4 lat's at each lon) for the 4x3 variable "mydata" are given in the variables "lat" and "lon". The alternate approach, using referential attributes, looks like this (with a alight correction to the earlier specification in order to make the shape the same): > dimensions: > nj = 4 ; > ni = 3 ; > variables: > float mydata(nj,ni) ; > mydata:nj = "latarr"; > mydata:ni = "lonarr"; > float latarr(ni,nj) ; > float lonarr(ni,nj) ; In both cases, an application first sees that "mydata" is lat x lon, or 4x3. Then, in the first case, it takes the name of "mydata"'s dims and looks for variables by those names to get the coordinates and, upon finding that they are 2D, must infer how the lat's and lon's vary from point-to-point. In the second case, the application takes the name of "mydata"'s dims and looks for *attributes* by those names from which to obtain the names of the variables from which to get the coordinates. Upon finding that they are 2D, it must also infer how the lat's and lon's vary from point-to-point. In either case, the handling of coordinates by downstream applications must be made smarter *in exactly the same way*. However, in the first case, they also have to rewritten to undo existing assumptions about "coordinate variables", or they will choke. Despite the extra level of indirection in the second approach, I would simply feel more comfortable limiting the the changes to an enhancement the application's capabilities, without the need to also "fix" them where they are not heretofore broken. Going back to the very original question from Gary Strand (back in March!), I'd first ask whether you need to use any existing software for handling/visualizing netCDF data. If so, I would recommend the referential attributes approach, or much of that software won't work on your data. If not, then the first approach is fine. But in either case, I'm not aware of any existing software that will be able to actually exploit the "curvilinear coordinates" you are describing. (An upcoming release of FERRET *will* provide a mechanism whereby such a connection can be made, though it will still choke on the 2D coordinate variables in the first method.) I hope this issue is pretty well exhausted by now. However, it appears that there may still be a bit of confusion over how "referential attributes" work, which (it seems to me) is common whenever indirection is involved. Because I believe that referential attributes offer so much potential for enhancing the value of metadata, let me address a couple of these issues in the hope that others might also appreciate their value... > > The point here was that referential attributes provide a *mechanism* > > for associating variables. > > My point was that u, p and vpt are already associated by virtue of having > identical dimension lists. The referential attribute provides no new > information in example 1. > > I agree that referential attributes provide a mechanism for associating > variables, but they do not describe the relationships between variables so > associated. In the example from the Unidata web page: float u(record, z, x, y); u:z = "vpt, p"; float p(record, z, x, y); float vpt(record, z, x, y); the fact that "u", "p", and "vpt" are defined on the same grid doesn't tell you anything more than that. But the referential attributes do, indeed, provide additional information! They tell you that coordinates for points along the z-dimension for variable "u" can be found in variables "vpt" or "p". Suppose there were also a variable float wspeed(record, z, x, y); The referential attributes assign special significance to the variables "vpt" and "p" that is not true of "wspeed". Thus, there *is* additional information being provided! > > Your objection to the lack of an actual "z" coordinate is > > solved by specifying the attribute as 'u:z = "vpt, p, alt" ', where a > > variable "alt" could be provided to document the actual altitude of the > > points. The variable "alt" would carry along its own metadata > > documenting how it was calculated (eg, hypsometric equation, virtual > > temperature correction, etc...). > > In this example I believe that the existing convention for coordinate > variables is the method that should be used to define z. You are missing the very need for making reference to a "z" variable. With the exception of z-coordinate models, the altitude of a gridpoint varies in both space and time. That is, there is no pre-defined z-coordinate for each layer of the model; you can't, a priori, specify a vector of heights (eg, 1km, 2km, 3km,...) because the altitude of a gridpoint is different at every lat-lon and timestep. (Think of a sigma coordinate model, where the altitude of a point depends on the terrain height, surface pressure, and temperature and humidity profiles.) > > In all cases, one still has to know how to > > interpret the referenced variable (a bit complicated, but one could, > > conceivably, infer a relationship base on the shape of the referenced > > variable). There are no established conventions here - we're just > > suggesting some alternatives that fit within the conventions that have > > already been established WRT netCDF. > > Exactly. And therein lies the problem. The conventions on these > relationships that would work for describing 2D coordinate variables won't > be the same as the one that would be required to describe what is meant by > e.g., 'u:z = "vpt, p, alt"' I disagree - the bulk of the new capabilities required of applications is exactly the same, The only difference is how they identify the variable(s) from which to obtain the coordinates...but I've already discussed this above... > > I see where you're going, but this is one solution that does *not* fit > > within the conventions that have already been established, because > > there is a long-standing convention that variables with the same name > > as a dimension are "coordinate variables", and should be one- > > dimensional vectors containing the coordinates of the points along that > > dimension. > > The fact that netCDF's current convention for coordinate variables only > applies to 1D variables makes it easy for an application to recognize that > e.g., lon(lon,lat) is not a "coordinate variable" in the sense of the > current convention. You definitely have a point here. Unfortunately, I think the convention about variables with the same name as a dimension is so well established that many applications *expect* that such a variable will be 1-D, and will choke if it turns out to be otherwise. (I know this is the case for FERRET.) > > This is not to say that you couldn't write your own > > application to use such data - all the information you need is, indeed, > > present. But you will have some difficulty getting any existing > > software to read it. > > Any application that is looking for coordinate variables expects the > corresponding data to be on a rectilinear grid. If I need 2D coordinates > to describe the grid then it is not rectilinear and a package designed > to plot rectilinear data shouldn't be expected to work. That's true, but everything we've been talking about represents an extension to existing capabilities. Applications would need to be upgraded to handle the curvilinear coordinates, or they will simply fall back to treating the coordinates as "1", "2", etc. The problem is that, if you started using the proposed approach, they will probably not even get that far. > OK. Here is the unabridged version of Example 6. > > dimensions: > lat = 4 ; > lon = 3 ; > variables: > float lat(lat,lon) ; > lat:long_name = "Latitude" ; > lat:units = "degrees_north" ; > float lon(lat,lon) ; > lon:long_name = "Longitude" ; > lon:units = "degrees_east" ; > float mydata(lat,lon) ; > mydata:long_name = "data description" ; > mydata:units = "whatever" ; > > We use the units attributes of lat and lon to tell us that we are working > on a lat-lon grid (as in COARDS). That lat and lon are 2D arrays tells us > that the grid is not rectilinear. > > By analogy to the current convention for coordinate variables example 6 > would be interpreted as, e.g., the lat-lon coordinates of mydata(1,1) > are lat(1,1) and lon(1,1). As I said, this is a perfectly logical approach, even elegant. The only problems are (a) getting applications to interpret it (which is also true of the referential attributes approach), and (b) existing applications are likely to choke on the "2D coordinate variables". I'd still be interested in hearing from others about this...???? Cheers- John P. Sheldon (jps@gfdl.gov) Geophysical Fluid Dynamics Laboratory/NOAA Princeton University/Forrestal Campus/Rte. 1 P.O. Box 308 Princeton, NJ, USA 08542 (609) 987-5053 office (609) 987-5063 fax