
17 October 2008

I'm gathering examples of coordinate mapping functions.

Given a data variable V, with dimensions {dataDim_i} i = 0,n-1

V has a coordinate system, which is a set of coordinate variables = {CV_j} j=0,m-1

A coordinate mapping function CMF : {dataDim_i} → {coordValue_j} is a map from the index space of V to a vector of coordinate values, one for each coordinate variable in the coordinate system.

The common case is that each coordinate variable has as dimensions a subset of the dimensions of V. Then all you have to do is plug the data index values into the corresponding dimensions of the coordinate variables. Note that a scalar coordinate variable always returns its scalar value.
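To make the common case concrete, here is a minimal Python sketch. The variable names, the dimension names (time, y, x), and all values are made up for illustration; only the lookup rule comes from the description above.

```python
# Hypothetical coordinate system for a data variable V(time, y, x).
# Each entry pairs a coordinate variable's dimension names with its values.
coord_vars = {
    "time":   (("time",), [0.0, 6.0]),
    "lat":    (("y", "x"), [[10.0, 10.5, 11.0], [20.0, 20.5, 21.0]]),
    "lon":    (("y", "x"), [[-100.0, -99.0, -98.0], [-100.5, -99.5, -98.5]]),
    "height": ((), 2.0),  # scalar coordinate variable: no dimensions
}

def coordinate_mapping(data_index):
    """Map a data index, given as {dimension name: index}, to coordinate values."""
    result = {}
    for name, (dims, values) in coord_vars.items():
        value = values
        # Index only along the dimensions this coordinate variable uses.
        for dim in dims:
            value = value[data_index[dim]]
        # A scalar coordinate has no dimensions, so its value passes through.
        result[name] = value
    return result

example = coordinate_mapping({"time": 1, "y": 0, "x": 2})
```

Here lat and lon use the subset (y, x), time uses (time), and height ignores the index entirely.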

1. Compression By Gathering

CF compression by gathering can be thought of as a coordinate mapping function that maps the compressed index value to 2 or more coordinate indexes, using a gatherVariable and the usual stride arithmetic. For example, for a 2D compression = (dim1, dim0):

  fullIndex = gatherVariable(compressedIndex)
  coord1 = fullIndex / dim0.length
  coord0 = fullIndex % dim0.length

where / is integer division and % is the modulo function. The gatherVariable is the rgrid variable of the CF examples. One can generalize to multiple dimensions, although you are limited in the current spec to a 32-bit compressedIndex value.
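Here is a small Python sketch of the stride arithmetic, generalized to any number of dimensions. It assumes the full index is laid out in C (row-major) order, so the last dimension listed varies fastest and the 2D divisor is the length of the inner dimension. The function names and the example grid are hypothetical.

```python
def unravel(full_index, shape):
    """Split a flat row-major index into one index per dimension.

    shape lists the dimension lengths from outermost to innermost,
    e.g. (dim1_length, dim0_length) for the 2D case.
    """
    coords = []
    for length in reversed(shape):
        # Peel off the fastest-varying dimension first.
        full_index, coord = divmod(full_index, length)
        coords.append(coord)
    return tuple(reversed(coords))

def gather_to_coords(compressed_index, gather_variable, shape):
    """Coordinate indexes for one element of the compressed dimension."""
    return unravel(gather_variable[compressed_index], shape)

# Assumed example: a 3 x 4 grid where only the points
# (0, 1), (1, 2) and (2, 3) were kept by the compression.
gather_variable = [1, 6, 11]
shape = (3, 4)
kept_points = [gather_to_coords(k, gather_variable, shape)
               for k in range(len(gather_variable))]
```

The same unravel function covers the higher-dimensional case by passing a longer shape tuple.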

2. Ragged Array

A Ragged Array mapping can also be thought of as a coordinate mapping function that maps the ragged index value to 2 coordinate indexes, using a startRow or numRow variable to record the size of each row. For example, for a 2D ragged array = (dim1, dim0):

  find i such that startRow(i) <= raggedIndex < startRow(i+1)
  coord1 = i
  coord0 = raggedIndex - startRow(i)

I haven't yet worked out how (or if) to generalize to more than 2 dimensions. Note that this is similar to the contiguous list proposal, which stored both the start and number of each row.
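The "find i" step is a search over a sorted list, so a binary search does the job. The Python sketch below is one way to write it, assuming startRow is in ascending order. The helper that derives startRow from numRow by a running sum is my own addition, and the function names are hypothetical.

```python
from bisect import bisect_right

def starts_from_counts(num_row):
    """Build startRow from numRow with a running sum."""
    starts, total = [], 0
    for count in num_row:
        starts.append(total)
        total += count
    return starts

def ragged_to_coords(ragged_index, start_row):
    """Return (coord1, coord0) for a position in the flattened ragged array."""
    if ragged_index < 0:
        raise IndexError("raggedIndex must be non-negative")
    # Last row whose start is <= ragged_index. Using bisect_right means
    # empty rows, which share a start value with the next row, are skipped.
    i = bisect_right(start_row, ragged_index) - 1
    return i, ragged_index - start_row[i]

# Assumed example: three rows holding 3, 1 and 4 elements.
start_row = starts_from_counts([3, 1, 4])
```

This sketch does not check the upper bound. Doing so needs the total element count, which startRow alone does not record for the last row.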

3. Index Join

An Index Join is a new concept. A table is a collection of variables with the same outer dimension. An index join connects two tables using a variable in one table that holds dimension indices into the second table. When one of the tables holds coordinate variables for data in the other table, this join becomes the basis for a coordinate mapping function as follows. Suppose the data variable lives in table A and the coordinate lives in table B. Then:

  indexB = joinVariable(indexA)
  coordVal = coordVariable(indexB)

where the joinVariable is in Table A and holds indices into Table B. If the joinVariable is in Table B and holds indices into Table A, you have the harder problem of inverting the mapping; that is, one must search through the joinVariable:

  find indexB such that joinVariable(indexB) = indexA
  coordVal = coordVariable(indexB)

For a given indexA there may be no such indexB, or more than one. This is used for nested tables, with the joinVariable in the child table, pointing to the parent table. It was previously called the parent index method.
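Both directions of the join can be sketched in a few lines of Python. The function names and sample tables below are hypothetical. The inverse direction returns a list because it may find zero, one, or several matches.

```python
def forward_join(index_a, join_variable, coord_variable):
    """joinVariable lives in table A and holds indices into table B."""
    index_b = join_variable[index_a]
    return coord_variable[index_b]

def inverse_join(index_a, join_variable, coord_variable):
    """joinVariable lives in table B and holds indices into table A.

    Searches the whole joinVariable for rows that point at index_a.
    """
    return [coord_variable[index_b]
            for index_b, target in enumerate(join_variable)
            if target == index_a]

# Forward case (assumed data): each observation in table A names its
# station in table B, and the station table carries the latitude.
station_lat = [40.0, 50.0]
obs_station_index = [0, 1, 1, 0]
lat_of_obs_2 = forward_join(2, obs_station_index, station_lat)

# Inverse case (assumed data): each child row in table B names its
# parent row in table A, and the child table carries the coordinate.
child_parent_index = [0, 0, 2]
child_depth = [5.0, 10.0, 7.5]
depths_for_parent_0 = inverse_join(0, child_parent_index, child_depth)
depths_for_parent_1 = inverse_join(1, child_parent_index, child_depth)
```

The linear search costs one pass over the joinVariable per lookup. If many lookups are needed, building a dictionary from parent index to child indices once would avoid the repeated scans.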

4. HDF-EOS Index Maps

HDF calls these dimension maps; see section 5.1.4 of the HDF-EOS User Guide, Volume 1. The idea is to use fewer geo-location points than data points, and interpolate between them.
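As a rough Python sketch of the idea, suppose the map is described by an offset and an increment, so that data index offset + k * increment lines up with geo-location point k. That parameterization is my reading of the User Guide, so check it there. The linear interpolation and the clamping at the ends are my own assumptions, and the function name is hypothetical.

```python
def geo_value_at(data_index, geo_values, offset, increment):
    """Interpolate a geo-location value for a position on the data dimension."""
    last = len(geo_values) - 1
    # Fractional position along the geo-location dimension.
    pos = (data_index - offset) / increment
    # Assumption: clamp to the end points rather than extrapolate.
    pos = min(max(pos, 0.0), float(last))
    lo = int(pos)
    hi = min(lo + 1, last)
    frac = pos - lo
    return geo_values[lo] + frac * (geo_values[hi] - geo_values[lo])

# Assumed example: three geo-location points covering five data points.
geo_lat = [10.0, 20.0, 30.0]
lats = [geo_value_at(j, geo_lat, offset=0, increment=2) for j in range(5)]
```

In coordinate mapping terms, the data index maps to a fractional coordinate index rather than an integer one, and the coordinate value comes from interpolating the coordinate variable at that position.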

If you have more examples that don't fit these, send 'em my way.
