
CF Section 8.2, Compression by Gathering, has this example (with my mods to make it more readable):

`dimensions:`
`  lat=73;`
`  lon=96;`
`  cindex=2381;`
`  depth=4;`
`variables:`
`  float data(depth, cindex);`
`  int cindex(cindex);`
`    cindex:compress="lat lon";`
`  float depth(depth);`
`  float lat(lat);`
`  float lon(lon);`
`data:`
`  cindex=363, 364, 365, ...;`

"Since cindex(0)=363, for instance, we know that data(*,0) maps on to point 363 of the original data with dimensions (lat,lon). This corresponds to indices (3,75), i.e., 363 = 3*96 + 75."

Let's call cindex a compression dimension; it is identified by the compress attribute, which lists the dimensions that were gathered. Equivalently, we could store the indices separately, eg use cindex(cindex,2), where 0 = lat index and 1 = lon index, instead of using stride arithmetic (363 = 3*96 + 75).
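To make the stride arithmetic concrete, here is a minimal Python sketch that converts between the flat index and the (lat index, lon index) pair. The helper names `unravel` and `ravel` are my own, not anything defined by CF; the grid sizes and the first three index values come from the example above.

```python
# Grid shape from the CDL example: lat = 73, lon = 96.
N_LAT, N_LON = 73, 96

def unravel(flat_index, n_lon=N_LON):
    """Split a flat (row-major) grid index into (lat_index, lon_index)."""
    return divmod(flat_index, n_lon)

def ravel(lat_index, lon_index, n_lon=N_LON):
    """Combine (lat_index, lon_index) back into the flat index."""
    return lat_index * n_lon + lon_index

# First three values of the compression variable, as listed in the example.
cindex = [363, 364, 365]

# Equivalent "index pair" form, i.e. the cindex(cindex, 2) layout.
cindex_pairs = [unravel(k) for k in cindex]
print(cindex_pairs)  # [(3, 75), (3, 76), (3, 77)]
```

Both forms carry the same information; the flat form stores one integer per point, the pair form stores two but needs no knowledge of the lon dimension length to decode.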

Note that the lat, lon coordinates are orthogonal so this is a 2D regular (rectified) grid. The compression just saves storage space.

It is probably worth adding the coordinates attribute to the data variables for clarity and completeness, eg:

`  float data(depth, cindex);`
`    data:coordinates = "lon lat" ;`

A compression dimension logically expands to its list of compressed dimensions, eg:

`  float data(depth, lat, lon);`

with missing values at the places not stored. Thus, we keep the rule "coordinate dims must be subset of data dims".
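The logical expansion can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `expand`, the tiny 3 x 4 grid, and the data values are all made up, and NaN stands in for the missing value. It handles a single depth level; the same scatter would be repeated for each level.

```python
import math

def expand(compressed, cindex, n_lat, n_lon):
    """Scatter compressed values into a full n_lat x n_lon grid.

    Grid cells that are not listed in cindex are left as NaN (missing).
    """
    grid = [[math.nan] * n_lon for _ in range(n_lat)]
    for value, flat_index in zip(compressed, cindex):
        j, i = divmod(flat_index, n_lon)  # same stride arithmetic as above
        grid[j][i] = value
    return grid

# Hypothetical toy data: 4 stored points on a 3 x 4 grid.
cindex = [1, 5, 6, 11]
data = [10.0, 20.0, 30.0, 40.0]

grid = expand(data, cindex, n_lat=3, n_lon=4)
for row in grid:
    print(row)
```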

However, CF Section 5.3, Reduced Horizontal Grid, has a different example of Compression by Gathering:

"A "reduced" longitude-latitude grid is one in which the points are arranged along constant latitude lines with the number of points on a latitude line decreasing toward the poles. Storing this type of gridded data in two-dimensional arrays wastes space, and results in the presence of missing values in the 2D coordinate variables."

`dimensions:`
`  londim = 128 ;`
`  latdim = 64 ;`
`  cindex = 6144 ;`
`variables:`
`  float data(cindex) ;`
`    data:coordinates = "lon lat" ;`
`  int cindex(cindex);`
`    cindex:compress = "latdim londim";`
`  float lon(cindex) ;`
`  float lat(cindex) ;`

"data(n) is associated with the coordinate values lon(n), lat(n). Compressed grid index (n) would be assigned to 2D index (j,i) (C index conventions) where j = cindex(n) / 128 and i = cindex(n) - 128*j."

If we do logical expansion of the compression dimension:

`  float data(latdim, londim) ;`
`    data:coordinates = "lon lat" ;`
`  int cindex(cindex);`
`    cindex:compress = "latdim londim";`
`  float lon(latdim, londim) ;`
`  float lat(latdim, londim) ;`

So CF is trying to deal with 2D lat lon coordinates here. Actually, I think the common case for reduced grids is 1D latitude coordinates and variable-length 2D longitude coordinates (eg reduced Gaussian grids). So probably the example should be

`dimensions:`
`  londim = 128 ;`
`  latdim = 64 ;`
`  cindex = 6144 ;`
`variables:`
`  float data(cindex) ;`
`    data:coordinates = "lon lat" ;`
`  int cindex(cindex);`
`    cindex:compress = "latdim londim";`
`  float lon(cindex) ;`
`  float lat(latdim) ;`

which is logically expanded to

`  float data(latdim, londim) ;`
`    data:coordinates = "lon lat" ;`
`  int cindex(cindex);`
`    cindex:compress = "latdim londim";`
`  float lon(latdim, londim) ;`
`  float lat(latdim) ;`

The main problem with this is that we've really got a "ragged array", not a 2D rectified grid with missing data.
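To show what I mean by ragged, here is a small Python sketch that looks at the same set of points two ways: as rows of varying length stored back to back, and as gathered indices into a padded (latdim, londim) grid. Everything here is hypothetical: the row counts, the tiny grid size, the helper name `compressed_position`, and the assumption that each row's points occupy the first slots of the padded row.

```python
from itertools import accumulate

# Hypothetical reduced grid: 4 latitude rows, at most 8 longitudes per row.
LONDIM = 8
row_counts = [4, 8, 8, 4]  # number of longitudes on each latitude row

# Ragged view: rows are stored back to back; row_start[j] is where row j begins.
row_start = [0] + list(accumulate(row_counts))[:-1]

def compressed_position(j, i):
    """Position n along the compressed dimension of point i on latitude row j."""
    if not 0 <= i < row_counts[j]:
        raise IndexError(f"row {j} has only {row_counts[j]} points")
    return row_start[j] + i

# Gathered view: the same points as flat indices into a padded
# (latdim, londim) grid, assuming each row fills its first slots.
cindex = [j * LONDIM + i
          for j, count in enumerate(row_counts)
          for i in range(count)]

padded_size = len(row_counts) * LONDIM
print(len(cindex), "stored points,", padded_size - len(cindex), "padding cells")
```

In the ragged view the per-row counts are the natural description of the grid; in the gathered view they are only implicit in the gaps between index values.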

Since this blog post is getting long, let me continue in part 2.

##### Unidata Developer's Blog
A weblog about software development by Unidata developers*