Re: HDF5 bitfields...

Quincey,

John Caron wrote:
> the scale/offset can be calculated easily from the data itself. often, 
> people want to apply different scale/offset to different parts of the 
> same array, eg vertical levels.

and you replied:
>     Hmm, how would you parameterize this?  Would a user select various parts
> of the dataset's dataspace and specify scale/offset information for them?

When Harvey Davies was here from Australia for a visit about 8 years
ago, we worked out two kinds of scaling for varying packing parameters
along one or more dimensions of a variable: predefined scaling and
adaptive scaling.  

With predefined scaling, the scale and offset values associated with a
packed variable would be stored in auxiliary arrays dimensioned by just
the subset of the variable's dimensions along which the packing
parameters vary.  For example, to store a packed array of
temperatures, one might use

  dimensions:
    time = ...
    lat = ...
    lon = ...
    level = ...
  variables:
    byte temperature(time, level, lon, lat);
    double temperature_scale_factor(level);
    double temperature_add_offset(level);

which would use a possibly different (scale_factor, add_offset) pair
for packing temperatures on each atmospheric level.  This would allow
for greater precision using the same number of bits (or fewer bits for
the same precision) than using one packing parameter pair for all the
data, because temperature depends strongly on level, so the range of
values within one level is narrower than the range over the whole
array.  It wouldn't work so well with other variables that don't have
a level-dependence.
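
As a minimal sketch of the predefined case (this is not the interface
we defined, which was never implemented), the Python below packs a
tiny temperature array one level at a time.  It assumes the usual
netCDF convention, unpacked = packed * scale_factor + add_offset; the
helper names pack and unpack and all of the sample numbers are
hypothetical.

```python
# Sketch of predefined per-level packing into signed 8-bit values.
# Assumed convention: unpacked = packed * scale_factor + add_offset.
# Helper names and sample numbers are hypothetical.

BYTE_MIN, BYTE_MAX = -128, 127

def pack(value, scale_factor, add_offset):
    """Quantize one value to a signed byte, clamping to the valid range."""
    q = round((value - add_offset) / scale_factor)
    return max(BYTE_MIN, min(BYTE_MAX, q))

def unpack(packed_value, scale_factor, add_offset):
    """Invert pack(), up to quantization error."""
    return packed_value * scale_factor + add_offset

# One (scale_factor, add_offset) pair per level, as in the CDL above.
temperature_scale_factor = [0.25, 0.25]
temperature_add_offset = [280.0, 220.0]

# temperature[level][point]: a tiny stand-in for one time step.
temperature = [
    [270.0, 285.5, 300.25],   # lower level
    [200.0, 221.75, 240.0],   # upper level
]

# Pack each level with that level's own parameter pair.
packed = [
    [pack(v, temperature_scale_factor[k], temperature_add_offset[k])
     for v in row]
    for k, row in enumerate(temperature)
]

# Unpack with the same per-level pairs.
restored = [
    [unpack(q, temperature_scale_factor[k], temperature_add_offset[k])
     for q in row]
    for k, row in enumerate(packed)
]
```

With these sample numbers, each level's values fit inside the 63.75
degree span that one byte covers at a 0.25 degree step.  A single pair
for the whole array would have to cover 200 to 300.25, which forces a
step of roughly 0.39 degrees in the same 8 bits.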

With adaptive scaling, the optimum scale and offset values were to be
computed by the library for each slab of the variable as it was
written, and stored in automatically-generated associated variables
(or multidimensional attributes).
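
A rough Python sketch of the adaptive idea follows.  It is only an
illustration under stated assumptions: the formula for choosing the
parameters from a slab's minimum and maximum is the common netCDF
packing recipe rather than anything we specified, a "slab" is taken to
be one list of values written at a time, and the function names
optimal_packing and pack_adaptive are hypothetical.

```python
# Sketch of adaptive packing: derive (scale, offset) from each slab's
# own range as it is "written", and keep the pairs alongside the data.
# The min/max recipe and all names here are assumptions.

def optimal_packing(slab, nbits=8):
    """Return (scale, offset) mapping [min, max] onto signed nbits ints."""
    lo, hi = min(slab), max(slab)
    if hi == lo:
        # Constant slab: any nonzero scale works; packed values are 0.
        return 1.0, lo
    scale = (hi - lo) / (2 ** nbits - 1)
    offset = lo + 2 ** (nbits - 1) * scale
    return scale, offset

def pack_adaptive(slabs, nbits=8):
    """Pack each slab with its own computed parameters."""
    qmin, qmax = -(2 ** (nbits - 1)), 2 ** (nbits - 1) - 1
    scales, offsets, packed = [], [], []
    for slab in slabs:
        scale, offset = optimal_packing(slab, nbits)
        scales.append(scale)
        offsets.append(offset)
        # Clamp to guard against floating-point overshoot at the ends.
        packed.append([
            max(qmin, min(qmax, round((v - offset) / scale)))
            for v in slab
        ])
    return scales, offsets, packed

slabs = [
    [270.0, 285.5, 300.25],
    [200.0, 221.75, 240.0],
    [5.0, 5.0, 5.0],          # constant slab
]
scales, offsets, packed = pack_adaptive(slabs)

# Unpack with the stored per-slab parameters.
restored = [
    [q * s + o for q in row]
    for row, s, o in zip(packed, scales, offsets)
]
```

The scales and offsets lists here play the role of the
automatically-generated associated variables: one entry per slab,
needed again at read time to unpack.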

Although we defined interfaces for these types of scaling, they were
never implemented.  Implementing adaptive scaling seemed pretty
ambitious, and even the predefined scaling would have required
adoption of new conventions for naming associated variables, etc.  In
the end, the proposals foundered on an inability to agree on all the
gory details, such as whether to permit the types of the scaling
parameters to be user-specifiable in adaptive scaling.

--Russ