Re: missing data and scale/offset standard attributes

Hi Jonathon, Brian:

Thanks for reminding me that CF clarified this issue. The first two items of my
proposal match what CF recommends.

The problem is that some important datasets express valid_range values in
the units of the unpacked data. What should we do with datasets that did
not follow the advice that "the type of valid_range should match the type of its
variable"? Because the interaction with scale/offset was not explicitly mentioned,
one could argue that a variable has two types, the packed and the unpacked.

In any case, we have some "historical uses" that the nj22 library currently handles, more
or less as I proposed. So I'm floating a trial balloon (bomb?) to see what others think. At the
least, I would like the NUG and "best practices" documentation to be clarified.

From the CF point of view, if the proposal were accepted, nothing would need to change,
except perhaps to _require_ that the type of valid_range match the (packed) type
of its variable, with noncompliant files running the risk of being misinterpreted.


Brian Eaton wrote:
Hi John,

I believe your items 1. and 2. match the interpretation of the User's
Guide made in the CF conventions.  The reason these restrictions are
important is so that an application can identify missing values *before*
using scale/offset attributes to transform values.  Attempting to transform
missing values could potentially result in an invalid operation.
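
As a rough illustration of checking for missing values before applying scale/offset, here is a minimal Python sketch. The function name `unpack` and its parameters are hypothetical and do not come from any netCDF library; it assumes the fill value is given in the units of the packed data, as in items 1 and 2.

```python
def unpack(packed_values, scale_factor=1.0, add_offset=0.0, fill_value=None):
    """Unpack values, testing for the fill value BEFORE any arithmetic.

    Hypothetical helper for illustration only. Missing entries are
    returned as None and are never passed through scale/offset.
    """
    result = []
    for p in packed_values:
        if fill_value is not None and p == fill_value:
            # Compare in packed units; do not transform a missing value.
            result.append(None)
        else:
            result.append(p * scale_factor + add_offset)
    return result


# Packed shorts with -32768 used as the fill value (an assumed choice).
unpacked = unpack([0, 10, -32768], scale_factor=0.5, add_offset=100.0,
                  fill_value=-32768)
```

Because the comparison is made on the packed value, the fill value is matched exactly and no floating-point arithmetic is ever applied to it.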

I'm not sure what historical uses you're referring to in item 3.  Version 2.3
of the User's Guide says the type of valid_range should match the type of
its variable.  In version 3 of the User's Guide the valid_range is only
allowed to be "wider" than the variable type when that type is byte.  That
was to allow the signedness of the bytes to be identified using valid_range
since the signedness attribute was deprecated.
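
To make the byte case concrete, here is a small Python sketch of how a reader might use valid_range to decide signedness. The function name `interpret_byte` and the test against 127 are assumptions made for illustration; this is not the behavior of any particular library.

```python
def interpret_byte(raw, valid_range=None):
    """Interpret a stored byte, given as a signed value in -128..127.

    Hypothetical helper: if valid_range reaches beyond the signed byte
    maximum (127), assume the bytes were meant as unsigned (0..255).
    """
    if valid_range is not None and max(valid_range) > 127:
        # Reinterpret the same 8 bits as an unsigned value.
        return raw & 0xFF
    return raw
```

For example, a stored value of -1 with a valid_range of (0, 255) would be read as 255, while the same value with a valid_range of (-128, 127) stays -1.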

The primary purpose of valid_range as described in the current version of
the User's Guide is to facilitate identifying missing values.  Allowing
valid_range to be wider than its variable (except in the case where it's
used to indicate the signedness of bytes) requires the application to
transform data before looking for missing values.  I can't see any benefit
to doing this.

Brian



On Tue, Sep 12, 2006 at 01:07:29PM -0600, John Caron wrote:

The User's Guide is vague on the interaction between missing data and the scale/offset standard attributes. I'd like to propose adding the following:

1. The _FillValue and missing_value standard attribute values must be in the units of the packed data.

2. The valid_range, valid_max and valid_min standard attribute values should be in the units of the packed data.

3. To accommodate historical uses, if the valid_range, valid_max or valid_min values are wider than the packed data type, then these will be interpreted as being in the units of the unpacked data. Wider means: (byte < short < int < float < double). Otherwise, they will be interpreted as being in the units of the packed data.
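
The three items above can be sketched in Python roughly as follows. This is only an illustration of the proposed rule, not the nj22 implementation; the names `TYPE_RANK`, `valid_range_units` and `is_valid` are hypothetical, and types are passed as plain strings for simplicity.

```python
# Width ordering from item 3: byte < short < int < float < double.
TYPE_RANK = {"byte": 0, "short": 1, "int": 2, "float": 3, "double": 4}


def valid_range_units(packed_type, attr_type):
    """Return "unpacked" if the attribute type is wider than the packed
    variable type, otherwise "packed" (hypothetical helper)."""
    if TYPE_RANK[attr_type] > TYPE_RANK[packed_type]:
        return "unpacked"
    return "packed"


def is_valid(packed_value, valid_min, valid_max, packed_type, attr_type,
             scale_factor=1.0, add_offset=0.0):
    """Test one packed value against valid_min/valid_max under the
    proposed interpretation."""
    if valid_range_units(packed_type, attr_type) == "unpacked":
        # Historical use: compare in the units of the unpacked data.
        value = packed_value * scale_factor + add_offset
    else:
        # Recommended use: compare directly in packed units.
        value = packed_value
    return valid_min <= value <= valid_max
```

With a short variable and scale_factor 0.1, a float valid_range of (0.0, 40.0) would be compared against the unpacked value, whereas a short valid_range of (0, 50) would be compared against the packed value itself.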


Comments solicited.

==============================================================================
To unsubscribe netcdfgroup, visit:
http://www.unidata.ucar.edu/mailing-list-delete-form.html
==============================================================================
