missing data and scale/offset standard attributes

John Caron caron at unidata.ucar.edu
Tue Sep 12 17:18:37 MDT 2006


Hi Jonathon, Brian:

Thanks for reminding me that CF clarified this issue. My proposal corresponds with what CF recommends, at least the first 2 bullets.

The problem is that some important datasets put valid_range values in the units of the unpacked data. So the question is: what do you do with datasets that did not follow the advice that "the type of valid_range should match the type of its variable"? Because the interaction with scale/offset was not explicitly mentioned, one could argue that a variable has two types, the packed and the unpacked type.
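
To make the ambiguity concrete, here is a small Python sketch. The packing parameters and the valid_range values are made up for illustration and are not taken from any real dataset; the function names are hypothetical. It shows the same stored value passing or failing the range test depending on which units valid_range is read in.

```python
# Hypothetical variable: temperatures packed into 16-bit shorts.
# All values below are assumptions chosen for illustration.
scale_factor = 0.01
add_offset = 0.0
valid_range = (-50.0, 50.0)  # one attribute, two possible readings


def unpack(packed):
    """Apply the standard scale/offset transform to a packed value."""
    return packed * scale_factor + add_offset


def in_range_packed(packed):
    """Read valid_range in packed units: compare the stored value directly."""
    return valid_range[0] <= packed <= valid_range[1]


def in_range_unpacked(packed):
    """Read valid_range in unpacked units: compare after unpacking."""
    return valid_range[0] <= unpack(packed) <= valid_range[1]


stored = 2500  # unpacks to roughly 25.0
print(in_range_packed(stored))    # False: 2500 lies outside [-50, 50]
print(in_range_unpacked(stored))  # True: the unpacked value lies inside
```

A reader that guesses the wrong interpretation would flag every such value as missing, or none of them.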

In any case, we have some "historical uses" that the nj22 library currently handles, more or less as I proposed. So I'm throwing out a trial balloon (bomb?) to see what others think. At the least, I would like the NUG and "best practices" documentation to be clarified.

From the CF point of view, if the proposal were accepted, nothing would need to change, except perhaps to _require_ that the type of valid_range match the (packed) type of its variable, at the risk of otherwise being misinterpreted.
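
For discussion, the three proposed rules (quoted at the bottom of this message) could be sketched roughly as follows in Python. This is only an illustration under my own assumptions, not the actual nj22 code: the names TYPE_RANK, valid_range_is_unpacked and is_missing are hypothetical, and types are represented as plain strings.

```python
# Width ordering from rule 3: byte < short < int < float < double.
TYPE_RANK = {"byte": 0, "short": 1, "int": 2, "float": 3, "double": 4}


def valid_range_is_unpacked(packed_type, attr_type):
    """Rule 3: an attribute wider than the packed type is read as unpacked units."""
    return TYPE_RANK[attr_type] > TYPE_RANK[packed_type]


def is_missing(packed, packed_type, fill_value=None, valid_range=None,
               attr_type=None, scale_factor=1.0, add_offset=0.0):
    """Decide whether one stored (packed) value should be treated as missing."""
    # Rule 1: _FillValue / missing_value are in packed units, so they are
    # tested against the stored value before any transform is applied.
    if fill_value is not None and packed == fill_value:
        return True
    if valid_range is None:
        return False
    low, high = valid_range
    if valid_range_is_unpacked(packed_type, attr_type):
        # Rule 3 (historical use): compare after applying scale/offset.
        value = packed * scale_factor + add_offset
    else:
        # Rule 2: compare the stored value directly.
        value = packed
    return not (low <= value <= high)


# A short variable whose valid_range was written as floats (wider type):
print(is_missing(2500, "short", valid_range=(-50.0, 50.0),
                 attr_type="float", scale_factor=0.01))
# The same numbers with valid_range written as shorts (same type):
print(is_missing(2500, "short", valid_range=(-50, 50),
                 attr_type="short", scale_factor=0.01))
```

Note that the fill value test always happens on the packed value, so a missing value is never pushed through the scale/offset transform.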


Brian Eaton wrote:
> Hi John,
> 
> I believe your items 1. and 2. match the interpretation of the User's
> Guide made in the CF conventions.  The reason these restrictions are
> important is so that an application can identify missing values *before*
> using scale/offset attributes to transform values.  Attempting to transform
> missing values could potentially result in an invalid operation.
> 
> I'm not sure what historical uses you're referring to in item 3.  Version 2.3
> of the User's Guide says the type of valid_range should match the type of
> its variable.  In version 3 of the User's Guide the valid_range is only
> allowed to be "wider" than the variable type when that type is byte.  That
> was to allow the signedness of the bytes to be identified using valid_range
> since the signedness attribute was deprecated.
> 
> The primary purpose of valid_range as described in the current version of
> the User's Guide is to facilitate identifying missing values.  Allowing
> valid_range to be wider than its variable (except in the case where it's
> used to indicate the signedness of bytes) requires the application to
> transform data before looking for missing values.  I can't see any benefit
> to doing this.
> 
> Brian
> 
> 
> 
> On Tue, Sep 12, 2006 at 01:07:29PM -0600, John Caron wrote:
> 
>>The User's Guide is vague on the interaction between missing data and 
>>scale/offset standard attributes. I'd like to propose adding the following:
>>
>>1. The _FillValue and missing_value standard attribute values must be in 
>>the units of the packed data.
>>
>>2. The valid_range, valid_max and valid_min standard attribute values 
>>should be in the units of the packed data.
>>
>>3. To accommodate historical uses, if the valid_range, valid_max or 
>>valid_min values are wider than the packed data type, then these will be 
>>interpreted as being in the units of the unpacked data. Wider means: (byte 
>>< short < int < float < double ). Otherwise, they will be interpreted as 
>>being in the units of the packed data.
>>
>>
>>Comments solicited.
>>
>>==============================================================================
>>To unsubscribe netcdfgroup, visit:
>>http://www.unidata.ucar.edu/mailing-list-delete-form.html
>>==============================================================================



