[netcdfgroup] clarification on _FillValue, missing_value, valid_xxx

Hi guys,

Great class yesterday. I hope to fill out the survey soon.  

To follow up on my misunderstanding of _FillValue versus missing_value versus 
valid_xxxxx:  do you think that software packages should be upgraded to handle 
all these possible attributes as you describe on this page:

http://www.unidata.ucar.edu/software/netcdf/docs/netcdf/Attribute-Conventions.html

It wasn't clear to me, for example, whether netcdf4-python handles all cases of 
these attributes.

I looked for the latest CF document and found this:

http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.6/cf-conventions.html

If you do a browser search for "missing_value", it has some text in section 
2.5.1 that looks like it was being edited, but never resolved.  It says, in 
some now crossed-out text, that missing_value is to be deprecated:

------------------------------------------------------------------------------------------------------------------------------------------------

http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.6/cf-conventions.html#missing-data


2.5.1. Missing Data

The NUG conventions (NUG section 8.1) provide the _FillValue, 
missing_value, valid_min, valid_max, and valid_range attributes to indicate 
missing data.

The NUG conventions for missing data changed significantly between version 2.3 
and version 2.4. Since version 2.4 the NUG defines missing data as all values 
outside of the valid_range, and specifies how the valid_range should be defined 
from the _FillValue (which has library specified default values) if it hasn't 
been explicitly specified. If only one missing value is needed for a variable 
then we recommend strongly that this value be specified using the _FillValue 
attribute. Doing this guarantees that the missing value will be recognized by 
generic applications that follow either the before or after version 2.4 
conventions.

The scalar attribute with the name _FillValue and of the same type as its 
variable is recognized by the netCDF library as the value used to pre-fill disk 
space allocated to the variable. This value is considered to be a special value 
that indicates undefined or missing data, and is returned when reading values 
that were not written. The _FillValue should be outside the range specified by 
valid_range (if used) for a variable. The netCDF library defines a default fill 
value for each data type (NUG section 7.16).

The missing_value attribute is considered deprecated by the NUG and we do not 
recommend its use. However for backwards compatibility with COARDS this 
standard continues to recognize the use of the missing_value attribute to 
indicate undefined or missing data.

The missing values of a variable with scale_factor and/or add_offset attributes 
(see Section 8.1, “Packed Data”) are interpreted relative to the variable's 
external values (a.k.a. the packed values, the raw values, the values stored in 
the netCDF file), not the values that result after the scale and offset are 
applied. 
Applications that process variables that have attributes to indicate both a 
transformation (via a scale and/or offset) and missing values should first 
check that a data value is valid, and then apply the transformation. Note that 
values that are identified as missing should not be transformed. Since the 
missing value is outside the valid range it is possible that applying a 
transformation to it could result in an invalid operation. For example, the 
default _FillValue is very close to the maximum representable value of IEEE 
single precision floats, and multiplying it by 100 produces an "Infinity" 
(using single precision arithmetic).

------------------------------------------------------------------------------------------------------------------------------------------------
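The ordering described in the last quoted paragraph (test validity on the packed values first, then apply the scale and offset) can be sketched in plain Python. This is only an illustration under stated assumptions: the function name `unpack` and its keyword arguments are hypothetical, and `None` stands in for a masked element. The two constants are the default float fill value from netcdf.h and the largest finite single-precision value.

```python
# Default netCDF fill value for NC_FLOAT (NC_FILL_FLOAT in netcdf.h).
NC_FILL_FLOAT = 9.9692099683868690e+36
# Largest finite IEEE single-precision value.
FLT_MAX = 3.4028234663852886e+38


def unpack(packed, scale_factor=1.0, add_offset=0.0, fill_value=None,
           valid_min=None, valid_max=None):
    """Return unpacked values, with None standing in for missing data.

    Hypothetical helper: the validity test runs on the packed (stored)
    values, and only values that pass it are scaled and offset.
    """
    result = []
    for value in packed:
        missing = (
            (fill_value is not None and value == fill_value)
            or (valid_min is not None and value < valid_min)
            or (valid_max is not None and value > valid_max)
        )
        result.append(None if missing else value * scale_factor + add_offset)
    return result


# Made-up packed short data; -32767 is the default fill value for NC_SHORT.
packed = [100, -32767, 250]
print(unpack(packed, scale_factor=0.1, add_offset=5.0, fill_value=-32767))

# Why a missing value should not be transformed: 100 times the default
# float fill value exceeds the single-precision range, so float32
# arithmetic would produce Infinity.
print(NC_FILL_FLOAT * 100 > FLT_MAX)
```

The fill value is compared against the stored value, so a packed variable needs a packed _FillValue, which matches what the quoted text says.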

In the COARDS document:

http://ferret.wrc.noaa.gov/noaa_coop/coop_cdf_profile.html

it states:

------------------------------------------------------------------------------------------------------------------------------------------------

• _FillValue - If a scalar attribute with this name is defined for a variable 
and is of the same type as the variable, it will be subsequently used as the 
fill value for that variable. The purpose of this attribute is to save the 
applications programmer the work of prefilling the data and also to eliminate 
the duplicate writes that result from netCDF filling in missing data with its 
default fill value, only to be immediately overwritten by the programmer's 
preferred value. This value is considered to be a special value that indicates 
missing data, and is returned when reading values that were not written. The 
missing value should be outside the range specified by valid_range (if used) 
for a variable. It is not necessary to define your own _FillValue attribute for 
a variable if the default fill value for the type of the variable is adequate.

• missing_value - missing_value is a conventional name for a missing value that 
will not be treated in any special way by the library, as the _FillValue 
attribute is. It is also useful when it is necessary to distinguish between two 
kinds of missing values. The netCDF data type of the missing_value attribute 
should match the netCDF data type of the data variable that it describes. In 
cases where the data variable is packed via the scale_value attribute this 
implies that the missing_value flag is likewise packed. The same holds for the 
_FillValue attribute. The NOAA cooperative standard does not endorse any 
particular interpretation of the distinction between missing_value and 
_FillValue.

------------------------------------------------------------------------------------------------------------------------------------------------
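The CF passage quoted earlier says that the NUG specifies how the valid range should be derived from _FillValue when no valid_range is given. A minimal sketch of that rule, as I read the NUG Attribute Conventions page, is below: a positive fill value implies a valid maximum, any other fill value implies a valid minimum, with a gap of 1 for integer types and two representable steps for floating point. The function name `default_valid_range` is hypothetical, the floating-point branch works in double precision only (a float variable would need single-precision steps), and the NUG's special case for byte data without an explicit _FillValue is not handled.

```python
import math


def default_valid_range(fill_value):
    """Return (valid_min, valid_max) implied by fill_value.

    Hypothetical helper; None means "unbounded on that side".
    """
    if isinstance(fill_value, int):
        # Integer types: the limit sits exactly 1 away from the fill value.
        if fill_value > 0:
            return (None, fill_value - 1)
        return (fill_value + 1, None)
    # Floating point (doubles here): step two representable values away
    # from the fill value to allow for rounding error.
    if fill_value > 0:
        limit = math.nextafter(math.nextafter(fill_value, -math.inf), -math.inf)
        return (None, limit)
    limit = math.nextafter(math.nextafter(fill_value, math.inf), math.inf)
    return (limit, None)


print(default_valid_range(-32767))                  # default NC_SHORT fill
print(default_valid_range(9.9692099683868690e+36))  # default float/double fill
```

`math.nextafter` requires Python 3.9 or later.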

Anyway, I just want to make sure that NCL is doing the proper thing with regard 
to missing_value, especially since I didn't realize that it could be an array 
of values, and not just a scalar.
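
To make the scalar-versus-array point concrete, here is a small sketch of a missing-data test that accepts missing_value either as a single value or as a sequence of values, and also honors _FillValue. The helper names `as_missing_set` and `is_missing` are hypothetical, and the comparison is exact equality on the stored (packed) values; this is not a description of what NCL or netcdf4-python actually do.

```python
def as_missing_set(missing_value):
    """Normalize a scalar or a list/tuple attribute value to a set."""
    if isinstance(missing_value, (list, tuple)):
        return set(missing_value)
    return {missing_value}


def is_missing(value, missing_value=None, fill_value=None):
    """Return True if value matches _FillValue or any missing_value entry."""
    candidates = set()
    if missing_value is not None:
        candidates |= as_missing_set(missing_value)
    if fill_value is not None:
        candidates.add(fill_value)
    return value in candidates


# Made-up example: two distinct missing-data flags plus the short fill value.
data = [12, -999, 7, -888, -32767]
mask = [is_missing(v, missing_value=[-999, -888], fill_value=-32767)
        for v in data]
print(mask)
```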

Thanks,

--Mary



