
Re: FW: missing values



> From: address@hidden [mailto:address@hidden]
> Sent: Tuesday, January 25, 2000 8:17 PM
> To: John Caron
> Subject: missing values

Hi Brian,

> I'm thinking that you're a good person to address this question to
> since I assume you've dealt with missing values in the applications
> you're building.  But feel free to pass it on if someone else is a
> better resource.
> 
> I have been reviewing the attribute conventions in the NetCDF User's
> Guide (NUG) in section 8.1.  There seems to be a substantial change
> in how missing values are determined between version-2.3 and
> version-3 of the NUG.  The reason I'm writing is that I would like
> to make sure that I'm interpreting the convention correctly.

The material on missing values was rewritten by Harvey Davies, because
he had found the version-2.3 description to be ambiguous and
inadequate for use in writing a generic package.  He's the developer
of the FAN package
<http://www.unidata.ucar.edu/packages/netcdf/software.html#FAN>, which
he used as a test bed to make sure the new missing-value description
was implementable.  It was not intended to be an incompatible change
to the version 2.3 description, but rather a clarification and further
elaboration of conventions that would be useful in generic netCDF
packages.

> In version-2.3 there were two ways to express missing data: either
> use the scalar _FillValue attribute, or use the possibly
> vector-valued missing_value attribute.  A generic application should
> treat all the values specified by _FillValue and missing_value as missing.
> The NUG assigns no interpretation to multiple missing values.
> 
> In version-3 things seem to have changed due to interpreting all
> values outside the valid range as missing, and by having the
> _FillValue attribute imply a valid range if one is not specified.
> According to the NUG: "Generic applications should treat values
> outside the valid range as missing."  This implies that missing
> values are no longer a set of discrete values specified by
> _FillValue and missing_value, but now the missing values cover a
> range that is defined to be outside the valid range.
> 
> Is that the correct interpretation of the current convention?  It
> concerns me somewhat because I don't know of any applications that
> implement the convention that way.  In the few applications that
> I've tested, if I define _FillValue = 1.e36, and put a value of
> 2.e36 into the variable, that value won't be recognized as missing.

I think your interpretation of the current convention is correct.
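
To make the difference concrete, here is a minimal sketch of the two
readings for a floating-point variable that has a _FillValue but no
valid_min, valid_max or valid_range attribute.  It assumes the
version-3 rule that a positive _FillValue defines a valid maximum and
any other _FillValue defines a valid minimum, offset from the fill
value by twice the smallest representable step.  The function names
are hypothetical, the data are treated as 64-bit doubles, and
math.nextafter needs Python 3.9 or later.  It is not taken from FAN or
any other package.

```python
import math


def implied_valid_range(fill_value):
    """Valid range implied by a floating-point _FillValue alone."""
    if fill_value > 0:
        # Positive fill value: it bounds the valid range from above.
        # Step two representable values below it to allow for rounding.
        below = math.nextafter(fill_value, -math.inf)
        return (-math.inf, math.nextafter(below, -math.inf))
    # Zero or negative fill value: it bounds the valid range from below.
    above = math.nextafter(fill_value, math.inf)
    return (math.nextafter(above, math.inf), math.inf)


def is_missing_range(value, fill_value):
    """Version-3 reading: anything outside the valid range is missing."""
    lo, hi = implied_valid_range(fill_value)
    # NaN fails both comparisons, so it is reported as missing too.
    return not (lo <= value <= hi)


def is_missing_discrete(value, fill_value, missing_values=()):
    """Version-2.3 reading: only the listed values are missing."""
    return value == fill_value or value in missing_values


fill = 1.0e36
for v in (5.0, 1.0e36, 2.0e36):
    print(v, is_missing_discrete(v, fill), is_missing_range(v, fill))
```

With _FillValue = 1.e36, both readings flag 1.e36 itself, but only the
range-based one flags 2.e36, which matches the behavior you describe.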

I also think you're right that it hasn't been implemented much; as far
as I know, FAN was the only package that actually implemented the
complete missing-value semantics as described for version 3.x.  Part
of the justification for the valid-range additions was to permit
reasonable interpretations when a fractional index value, or a
coordinate value that falls between grid points, implies interpolation.

I can probably dig up some more notes on justification for the
changes, but it will have to wait until next week.  If you want to ask
Harvey about it, he's also reachable at <address@hidden>.

I also think it would be OK to write a generic package that didn't
follow these conventions exactly, if they seem unreasonable for what
you need.  Since they are not widely observed, you wouldn't be making
your package incompatible with everyone else ...

--Russ

_____________________________________________________________________

Russ Rew                                         UCAR Unidata Program
address@hidden                     http://www.unidata.ucar.edu