[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: 20020712: NetCDF Perl - Testing for floating point fill values



>To: address@hidden
>From: "Matt D'Ortenzio" <address@hidden>
>Subject: Testing for floating point fill values
>Organization: NASA Ames Research Center
>Keywords: 200207112348.g6BNmGa11335

Hi Matt,

> Assume I'm using the netcdf perl interface to read some data out of an
> array of floating point numbers.  I'm going to read the entire array,
> but some of the data in the array hasn't been written yet; that is it
> contains the fill value.  Before I use the data I want to know whether
> I've just read out a fill value.
> 
> Also assume that I don't have control over who is writing the files so I
> can't enforce that the "_FillValue" attribute is set on the variable,
> and I can't be sure what their netcdf package's default FILL_FLOAT value
> is.  
> 
> In this situation what is the best way to test for the fill value?  
> 
> I assume the 'ncdump' utility has to deal with the same question when
> goes to display its 'data' section.
>  
> 
> The man page for ncdump reads:
> ncdump uses `_' to represent data values that are equal to
> the `_FillValue' attribute for a variable, intended to
> represent data that has not yet been written.  If a variable
> has no `_FillValue' attribute, the default fill value for
> the variable type is used if the variable is not of byte
> type.
> 
> 
> How is ncdump written to "know" when the data value is "equal" to the
> fill value when using floating point numbers?  I'm assuming it compares
> the difference between the number and the fill value with some threshold
> value, but what threshold does it use? 
> 
> Any light on the subject would be much appreciated.

In the src/ncdump/vardata.c file, there are a couple of functions that
do the work for float and double variables, printfval() and
printdval().  I've appended the sources for these functions, but they
fill values only need to be within machine-epsilon of the specified
or default fill value to be considered equal, as the result of this
fragment of C code (for type float, in this case):

        if((val > 0) == (fillval > 0) && /* prevents potential overflow */
           (absval(val - fillval) <= absval(float_eps * fillval))) {

where float_eps is the smallest floating point number such that

  1.0f + float_eps > 1.0f

The value of float_eps is either defined in the float.h header file,
or if that does not exist, calculated once in the float_epsilon()
function, also in vardata.c.  The "if" statement above is a little
subtle, because it also has to protect against a possible overflow in
the unlikely event that the fill value is defined at one end of the
floating-point number range (such as 3.402823466E+38F) and the value
it is to be compared against is at the other end of the range (such as
-3.402823466E+38F).  In that case, subtraction would cause overflow,
so that has to be tested for.

There is another subtlety in the choice for the netCDF default fill
values for float and double data.  The defaults, defined in the
netcdf.h include file, were selected so that they could be exactly
represented using only 4 1-bits in the mantissa of a binary
representation:

#define NC_FILL_FLOAT   (9.9692099683868690e+36f) /* near 15 * 2^119 */
#define NC_FILL_DOUBLE  (9.9692099683868690e+36)

meaning binary arithmetic with these fill values should be exact for
some common operations.  

Anyway, I think you should be able to duplicate the above logic in
perl by translating the associated C code in vardata.c into perl.
Please let us know if you still have questions about this ...

--Russ

_____________________________________________________________________

Russ Rew                                         UCAR Unidata Program
address@hidden                     http://www.unidata.ucar.edu



#define absval(x)  ( (x) < 0 ? -(x) : (x) )

/*
 * Output a value of a float variable, except if there is a fill value for
 * the variable and the value is the fill value, print the fill-value string
 * instead.  Floating-point fill values need only be within machine epsilon of
 * defined fill value.
 */
static void
printfval(
    char *sout,                 /* string where output goes */
    const char *fmt,            /* printf format used for value */
    const struct ncvar *varp,   /* variable */
    float val                   /* value */
    )
{
    if(varp->has_fillval) {
        double fillval = varp->fillval;
        if((val > 0) == (fillval > 0) && /* prevents potential overflow */
           (absval(val - fillval) <= absval(float_eps * fillval))) {
            (void) sprintf(sout, FILL_STRING);
            return;
        }
    }
    (void) sprintf(sout, fmt, val);
}


/*
 * Output a value of a double variable, except if there is a fill value for
 * the variable and the value is the fill value, print the fill-value string
 * instead.  Floating-point fill values need only be within machine epsilon of
 * defined fill value.
 */
static void
printdval(
    char *sout,                 /* string where output goes */
    const char *fmt,            /* printf format used for value */
    const struct ncvar *varp,   /* variable */
    double val                  /* value */
    )
{
    if(varp->has_fillval) {
        double fillval = varp->fillval;
        if((val > 0) == (fillval > 0) && /* prevents potential overflow */
           (absval(val - fillval) <= absval(double_eps * fillval))) {
            (void) sprintf(sout, FILL_STRING);
            return;
        }
    }
    (void) sprintf(sout, fmt, val);
}