
Re: 19980928: Problem with floats in netCDF



>cc: "Tiam Yew (SIDSBV) Wee" <address@hidden>,
>cc: "Umit (SIDSBV) Tacay" <address@hidden>
>From: "E.Muter" <address@hidden>
>Subject: Problem with floats in netCDF
>Organization: Shell International Deepwater Services b.v.
>Keywords: 199809281523.JAA20929 netCDF float

Hi Erik,

> I have encountered a problem while accessing a simple netCDF dataset. I
> have made 2 CDL files: z21.cdl and z22.cdl. Both contain two time arrays:
> time1 and time2. The only difference is that the time arrays in z21.cdl
> are of type float and the arrays in z22.cdl are of type double. I
> converted both files with the ncgen utility. These are the CDL files:
> 
> netcdf z21 {
> dimensions:
>       timelen = UNLIMITED ; // (20 currently)
> variables:
>       float time1(timelen) ;
>               time1:units      = "Julian date since 1940.01.01:00:00:00" ;
>               time1:long_name  = "This is the time1" ;
>               time1:_FillValue = -999999.9F;
>       float time2(timelen) ;
>               time2:units      = "Julian date since 1940.01.01:00:00:00" ;
>               time2:long_name  = "This is the time2" ;
>               time2:_FillValue = -999999.9F;
> data:
> 
>  time1 = 16019.65, 16019.66, 16019.67, 16019.67, 16019.68, 16019.69, 
>     16019.7, 16019.71, 16019.72, 16019.73, 16019.74, 16019.75, 16019.76, 
>     16019.76, 16019.77, 16019.78, 16019.79, 16019.8, 16019.81, 16019.82
> ;
> 
>  time2 = 17019.65, 17019.66, 17019.67, 17019.67, 17019.68, 17019.69, 
>     17019.7, 17019.71, 17019.72, 17019.73, 17019.74, 17019.75, 17019.76
> ;
> }
> 
> netcdf z22 {
> dimensions:
>       timelen = UNLIMITED ; // (20 currently)
> variables:
>       double time1(timelen) ;
>               time1:units      = "Julian date since 1940.01.01:00:00:00" ;
>               time1:long_name  = "This is the time1" ;
>               time1:_FillValue = -999999.9d;
>       double time2(timelen) ;
>               time2:units      = "Julian date since 1940.01.01:00:00:00" ;
>               time2:long_name  = "This is the time2" ;
>               time2:_FillValue = -999999.9d;
> data:
> 
>  time1 = 16019.65, 16019.66, 16019.67, 16019.67, 16019.68, 16019.69, 
>     16019.7, 16019.71, 16019.72, 16019.73, 16019.74, 16019.75, 16019.76, 
>     16019.76, 16019.77, 16019.78, 16019.79, 16019.8, 16019.81, 16019.82
> ;
> 
>  time2 = 17019.65, 17019.66, 17019.67, 17019.67, 17019.68, 17019.69, 
>     17019.7, 17019.71, 17019.72, 17019.73, 17019.74, 17019.75, 17019.76
> ;
> }
> 
> I wrote my own C program that prints the values of both arrays from the
> files. This is my C code:
> 
> #include <stdio.h>
> #include <stdlib.h>   /* malloc, free */
> #include <netcdf.h>
> 
> int main(int argc, char *argv[])
> {
>    int           lv_ncid;
>    int           lv_mid;
>    float        *lp_time1_fl = NULL;
>    float        *lp_time2_fl = NULL;
>    double       *lp_time1_db = NULL;
>    double       *lp_time2_db = NULL;
>    int           lv_status;
>    int           lv_i;
> 
>    /***************/
>    /* Open z21.nc */
>    /***************/
>    printf("***** ./z21.nc *****\n");
>    lv_status = nc_open("./z21.nc", NC_NOWRITE, &lv_ncid);
> 
>    /* Get varid time1 */
>    nc_inq_varid(lv_ncid, "time1", &lv_mid);
> 
>    /* Allocate mem for time1. */
>    lp_time1_fl = (float*)malloc((size_t)(20 * sizeof(float)));
> 
>    /* Retrieve time1. */
>    nc_get_var_float(lv_ncid, lv_mid, lp_time1_fl);
> 
>    /* Print time1. */
>    for (lv_i = 0; lv_i < 20; lv_i++)
>    {
>       printf("lp_time1_fl[%d]: %f\n",
>              lv_i,
>              lp_time1_fl[lv_i]);
>    }
> 
>    printf("\n");
> 
>    /* Get varid time2 */
>    nc_inq_varid(lv_ncid, "time2", &lv_mid);
> 
>    /* Allocate mem for time2. */
>    lp_time2_fl = (float*)malloc((size_t)(20 * sizeof(float)));
> 
>    /* Retrieve time2. */
>    nc_get_var_float(lv_ncid, lv_mid, lp_time2_fl);
> 
>    /* Print time2. */
>    for (lv_i = 0; lv_i < 20; lv_i++)
>    {
>       printf("lp_time2_fl[%d]: %f\n",
>              lv_i,
>              lp_time2_fl[lv_i]);
>    }
> 
>    /* close the file. */
>    nc_close(lv_ncid);
> 
>    free(lp_time1_fl);
>    free(lp_time2_fl);
> 
>    /***************/
>    /* Open z22.nc */
>    /***************/
>    printf("***** ./z22.nc *****\n");
>    lv_status = nc_open("./z22.nc", NC_NOWRITE, &lv_ncid);
> 
>    /* Get varid time1 */
>    nc_inq_varid(lv_ncid, "time1", &lv_mid);
> 
>    /* Allocate mem for time1. */
>    lp_time1_db = (double*)malloc((size_t)(20 * sizeof(double)));
> 
>    /* Retrieve time1. */
>    nc_get_var_double(lv_ncid, lv_mid, lp_time1_db);
> 
>    /* Print time1. */
>    for (lv_i = 0; lv_i < 20; lv_i++)
>    {
>       printf("lp_time1_fl[%d]: %f\n",
>               lv_i,
>               lp_time1_db[lv_i]);
>    }
> 
>    printf("\n");
> 
>    /* Get varid time2 */
>    nc_inq_varid(lv_ncid, "time2", &lv_mid);
> 
>    /* Allocate mem for time2. */
>    lp_time2_db = (double*)malloc((size_t)(20 * sizeof(double)));
> 
>    /* Retrieve time2. */
>    nc_get_var_double(lv_ncid, lv_mid, lp_time2_db);
> 
>    /* Print time2. */
>    for (lv_i = 0; lv_i < 20; lv_i++)
>    {
>       printf("lp_time2_fl[%d]: %f\n",
>              lv_i,
>              lp_time2_db[lv_i]);
>    }
> 
>    /* close the file. */
>    nc_close(lv_ncid);
> 
>    free(lp_time1_db);
>    free(lp_time2_db);
> 
> }
> 
> This is the output of my C program:
> 
> ***** ./z21.nc *****
> lp_time1_fl[0]: 16019.650391
> lp_time1_fl[1]: 16019.660156
> lp_time1_fl[2]: 16019.669922
 ...
> lp_time2_fl[19]: -999999.875000
> ***** ./z22.nc *****
> lp_time1_fl[0]: 16019.650000
> lp_time1_fl[1]: 16019.660000
> lp_time1_fl[2]: 16019.670000
 ...
> lp_time2_fl[19]: -999999.900000
> 
> 
> As you can see, all values of the z21.nc file are messed up (that is,
> the decimal part), whereas the values from z22.nc are OK.

I think you are just seeing the effect of printing out C floats with too
much precision.  A netCDF float is represented externally as a 32-bit
IEEE floating point number, which has only 24 bits of precision.  That
corresponds to only about 7 significant decimal digits, so for example,
the numbers 16019.650391 and 16019.650000 both round to the same 32-bit
IEEE floating point value.
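
To check this yourself, here is a minimal sketch in plain C (no netCDF
involved).  The function names are made up for illustration, and
next_float_up() assumes that float is a 32-bit IEEE 754 value and that its
argument is positive and finite:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if both decimal values round to the same 32-bit float. */
int same_float(double a, double b)
{
    float fa = (float)a;
    float fb = (float)b;
    return fa == fb;
}

/* Returns the next representable float above a positive, finite x.
   Assumption: float is a 32-bit IEEE 754 value. */
float next_float_up(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* copy out the float's bit pattern */
    bits += 1;                        /* adjacent bit pattern = adjacent float */
    memcpy(&x, &bits, sizeof bits);
    return x;
}

/* Prints x, its upper neighbour, and the gap between the two. */
void show_neighbours(float x)
{
    float up = next_float_up(x);
    printf("x    = %.10f\n", x);
    printf("next = %.10f\n", up);
    printf("gap  = %.10f\n", up - x);
}
```

Calling same_float(16019.650391, 16019.65) returns 1, and
show_neighbours(16019.65f) reports a gap of 0.0009765625 (2 to the power
-10), so digits beyond the third decimal place carry no information for
floats of this size.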

This has nothing to do with netCDF; you can see the same "problem" if
you print the number 10000.11 both as a single-precision float and as a
double.  For example,

    ff = 10000.110352
    dd = 10000.110000

will be the output of the following small C program:

    #include <stdio.h>

    int
    main(void) {
        float ff = 10000.11f;   /* nearest float is 10000.1103515625 */
        double dd = 10000.11;
        printf("ff = %f\n", ff);
        printf("dd = %f\n", dd);
        return 0;
    }

> The ncdump routine shows no problems b.t.w.:
 ...

That's because the ncdump program doesn't use "%f" to print out data of
type float, but instead figures out a format that represents 7
significant digits, for example "%5.2f" or "%8.7g".  If you use either
of these two formats in the above program instead of "%f" to print out
the variable ff, the result will be "10000.11" instead of
"10000.110352".
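
As a rough sketch of the difference (this is not ncdump's actual code; the
two helper names below are hypothetical), you could compare the default
"%f" against a format limited to 7 significant digits:

```c
#include <stdio.h>

/* Formats x with printf's default "%f" (six digits after the point).
   Returns the number of characters written, as snprintf does. */
int format_default(char *buf, size_t size, float x)
{
    return snprintf(buf, size, "%f", x);
}

/* Formats x with 7 significant digits, about what a 32-bit float holds.
   Returns the number of characters written, as snprintf does. */
int format_7sig(char *buf, size_t size, float x)
{
    return snprintf(buf, size, "%.7g", x);
}
```

With a 32-byte buffer, format_default(buf, sizeof buf, 10000.11f) leaves
"10000.110352" in buf, while format_7sig(buf, sizeof buf, 10000.11f)
leaves "10000.11".  In your program, changing "%f" to "%.7g" in the two
loops that print the float arrays should have the same effect.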

> I have no clue what might be wrong with the above. Can you please help
> me out ?
>
> I compiled everything on an IBM RS/6000 running AIX 4.1 with the cc
> compiler.

I don't think the system or compiler matters in this case; it's just
unfortunate that the printf "%f" format always prints six digits after
the decimal point, which for values of this size is more precision than
a float holds ...

--Russ

_____________________________________________________________________

Russ Rew                                         UCAR Unidata Program
address@hidden                     http://www.unidata.ucar.edu