Re: [netcdfgroup] Odd failure reading compressed 32-bit floats with nc_get_var_double()?

  • To: David Pierce <dpierce@xxxxxxxx>
  • Subject: Re: [netcdfgroup] Odd failure reading compressed 32-bit floats with nc_get_var_double()?
  • From: Wei-Keng Liao <wkliao@xxxxxxxxxxxxxxxx>
  • Date: Wed, 9 Sep 2020 05:24:12 +0000
Dave is right. NetCDF can do automatic type conversion when the data
type of the memory buffer does not match that of the variable stored in
the file. This appears to be a bug in NetCDF.

I noticed you posted this as GitHub issue #1826, but it has had no response yet.
https://github.com/Unidata/netcdf-c/issues/1826

Wei-keng

On Sep 9, 2020, at 9:01 AM, David Pierce via netcdfgroup 
<netcdfgroup@xxxxxxxxxxxxxxxx> wrote:

Hi Sylvain,

thanks very much for the thoughts! What is odd is that I agree with your 
observation that the nc_get_var_double() is acting exactly as you say -- 
reading a float into a double, yielding incorrect results -- *but* according to 
the documentation the function call is supposed to convert the read-in data 
into double. E.g., from the netcdf man page that comes with the source tarball:

int nc_get_var_double(int ncid, int varid, double in[]):

              Reads an entire netCDF variable (i.e. all the values). The
              netCDF dataset must be open and in data mode. The data is
              converted from the external type of the specified variable,
              if necessary, to the type specified in the function name.
              If conversion is not possible, an NC_ERANGE error is returned.

So it seems to unambiguously say that the data should be converted to double 
using this call, and that kind of error should not happen. Hence my confusion.
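
For reference, the conversion the man page describes amounts to widening each stored 32-bit float to a double, element by element. Below is a minimal sketch in plain C with no netCDF calls; `widen_floats` is a hypothetical helper name used only for illustration and is not part of the netCDF API.

```c
#include <stddef.h>

/* Hypothetical helper (not part of netCDF): widen each 32-bit float
 * to a double. Every float value is exactly representable as a double,
 * so this conversion never loses information. */
void widen_floats(const float *src, double *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (double)src[i];   /* value-preserving conversion */
}
```

Applied to a buffer of -9999.f fill values, this yields -9999.0 in every element, which is what one would expect nc_get_var_double() to return. Reading with nc_get_var_float() and widening afterwards might serve as a stopgap, assuming the float read keeps behaving correctly as reported in this thread.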

Regards,

--Dave


On Tue, Sep 8, 2020 at 5:47 PM Sylvain Herlédan 
<sylvain.herledan@xxxxxxxxxxxxxxxx> wrote:

Hi David,

I think you must use the nc_get_var_*() function matching the storage type of 
the variable; these functions are not meant to be used for type conversions.

Your NetCDF file states that the data type for the phycocyanin variable is 
float, so you should use nc_get_var_float to read its values.

On my system float values are stored on 32 bits whereas double values are 
stored on 64 bits.

When you ask the library to read the content of the variable as double, it 
loads 64 bits from the file and interprets them as a double value. Since 
phycocyanin values are stored on 32 bits, this means that you actually read two 
float values from the file but interpret their binary representation as a 
single double value.

It explains the strange values you get with nc_get_var_double:
 - the variable only contains fill values, i.e. -9999.f
 - on my machine the hexadecimal representation for -9999.f is 0xc61c3c00
 - if you read two adjacent floats with a -9999.f value, you get 0xc61c3c00c61c3c00.
 - if you interpret these 64 bits as a double value you get: -559239646634519513659653226496.000000
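
The reinterpretation in the list above can be reproduced without netCDF at all. The following is a small sketch, assuming IEEE 754 floats of 4 bytes and doubles of 8 bytes; the helper names `float_bits` and `two_floats_as_double` are made up for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit pattern of a float.
 * Assumes sizeof(float) == 4 and IEEE 754 encoding. */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Place two floats side by side in memory and reinterpret those
 * 8 bytes as a single double. memcpy is used instead of a pointer
 * cast to stay clear of aliasing problems.
 * Assumes sizeof(double) == 2 * sizeof(float). */
double two_floats_as_double(float a, float b)
{
    float  pair[2] = { a, b };
    double d;
    memcpy(&d, pair, sizeof d);
    return d;
}
```

Under those assumptions, `float_bits(-9999.0f)` gives 0xc61c3c00, and `two_floats_as_double(-9999.0f, -9999.0f)` gives a value of about -5.59e29, the same magnitude as the strange number in the test output. Because both halves are identical here, the result should not depend on byte order.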

In your test program, if you print more values of ddat you will also see that 
you get the strange value for the first 720 items in the array, and then 
starting from the 721st you only get 0: since you actually need two phycocyanin 
values for each 64-bit item of the array, you only have enough data to fill the 
first half of ddat.
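
The arithmetic behind the half-filled array is 1440 floats x 4 bytes = 5760 bytes, which is exactly 720 doubles of 8 bytes. Here is a rough model of that behaviour in plain C, assuming the raw float bytes are copied into the double array with no conversion; `count_overwritten` is a hypothetical function, and unlike the original test program it zeroes the destination with calloc so that untouched elements are known to be 0.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model: copy nfloats raw floats byte-for-byte into a
 * zeroed array of ndoubles doubles, with no type conversion, and
 * return how many doubles end up non-zero. Returns 0 if allocation
 * fails. */
size_t count_overwritten(size_t nfloats, size_t ndoubles, float fill)
{
    float  *src = (float  *)malloc(nfloats * sizeof *src);
    double *dst = (double *)calloc(ndoubles, sizeof *dst); /* all-zero bytes */
    size_t  filled = 0;

    if (src != NULL && dst != NULL) {
        for (size_t i = 0; i < nfloats; i++)
            src[i] = fill;

        /* Never copy more bytes than the destination can hold. */
        size_t nbytes = nfloats * sizeof *src;
        if (nbytes > ndoubles * sizeof *dst)
            nbytes = ndoubles * sizeof *dst;
        memcpy(dst, src, nbytes);

        for (size_t i = 0; i < ndoubles; i++)
            if (dst[i] != 0.0)
                filled++;
    }
    free(src);
    free(dst);
    return filled;
}
```

With 4-byte floats and 8-byte doubles, `count_overwritten(1440, 1440, -9999.0f)` returns 720, matching the count described above.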

Cheers,

Sylvain

On 09/09/2020 00:30, David Pierce via netcdfgroup wrote:
Hello netcdf-ers,

I had a user of the R ncdf4 package alert me to an odd and perplexing apparent 
bug in the netcdf library. It is triggered by the netcdf file you can download 
here:

http://cirrus.ucsd.edu/~pierce/tmp/mendota_buoy.2018-11-08.nc

This file has the following variable (among others):

float phycocyanin(time) ;
            phycocyanin:_FillValue = -9999.f ;
            phycocyanin:units = "RFU" ;
            phycocyanin:long_name = "Phycocyanin" ;
            phycocyanin:_Storage = "chunked" ;
            phycocyanin:_ChunkSizes = 1440 ;
            phycocyanin:_DeflateLevel = 4 ;
            phycocyanin:_Shuffle = "true" ;
            phycocyanin:_Endianness = "little" ;

'ncdump' reports all the data as missing:

phycocyanin = _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, (etc.)

When I read the data into a floating point array with the C library's 
nc_get_var_float(), I get the expected values of -9999.0. However, when I 
read it into a double precision array with nc_get_var_double(), I get 
strange values. Here is the output of the little test program that I've 
appended below:

Sizeof float: 4 double: 8
======== DOUBLE ========
-559239646634519513659653226496.000000 -559239646634519513659653226496.000000 
-559239646634519513659653226496.000000 -559239646634519513659653226496.000000 
-559239646634519513659653226496.000000 -559239646634519513659653226496.000000 
-559239646634519513659653226496.000000 -559239646634519513659653226496.000000 
-559239646634519513659653226496.000000 -559239646634519513659653226496.000000
======== FLOAT ========
-9999.000000 -9999.000000 -9999.000000 -9999.000000 -9999.000000 -9999.000000 
-9999.000000 -9999.000000 -9999.000000 -9999.000000

This seems like a bug, unless I'm overlooking something obvious in the test 
code (always possible). The nc_get_var_double() call is supposed to convert the 
netcdf file's values into double precision and store them in the provided 
double precision array, right? That's my understanding anyway. I can't see why 
they are not -9999.00's in the resultant array.

Any thoughts are appreciated,

--Dave

Test program:

#include <stdio.h>
#include <stdlib.h>
#include "netcdf.h"

int main( int argc, char *argv[] )
{
        int     err, ncid, varid;
        size_t  nt;
        double  *ddat;
        float   *fdat;

        nt = 1440;      /* just hardcode for test */

        /* %zu is the portable format for size_t values such as sizeof */
        printf( "Sizeof float: %zu  double: %zu\n", sizeof(float), sizeof(double) );

        err = nc_open( "mendota_buoy.2018-11-08.nc", 0, &ncid );
        if( err != 0 ) {
                printf( "err open = %d\n", err );
                exit(-1);
                }

        err = nc_inq_varid( ncid, "phycocyanin", &varid );
        if( err != 0 ) {
                printf( "err inq_varid = %d\n", err );
                exit(-1);
                }

        /* Make room for both double and floating dat */
        ddat = (double *)malloc( sizeof(double) * nt );
        fdat = (float  *)malloc( sizeof(float ) * nt );

        /* Read into double array. Supposed to convert to double...? */
        err = nc_get_var_double( ncid, varid, ddat );
        if( err != 0 ) {
                printf( "err get_var_double = %d\n", err );
                exit(-1);
                }

        printf( "======== DOUBLE ========\n" );
        for( int ii=0; ii<10; ii++ )
                printf( "%lf  ", ddat[ii] );

        printf( "\n" );

        /* === Do same thing, but for float === */

        err = nc_get_var_float( ncid, varid, fdat );
        if( err != 0 ) {
                printf( "err get_var_float = %d\n", err );
                exit(-1);
                }

        printf( "======== FLOAT ========\n" );
        for( int ii=0; ii<10; ii++ )
                printf( "%f  ", fdat[ii] );

        printf( "\n" );

}



-------------------------------------------------------------------
David W. Pierce
Division of Climate, Atmospheric Science, and Physical Oceanography
Scripps Institution of Oceanography
(858) 534-8276 (voice)  /  (858) 534-8561 (fax)    dpierce@xxxxxxxx
-------------------------------------------------------------------


_______________________________________________
NOTE: All exchanges posted to Unidata maintained email lists are
recorded in the Unidata inquiry tracking system and made publicly
available through the web.  Users who post to any of the lists we
maintain are reminded to remove any personal information that they
do not want to be made public.


netcdfgroup mailing list
netcdfgroup@xxxxxxxxxxxxxxxx
For list information or to unsubscribe, visit: 
https://www.unidata.ucar.edu/mailing_lists/



