
Re: NetCDF Performance Enhancement



Hi Bob,

> I was talking to Jim Shofstahl at Finnigan corp about a performance
> enhancement I made to the NetCDF library. He suggested I pass on the
> information to you.
> 
> For my application it resulted in about a 50x speed increase for large data
> sets. When I profiled NetCDF I discovered that all the time was being spent
> reading arrays of floats. I found that when you read an array of floats
> NetCDF would actually read one value at a time and convert it. The
> modification I made reads the whole array and converts it in place. Here is
> the function I changed. Search for HERE IT IS to find the modification.

Yes, thanks, we've already made a similar modification in netCDF 3, which
we hope to release later this summer.  The netCDF library must work on
platforms that don't use IEEE representations for single- and
double-precision floating-point values (e.g. VAX and Cray machines), so
your optimization isn't available on such platforms if the vendor-supplied
XDR library is used.  Glenn Davis (whom I've CC:ed on this reply) has
written netCDF 3 to be independent of XDR, so that optimization is now
available (and used) on all platforms.
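
As a rough illustration of the idea (a sketch only, not the actual netCDF 3
code): read the whole array with one call, then convert it in place. The
helper name decode_be_floats is hypothetical, and the sketch assumes the host
stores a float as a 32-bit IEEE 754 value and that the buffer holds
big-endian (XDR-encoded) data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of netCDF: convert 'count' big-endian
 * 32-bit values stored in 'buf' to host byte order, in place.
 * Assumes the host float is a 32-bit IEEE 754 value.
 */
static void decode_be_floats(void *buf, size_t count)
{
    unsigned char *p = buf;
    size_t i;

    for (i = 0; i < count; i++, p += 4) {
        /* Assemble the bit pattern from big-endian bytes; this gives the
         * same numeric value whatever the host byte order is. */
        uint32_t bits = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                        ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];

        /* Store it back in host order; memcpy avoids aliasing problems. */
        memcpy(p, &bits, sizeof bits);
    }
}
```

On a big-endian host the loop leaves the bytes unchanged, so a real
implementation would probably skip it there, much as the mc68000 test in the
appended code does.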

> I have also made other modifications to NetCDF to make it re-entrant, which
> it is not now. This involved converting the cdfid to a pointer to a
> structure rather than an index into an array of structures.
> 
> Note: A similar modification could be made for writing values, and for
> reading and writing other data types.

I'll pass this on to Glenn for him to evaluate (but he's away from his mail
until July).  We're interested in making netCDF calls thread-safe, which is
related to making the library re-entrant.  In rewriting the library for
version 3, Glenn has managed to remove the globals, except where a
backwards-compatibility interface is being used.
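
The pointer-instead-of-index change described above can be sketched roughly
as follows. All names here (nc_file, nc_file_open, nc_file_add_var,
nc_file_close) are hypothetical and the structure is reduced to two fields;
this is neither the modified library nor the netCDF 3 interface.

```c
#include <stdlib.h>

/* Hypothetical per-file state; a real netCDF structure holds much more. */
typedef struct nc_file {
    int ndims;
    int nvars;
} nc_file;

/* Each open returns its own heap-allocated state, so no global table of
 * open files is needed. Returns NULL if allocation fails. */
static nc_file *nc_file_open(void)
{
    return calloc(1, sizeof(nc_file));
}

/* Operates only on the state reachable through the handle.
 * Returns the index assigned to the new variable. */
static int nc_file_add_var(nc_file *f)
{
    return f->nvars++;
}

static void nc_file_close(nc_file *f)
{
    free(f);
}
```

Because every call touches only the state behind its own handle, calls on
different handles do not interfere with each other. Concurrent calls on the
same handle would still need locking, which is why re-entrancy is related to
thread safety but not the same thing.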

Anyway, thanks for your comments and the appended code changes.

______________________________________________________________________________

Russ Rew                                           UCAR Unidata Program
address@hidden                              http://www.unidata.ucar.edu


> /*
>  * xdr 'count' items of contiguous data of type 'type' at 'where'
>  */
> static bool_t xdr_NCvdata( XDR *xdrs , u_long where , nc_type type ,
>                            unsigned count , Void *values )
> {
>       u_long rem = 0 ;
>       typedef bool_t (*_xdr_NC_fnct)(XDR *, Void *) ;
>       _xdr_NC_fnct xdr_NC_fnct;
>       bool_t stat ;
>       Size_t szof ;
> 
>       switch(type){
>       case NC_BYTE :
>       case NC_CHAR :
>       case NC_SHORT :
>               rem = where%4 ;
>               where -= rem ; /* round down to nearest word */
>               break ;
>       }
>       if( !xdr_NCsetpos(xdrs, where) )
>               return(FALSE) ;
> 
>       switch(type){
>       case NC_BYTE :
>       case NC_CHAR :
>               if(rem != 0)
>               {
>                       unsigned vcount = MIN(count, 4 - rem) ;
>                       if(!xdr_NCvbyte(xdrs, (unsigned)rem, vcount, values) )
>                               return(FALSE) ;
>                       values += vcount ;
>                       count -= vcount ;
>               }
> 
>               rem = count%4 ; /* tail remainder */
>               count -= rem ;
>               if(!xdr_opaque(xdrs, values, count))
>                       return(FALSE) ;
> 
>               if(rem != 0)
>               {
>                       values += count ;
>                       if( !xdr_NCvbyte(xdrs, (unsigned)0, (unsigned)rem ,
>                               values) )
>                               return(FALSE) ;
>                       return(TRUE) ;  
>               } /* else */
>               return(TRUE) ;
>       case NC_SHORT :
>               if(rem != 0)
>               {
>                       if(!xdr_NCvshort(xdrs, (unsigned)1, (short *)values) )
>                               return(FALSE) ;
>                       values += sizeof(short) ;
>                       count -= 1 ;
>               }
>               rem = count%2 ; /* tail remainder */
>               count -= rem ;
>               if(!xdr_shorts(xdrs, (short *)values, count))
>                       return(FALSE) ;
>               if(rem != 0)
>               {
>                       values += (count * sizeof(short)) ;
>                       return( xdr_NCvshort(xdrs, (unsigned)0,
>                               (short *)values) ) ;
>               } /* else */
>               return(TRUE) ;
>       case NC_LONG :
>               xdr_NC_fnct = (_xdr_NC_fnct)xdr_long ;
>               szof = sizeof(long) ;
>               break ;
>       case NC_FLOAT :
>       /*
>         HERE IT IS! 
>         This bypasses a lot of the XDR code to improve performance. 
>         Does not work on a vax
>        */
>      #ifndef vax
>         if( xdrs->x_op == XDR_DECODE )
>         {
>             unsigned bytes = count * sizeof(float);
>             /* read everything in a single pass */
>             stat = XDR_GETBYTES(xdrs, values, bytes);
>             #ifndef mc68000
>             if( stat != FALSE )
>             {
>                /* byte-swap every value in place;
>                   assumes sizeof(long) == sizeof(float) == 4 */
>                for( long *lp = (long *)values; count > 0; count--, lp++)
>                   *lp = ntohl(*lp);
>             }
>             #endif
>            return stat;
>          }
>      #endif
>               xdr_NC_fnct = (_xdr_NC_fnct)xdr_float ;
>               szof = sizeof(float) ;
>               break ;
>       case NC_DOUBLE :
>               xdr_NC_fnct = (_xdr_NC_fnct)xdr_double ;
>               szof = sizeof(double) ;
>               break ;
>       default :
>               return(FALSE) ;
>       }
>       for(stat = TRUE ; stat && (count > 0) ; count--)
>       {
>               stat = (*xdr_NC_fnct)(xdrs,values) ;
>               values += szof ;
>       }
>       return(stat) ;
> }
> -BOB-