
Re: ncgen & ncdump



Harvey,

Sorry if you got this twice, but Steve tells me he didn't get it at all, and
I got some sort of cryptic message from sendmail when I first sent it on
Saturday ... 

> Using following cdl file with ncgen:
> 
> netcdf f {
> variables:
>     byte bF;
>         bF:_FillValue = '\xff';
>     byte b;
>     short s;
>     int i;
>     float f;
>     double d;
> }
> 
> gives file f.nc, which ncdump dumps as:
> 
> netcdf f {
> variables:
>         byte bF ;
>                 bF:_FillValue = '\377' ;
>         byte b ;
>         short s ;
>         long i ;
>         float f ;
>         double d ;
> data:
>  bF = _ ;
>  b = 129 ;
>  s = _ ;
>  i = _ ;
>  f = _ ;
>  d = _ ;
> }
> 
> 1. 'b' (byte using default _FillValue) should print as '_' like the others.

No, this is the intended and documented behavior:

    If you need a fill value for a byte variable, it is recommended that you
    explicitly define an appropriate @code{_FillValue} attribute, as generic
    utilities such as @code{ncdump} will not assume a default fill value for
    byte variables.

Since `b' is of type byte and does not have an explicitly specified
_FillValue, it is assumed that all its 256 values are significant.  Assuming
a default fill value for displaying the data of a byte variable when none is
specified would be wrong in the common case where the user needs to use all
256 values and would not want any of them displayed as `_'.  Only the byte
data type is treated this way because it has so few values.
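
As a rough illustration of that display rule (a Python sketch, not ncdump's
actual source), the helper below prints `_' only when a value matches the
fill value in effect. The name display_value, its parameters, and the
default fill of -32767 used for the short example are assumptions made for
this sketch.

```python
from typing import Optional, Union

Number = Union[int, float]


def display_value(value: Number, var_type: str,
                  explicit_fill: Optional[Number] = None,
                  default_fill: Optional[Number] = None) -> str:
    """Return '_' for a fill value, otherwise the value as text.

    Hypothetical helper: for byte variables only an explicit _FillValue
    counts, so all 256 byte values are shown unless the user says
    otherwise. Other types fall back to the default fill passed in.
    """
    if explicit_fill is not None:
        fill = explicit_fill
    elif var_type == "byte":
        fill = None          # no default fill is assumed for bytes
    else:
        fill = default_fill
    return "_" if fill is not None and value == fill else str(value)


# 'b' has no _FillValue attribute, so 129 is printed as data.
print(display_value(129, "byte", default_fill=129))          # 129
# 'bF' declares _FillValue explicitly, so its fill prints as '_'.
print(display_value(255, "byte", explicit_fill=255))         # _
# A short holding the (assumed) default fill prints as '_'.
print(display_value(-32767, "short", default_fill=-32767))   # _
```

The byte branch deliberately ignores default_fill, which is why `b' shows
129 in the dump above while the other variables show `_'.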

> 2. It seems inconsistent that _FillValue cannot be specified as decimal
>    number (e.g. 129) & is printed in octal, but byte data is printed as
>    decimal number (e.g. b = 129  above).  
>    I suggest:
>    (a) ncgen should allow _FillValue to be specified as decimal number

CDL has no syntax for declaring the types of attributes explicitly.  The
type of an attribute is inferred from the type of value given to the
attribute.  ncgen currently *does* allow you to specify _FillValue for a
byte variable as a decimal number, but in that case the type of the
attribute will be either short or nclong, depending on how the constant is
written:

     byte bF;
         bF:_FillValue  = '\xff'; // a byte attribute
         bF:satt        = -127S;  // a short attribute
         bF:latt        = -127;   // an nclong attribute
         bF:Latt        = -127L;  // also an nclong attribute

See the User's Guide section on "CDL Notation for Data Constants" for the
syntax details, which allow using "s" or "S" suffixes for short constants
and "l" or "L" suffixes for integer (nclong) constants.  This syntax is
never needed for variable data, because the type of variables is explicitly
declared in CDL, but the type of attributes is not.
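
The inference rule can be sketched in a few lines of Python. This is only
an illustration, not ncgen's parser: attribute_type is a hypothetical name,
and the sketch covers just character constants and decimal integers,
leaving out floating-point, hexadecimal, and octal forms.

```python
import re

# Decimal integer with an optional s/S (short) or l/L (nclong) suffix.
_INT = re.compile(r"^[+-]?\d+([sSlL]?)$")


def attribute_type(constant: str) -> str:
    """Guess the attribute type inferred from a CDL constant (sketch)."""
    text = constant.strip()
    if len(text) >= 3 and text.startswith("'") and text.endswith("'"):
        return "byte"            # character constant such as '\xff'
    match = _INT.match(text)
    if match is None:
        raise ValueError(f"unsupported constant: {constant!r}")
    suffix = match.group(1)
    # No suffix, or an l/L suffix, means an nclong constant.
    return "short" if suffix in ("s", "S") else "nclong"


for constant in (r"'\xff'", "-127S", "-127", "-127L"):
    print(constant, "->", attribute_type(constant))
```

Run on the four constants from the example above, it reports byte, short,
nclong, and nclong respectively.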

>    (b) ncdump should print all (attribute or var) byte values as decimal 
> numbers

If it did this, it would change the type of the attributes from byte to
nclong, so ncgen would no longer be the inverse of ncdump.  Maybe what we
need is a more transparent syntax for byte constants, say with a "b" or "B"
appended.  This might be a little confusing, because CDL supports a notation
for hexadecimal constants that uses "b" or "B" for a hex digit, but there is
really no ambiguity, since hex constants begin with "0x" or "0X".
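
To show why the two uses of "b" would not collide, here is a Python sketch
of how such a suffix could be recognized. The "b" suffix is only a
suggestion at this point and is not something ncgen accepts;
parse_integer_constant is a hypothetical name, and reporting hex constants
as nclong is an assumption of the sketch.

```python
def parse_integer_constant(token: str):
    """Return (type, value) under the *proposed* 'b' suffix rule.

    Hypothetical: a token starting with 0x/0X is hexadecimal, so a
    trailing 'b' there is a hex digit; on a decimal token a trailing
    'b' or 'B' marks a byte constant.
    """
    text = token.strip()
    sign = -1 if text.startswith("-") else 1
    body = text.lstrip("+-")
    if body[:2] in ("0x", "0X"):
        return "nclong", sign * int(body[2:], 16)
    if body[-1:] in ("b", "B"):
        return "byte", sign * int(body[:-1])
    return "nclong", sign * int(body)


print(parse_integer_constant("0x1b"))   # ('nclong', 27)
print(parse_integer_constant("27b"))    # ('byte', 27)
print(parse_integer_constant("-127B"))  # ('byte', -127)
```

Checking for the "0x" prefix first is what settles the question: "0x1b" is
always the hex number 27, never the byte constant 0x1.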

>    (c) I suggest these should be signed. I hope the issue of whether bytes
>        should be signed or unsigned will be clarified by fact that my packed
>        data proposal includes simple unscaled unsigned n-bit integers (which
>        includes unsigned 8-bit).

Currently ncdump presents values for byte variables as signed integers
on platforms where C chars are signed (Sun, Digital, AIX, HP, ...), and as
unsigned integers on platforms where C chars are unsigned (SGI, Cray, ...).
The ncgen program always interprets these two representations for the same
bit pattern correctly, so the values are preserved, no matter how they are
displayed.  
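
A small Python sketch of the arithmetic involved, assuming 8-bit
two's-complement bytes (the function names are made up for illustration
and are not taken from ncdump or ncgen):

```python
def as_signed(bits: int) -> int:
    """Read an 8-bit pattern the way a signed C char would."""
    return bits - 256 if bits >= 128 else bits


def as_unsigned(bits: int) -> int:
    """Read an 8-bit pattern the way an unsigned C char would."""
    return bits & 0xFF


def to_bits(value: int) -> int:
    """Map either displayed form (-128..255) back to the stored 8 bits."""
    if not -128 <= value <= 255:
        raise ValueError(f"{value} does not fit in a byte")
    return value & 0xFF


bits = 0x81
print(as_signed(bits), as_unsigned(bits))      # -127 129
print(to_bits(-127) == to_bits(129) == bits)   # True
```

Both displayed forms, -127 and 129, map back to the same stored pattern
0x81, which is why the value survives a round trip on either kind of
platform.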

Changing ncdump to always display byte values as signed would be a change
for platforms such as SGIs and Crays.  The current behavior derives from the
original interpretation of the netCDF "byte" type as corresponding to the C
"char" type, where the integer range depended on the platform.  We now know
that that was a mistake.  I think it should be corrected in netCDF 3.0, but
I don't think it should be changed as part of a minor release from netCDF
2.4 to netCDF 2.4.1.

--Russ