
Russ Rew: Re: C++/question



Hi Tomas,

I'm sorry to have taken so long to answer your question about the sizes of
netCDF files produced by the C and C++ interfaces.  Investigating the problem
revealed a bug that will be fixed in the next release.

The explanation is that the

    NcVar::add_att( NcToken attname, const char* )

function invoked in netcdf/c++/example.cc, used to define string attributes,
as in

    P->add_att("units", "hectopascals");

stores the string attributes with the trailing "\0" character counted as
part of the attribute value.  Here's the relevant code from the
NcVar::add_att member function in netcdf/c++/netcdf.cc:

    if (ncattput(the_file->id(), the_id, aname, (nc_type) ncChar,
                 strlen(val) + 1, val) == ncBad)

The C version in example.c provides explicit lengths, and doesn't include
the trailing "\0" character, for example:

   ncattput (ncid, P_id, "units", NC_CHAR, 12,
             (void *)"hectopascals");

Hence the C++ version is storing an extra character, the trailing "\0", for
every string attribute.
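
To make the one-byte difference concrete, here is a minimal sketch in C that
does not call the netCDF library at all.  The helper names are hypothetical
(they are not part of netCDF); each one only mirrors the length argument
passed to ncattput() in the corresponding snippet above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers, not part of netCDF: they only reproduce the two
   length computations shown in the snippets above. */

/* Length as computed in NcVar::add_att: counts the trailing '\0'. */
size_t att_len_with_nul(const char *val)
{
    assert(val != NULL);
    return strlen(val) + 1;
}

/* Length as written by hand in example.c: the characters only. */
size_t att_len_without_nul(const char *val)
{
    assert(val != NULL);
    return strlen(val);
}
```

For "hectopascals" the first helper returns 13 and the second returns 12,
which is the explicit length used in the example.c call.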

When ncdump reads and prints a string attribute, it doesn't include any
trailing null byte, since that is assumed to be the end-of-string marker
from C.  Hence ncdump will print exactly the same attribute value for a
four-character attribute value "abc\0" as it will for a three-character
attribute value "abc".
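
The effect can be sketched in C as follows.  This is not ncdump's actual
code; att_to_cstring is a hypothetical helper that assumes the attribute
value has already been read into memory as `len` bytes, and it simply drops
trailing null bytes before producing a printable C string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from ncdump: copy an attribute value of
   `len` bytes into `out` as a C string, dropping any trailing '\0' bytes.
   Returns the number of characters kept. */
size_t att_to_cstring(const char *val, size_t len, char *out, size_t outsize)
{
    assert(val != NULL && out != NULL);
    while (len > 0 && val[len - 1] == '\0')
        len--;                 /* trailing nulls act as end-of-string markers */
    assert(len < outsize);     /* caller must leave room for the terminator */
    memcpy(out, val, len);
    out[len] = '\0';
    return len;
}
```

Called with the four bytes "abc\0" or with the three bytes "abc", the helper
produces the same three-character string, which is why the two files look
identical in ncdump output even though their sizes differ.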

I think the behavior of ncdump is OK in this respect, although it means that
running ncdump and then ncgen on a file containing attributes with trailing
nulls will strip those nulls, so the resulting file will be smaller than the
original.  The NetCDF User's Guide recommends:

    In C, fixed-size strings may be written to a netCDF file without the
    terminating null byte, to save space.  Variable-length strings should be
    written *with* a terminating null byte so that the intended length
    of the string can be determined when it is later read.
  ...
    In FORTRAN, fixed-size strings may be written to a netCDF file without a
    terminating character, to save space.  Variable-length strings
    should follow the C convention of writing strings with a terminating
    null byte so that the intended length of the string can be determined
    when it is later read by either C or FORTRAN programs.

so it does not require the terminating null byte.
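
A reader that wants to cope with both conventions can recover the intended
length along these lines.  This is only a sketch under the assumption that
the stored bytes are already in memory; intended_length is a hypothetical
name, not a netCDF function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: given `len` stored bytes, return the intended string
   length.  If a null byte is present, the string ends there (the
   variable-length convention); otherwise all `len` bytes belong to the
   string (the fixed-size convention). */
size_t intended_length(const char *stored, size_t len)
{
    const char *nul;

    assert(stored != NULL);
    nul = memchr(stored, '\0', len);
    return nul != NULL ? (size_t)(nul - stored) : len;
}
```

With this approach a value written with a terminating null and one written
without it are read back as the same string, so either choice in the writer
is safe for the reader.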

I can fix the inconsistency you have uncovered in either of two ways:

 1.  Change the c++/example.c code so that it includes the trailing null
     byte in the attribute length for all string attributes.

 2.  Change the code for NcVar::add_att( NcToken attname, const char* ) so
     that it doesn't store the trailing null byte.

I prefer the second fix, but in trying it, I just noticed it requires a
rewrite of the NcValues_char::print(ostream&) member function in
ncvalues.cc.  I've added that to my list of things to do before the
alpha-test version of netCDF 2.4 is ready.

Anyway, thanks for being persistent in asking about this problem, even
though I was apparently ignoring it the first time you asked.  You have
uncovered a bug that we will fix.

--Russ

______________________________________________________________________________

Russ Rew                                           UCAR Unidata Program
address@hidden                              http://www.unidata.ucar.edu