Re: Netcdf



>Organization: PNNL
>Keywords: 199606120126.AA17210

Brian,

> For mwr.txt NCGEN returns the error:
> 
>       NCGEN.EXE: mwr.txt line 32: parse error

It turns out that this error comes from the "lex" utility used to generate
ncgenyy.c from ncgen.l.  The "lex" program builds parsers that can't handle
lines longer than 1024 characters, a limit that your "status_flag" variable
exceeded.  The limit apparently varies among "lex" implementations on
different platforms, and I had been unaware of it.  If you split the CDL
data statement:

   status_flag = "000...000" ;  // 1440 zeros

into something like the following instead:

   status_flag = "000...000",   // 1000 zeros
                 "000...000" ;  //  440 zeros

then the lex limit will not be encountered.  Since I can't change "lex", it
looks like fixing this will require making "ncdump" output substrings that
don't exceed the built-in lex limit.  But note that splitting the string up
to work around this problem reveals another bug with ncgen, described below
and demonstrated by your other example ...
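
For splitting a long value mechanically, here is a minimal Python sketch of
the idea.  It is not part of ncdump or ncgen; the helper names
split_cdl_string and format_data_statement are hypothetical, and the default
chunk size of 1000 is just a convenient value below the 1024-character limit
seen here.  It assumes the value contains no characters that need escaping
in CDL (no double quotes or backslashes).

```python
def split_cdl_string(value, limit=1000):
    """Return quoted CDL substrings, each holding at most `limit` characters.

    Hypothetical helper: assumes `value` needs no CDL escaping.
    """
    if limit < 1:
        raise ValueError("limit must be positive")
    chunks = [value[i:i + limit] for i in range(0, len(value), limit)]
    if not chunks:
        chunks = [""]  # keep an empty string as a single empty literal
    return ['"%s"' % chunk for chunk in chunks]


def format_data_statement(name, value, limit=1000):
    """Build a CDL data statement with one substring per line."""
    pieces = split_cdl_string(value, limit)
    # Align continuation lines under the first substring.
    indent = " " * (len(name) + 3)
    return name + " = " + (",\n" + indent).join(pieces) + " ;"


if __name__ == "__main__":
    # 1440 zeros become one 1000-character piece and one 440-character piece.
    print(format_data_statement("status_flag", "0" * 1440))
```

The variable name and indentation count toward the length of each line, so
with a long variable name a smaller limit leaves more headroom.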

> For vceil25.txt NCGEN returns the error:
> 
>       NCGEN.EXE: vceil25.txt line 341: string won't fit in this variable, 0>1

Thanks for making the text CDL files and binary netCDF files available to
demonstrate this ncgen bug.  From diagnosing what is going on with your 2
Mbyte file, I determined that the following tiny CDL file will also
demonstrate the bug:

    netcdf bug {
    dimensions:
            time = UNLIMITED ;
    variables:
            char var(time) ;
    data:
            var = "0123";
    }

The problem is with character variables that use only the record dimension,
such as the variable "var" above, or again your variable "status_flag"
which is dimensioned only by time.  This is an unusual variable
declaration, because the "char" type is intended for character strings (the
"byte" type is available for arrays of 8-bit numeric values).  The above
works fine with ncgen when the character string is separated into
individual characters, as in:

    netcdf bug {
    dimensions:
            time = UNLIMITED ;
    variables:
            char var(time) ;
    data:
            var = "0", "1", "2", "3";
    }

which is also a work-around for the bug in correctly reading the values of
your "status_flag" variable:

 status_flag = "0","0","0", ... ,"0" ; // 1440 "0"s, can be on multiple lines
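
Rewriting 1440 values by hand is tedious, so here is a minimal Python sketch
that produces the one-character-per-record form from a single long string.
The helper name per_record_chars and the choice of 16 values per line are
hypothetical, not anything ncdump does; it assumes the characters need no
escaping in CDL.

```python
def per_record_chars(name, value, per_line=16):
    """Build a CDL data statement with one quoted character per record.

    Hypothetical helper: assumes no character in `value` needs CDL escaping.
    """
    if not value:
        raise ValueError("value must contain at least one character")
    if per_line < 1:
        raise ValueError("per_line must be positive")
    quoted = ['"%s"' % ch for ch in value]
    # Group the quoted characters so that every output line stays short.
    rows = [", ".join(quoted[i:i + per_line])
            for i in range(0, len(quoted), per_line)]
    indent = " " * (len(name) + 3)
    return name + " = " + (",\n" + indent).join(rows) + " ;"


if __name__ == "__main__":
    print(per_record_chars("var", "0123"))
    print(per_record_chars("status_flag", "0" * 1440))
```

Because each output line holds only a few values, this form also stays well
under the 1024-character lex limit described earlier.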

The bug here is really in ncdump, since it should output character strings
that vary only along the record dimension in the above form, rather than as
long character strings.  Even so, I will be developing a fix for both the
way ncdump outputs such variables and the way ncgen parses the old ncdump
output.

In the meantime, the ways to avoid this bug are to use NC_BYTE instead of
NC_CHAR types for 8-bit numeric values such as "status_flag" that vary only
along the record dimension, or to separate the output of ncdump for such
variables into individual characters for each record.

Thanks for being persistent about reporting this bug.

--Russ

______________________________________________________________________________

Russ Rew                                           UCAR Unidata Program
address@hidden                              http://www.unidata.ucar.edu