Re: Bug in ncdump or ncgen ?

Hi,

Ata (ATAE@xxxxxxxxxxxxxxxxxxxxxxxxxxx) writes:

> I generated the cdl file appended at the end of this message using ncdump 
> (let's call it sample.cdl). When I try ncgen sample.cdl I get:
> 
>       sample.cdl line 7: syntax error
> 
> It appears it's because it doesn't like spaces in names since I can fix the
> cdl file by changing:
> 
>       Universal Time to Universal_Time

Yes, this is an opportunity to point out that it is possible to create
netCDF files with netCDF library calls that ncdump and ncgen cannot handle
correctly, and to explain why.  First, here is what the netCDF User's Guide
says about CDL names:

    CDL names for variables, attributes, and dimensions may be any
    combination of alphabetic or numeric characters as well as `_' and `-'
    characters, but names beginning with `_' are reserved for use by the
    library.  Case is significant in CDL names.  The netCDF library does not
    enforce any restrictions on netCDF names, so it is possible (though
    unwise) to define variables with names that are not valid CDL names.
    The names for the primitive data types are reserved words in CDL, so the
    names of variables, dimensions, and attributes must not be type names.

Since the netCDF library puts no restrictions on names (except that they
must be shorter than MAX_NC_NAME characters) you can even create netCDF
files that use names containing punctuation, control characters, and
non-ASCII bytes.  The CDL data description language, however, requires more
restrictive names to make it possible to parse CDL statements.  As an example
of the potential parsing difficulties, if you named a variable `p(time)',
then it would be ambiguous whether the following was a CDL declaration of
the scalar variable `p(time)' or a 1-dimensional variable `p' that used the
`time' dimension:

    float p(time) ;

Similarly, names that begin with digits are parsed in CDL as numeric
constants.

A perverse programmer could use newlines and semicolons in netCDF variable
names to create a netCDF file that, when dumped with ncdump, would look like
CDL statements that had nothing to do with the contents of the file.

To get around such possibilities, we could add to the library a check when
defining a name that the name conforms to the same regular expression for
names used in CDL parsing (in ncgen/ncgen.l),

    [A-Za-z_][A-Za-z_0-9-]*

but someone may want to write a new data description language for netCDF
someday that permits a larger subset of names, or there may be users who
don't use ncdump or ncgen and are already using more general names, e.g.
with `.' in them.  Thus adding a new restriction on names at the library
level might break existing applications.
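
For anyone who wants to apply such a check in their own code before
defining names, here is a minimal sketch in Python.  It only tests a name
against the regular expression quoted above; the function name
is_valid_cdl_name is made up for this illustration and is not part of the
netCDF library, and the sketch does not check for the reserved type names
or for the MAX_NC_NAME length limit.

```python
import re

# The name pattern quoted above from ncgen/ncgen.l.
CDL_NAME = re.compile(r"[A-Za-z_][A-Za-z_0-9-]*")


def is_valid_cdl_name(name):
    """Return True if the whole of `name` matches the CDL name pattern.

    Hypothetical helper for illustration only. fullmatch() is used so that
    a trailing newline or any other stray character makes the name invalid.
    """
    return CDL_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["Universal Time", "Universal_Time", "00_name", "p(time)"]:
        print(repr(candidate), is_valid_cdl_name(candidate))
```

Doing the check in the application rather than in the library leaves
existing programs that use more general names unaffected.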

>       Universal Time = UNLIMITED ; // (4 currently)

The specific problem with this dimension name in CDL is that it contains a
blank.  You could use `Universal-Time' or `Universal_Time'.

>               instrument:00_name = "00_Flux Gate Magnetometer" ;
>               instrument:01_acronym = "01_FGM" ;

The specific problem with these attribute names in CDL is that they start
with a digit rather than an alphabetic character.
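
If the names are generated by a program, one way to avoid both problems is
to rewrite each name before defining it.  The following Python sketch shows
one possible mapping; the function to_cdl_name and the choice of `n' as a
prefix are assumptions made for this example, not anything ncdump or ncgen
does.  Note that distinct original names can map to the same result, so a
real program would also need to check for collisions.

```python
import re


def to_cdl_name(name):
    """Map an arbitrary name onto the pattern [A-Za-z_][A-Za-z_0-9-]*.

    Hypothetical helper for illustration only.
    """
    # Replace every character outside the allowed set (blanks, punctuation,
    # control characters, non-ASCII characters) with an underscore.
    cleaned = re.sub(r"[^A-Za-z_0-9-]", "_", name)
    # A name must start with a letter or underscore.  Prefix a letter rather
    # than `_', since names beginning with `_' are reserved for the library.
    if not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "n" + cleaned
    return cleaned


if __name__ == "__main__":
    for original in ["Universal Time", "00_name", "01_acronym"]:
        print(repr(original), "->", repr(to_cdl_name(original)))
```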

________________________________________________________________________
Russ Rew                                        Unidata Program Center
russ@xxxxxxxxxxxxxxxx                           UCAR, PO Box 3000
                                                Boulder, CO 80307-3000