
Re: [Fwd: Documentation for backslash encoding of String metadata]



I think we are talking about different things.
I am pretty sure that the things I described were a direct result of changing from netCDF-java 2.2.22 to netCDF-java 4.0. I ran my unit tests with 2.2.22, changed to 4.0, re-ran the tests, and saw the difference that I reported.

I don't see how what I saw could have been something that changed in 1993. Perhaps you are talking about things in the C-based code and I am talking about the Java-based code and the NCdump Java class.

Thanks for looking at it though.

Russ Rew wrote:
Bob,

I'm working on switching to netCDF-java 4.0. Sorry for the delay.

With netCDF-java 4.0, there seems to have been a change in the way
special characters are encoded in attribute values and when they appear
in ncdump.

I couldn't find a mention of this at
http://www.unidata.ucar.edu/software/netcdf-java/v4.0/CHANGES
or documentation at
http://www.unidata.ucar.edu/software/netcdf-java/v4.0/javadoc/index.html
or
http://www.unidata.ucar.edu/software/netcdf/docs/netcdf.html

Are the details documented somewhere?
As far as I know, we haven't changed the way special characters are
encoded in attribute values in the C-based interfaces.  I may have
overlooked something, but could you give me an example of an attribute
value that's displayed differently by ncdump in netCDF-4 than it was by
a previous version of ncdump?
Yes, if I use NCdump.print in netcdfJava 2.2.22 (and previous), characters like ' " and newline exist as single characters. (Actually, the " was troublesome because the entire String attribute was displayed with " at the beginning and end -- it should have been encoded.)

If I use NCdump.print in netcdfJava 4.0, characters like ' and newline (and probably ", but I haven't tested it yet) exist as two characters: backslash plus the character (or n for newline), as in a Java or JSON-encoded String. The ' looks odd because possessive words now have an internal backslash, e.g., Bob\'s. The newline looks odd because it takes away the visual formatting that occurs in the attribute (e.g., a history attribute with a separate line for each processing step).
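
For readers who want to see the two-character encoding concretely, here is a minimal Python sketch of the behavior described above. It is not the NCdump implementation: the function names escape_attribute and unescape_attribute are hypothetical, and the choice to also escape the backslash itself is an assumption made so that the encoding can be reversed.

```python
# Hypothetical helpers illustrating backslash encoding of attribute text.
# Assumption: backslash itself is escaped too, so decoding is unambiguous.
_ESCAPES = {"\\": "\\\\", "'": "\\'", '"': '\\"', "\n": "\\n"}
_UNESCAPES = {"\\": "\\", "'": "'", '"': '"', "n": "\n"}


def escape_attribute(text: str) -> str:
    """Replace each special character with backslash plus a marker character."""
    return "".join(_ESCAPES.get(ch, ch) for ch in text)


def unescape_attribute(text: str) -> str:
    """Reverse escape_attribute; unknown escape pairs are left untouched."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            out.append(_UNESCAPES.get(nxt, "\\" + nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


history = "Bob's first step\nsecond step"
encoded = escape_attribute(history)
print(encoded)                      # one line, with \' and \n as two characters each
print(unescape_attribute(encoded))  # the original two-line text
```

Decoding the dumped text this way would restore the line-per-step layout of a history attribute before displaying it.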

You're right that escaping the apostrophe doesn't seem necessary in CDL
for attribute string values, but I just verified that that particular
escape has been generated by ncdump since at least version 2.3.2, first
released in 1993, and perhaps versions previous to that.  It should have
been documented in the User's Guide section on CDL Notation for Data
Constants:

  http://www.unidata.ucar.edu/netcdf/docs/netcdf.html#CDL-Constants

but I see that it's not mentioned there.  Although it might be
considered a bug, I think you're the first to point it out.  The
original reason for the escape was probably for single character
constants, which use the CDL notation

  ownership = 'B', 'o', 'b', '\'', 's';

and that was carried over to the string notation unnecessarily.  The
ncgen utility parses it correctly, so it's only the CDL representation
that's wrong.  At this point, I'm reluctant to change it, because it
would also require changes in ncgen, several of the tests, and any other
user or commercial software that depends on CDL.

--Russ


Sincerely,

Bob Simons
IT Specialist
Environmental Research Division
NOAA Southwest Fisheries Science Center
1352 Lighthouse Ave
Pacific Grove, CA 93950-2079
Temporary phone number (831)648-0623 (don't leave a message if I'm not in)
[someday, I will again use my permanent phone number (831)658-3205]
address@hidden

The contents of this message are mine personally and
do not necessarily reflect any position of the
Government or the National Oceanic and Atmospheric
Administration.
<>< <>< <>< <>< <>< <>< <>< <>< <><