
Re: Character strings in NetCDF...



> Date: Thu, 17 Jun 1999 12:28:23 -0400
> From: Patrice Cousineau <address@hidden>
> Subject: Character strings in NetCDF...
> To: address@hidden

Hi Patrice,

> I'm wondering what is the best way to deal with character string data
> in NetCDF (mainly metadata). It seems very clumsy to have to declare a
> string as an array of characters and need to declare a dimension for
> string length. Furthermore, the processing of these arrays becomes
> very tedious and not very useful to other NetCDF tools.

Yes, unfortunately that is the price we pay for supporting a Fortran
interface.  From the section "Reading and Writing Character String
Values" in the netCDF User's Guide:

    Character strings are not a primitive netCDF external data type,
    in part because FORTRAN does not support the abstraction of
    variable-length character strings (the FORTRAN LEN function
    returns the static length of a character string, not its dynamic
    length). As a result, a character string cannot be written or read
    as a single object in the netCDF interface. Instead, a character
    string must be treated as an array of characters, and array access
    must be used to read and write character strings as variable data
    in netCDF datasets. Furthermore, variable-length strings are not
    supported by the netCDF interface except by convention; for
    example, you may treat a zero byte as terminating a character
    string, but you must explicitly specify the length of strings to
    be read from and written to netCDF variables.

For the relatively small strings that occur in metadata, we often just
declare one string length (for example 80) and use that for all the
character strings, wasting some space for shorter strings, and
explicitly terminating variable-length strings with a null byte.
Another approach is to use netCDF attributes instead of variables for
such string data, since then no explicit lengths need to be declared.
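
As a minimal sketch of the fixed-length approach, the helper below copies
a string into an 80-byte field and fills the remainder with null bytes.
The name packFixed and the STRLEN constant are hypothetical, chosen only
for illustration. The padded field would then be written with the usual
character array access call (for example nc_put_vara_text), which is not
shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STRLEN 80  /* assumed fixed string length, as in the example above */

/* Copy src into a fixed-width field of `width` bytes and pad the rest
 * with null bytes. Returns 0 on success, or -1 if src plus its
 * terminating null byte does not fit. (Hypothetical helper.) */
int packFixed(char *field, size_t width, const char *src)
{
    assert(field != NULL && src != NULL);
    size_t n = strlen(src);
    if (n >= width)
        return -1;
    memcpy(field, src, n);
    memset(field + n, 0, width - n);
    return 0;
}
```

Padding with null bytes rather than blanks keeps the stored string
usable as an ordinary C string once it is read back.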

> The only solution I have found is to assign an integer ID to these
> strings (assuming there are a limited number of possibilities) and
> creating a lookup table for them. But then, where and how would I
> store the lookup tables? ...

You can store such a lookup table in a fixed-size netCDF character
variable, dimensioned large enough to hold all your variable-size
character strings:

 dimensions:
   stringsLen = 1000;   // sufficiently large for all metadata strings
   numStrings = 100;    // maximum number of strings in table
 variables:
   char strings(stringsLen);      // strings table
   int stringIndices(numStrings); // where each string starts in table
   ...

You might also store the string lengths if they are likely to ever
shrink.  This is crude and tedious, as you point out, but can be made
a little easier by adding a small interface that reads and writes such
strings and hides the representation.

  int putString(const char *s); // appends string s to table, returns string number
  char *getString(int i);       // gets string number i from table

Of course this only works if you won't be growing any of the strings
later, which would require copying them to the end and
garbage-collecting the gap left.
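
One possible shape for such an interface is sketched below. It works on
an in-memory copy of the two variables and leaves out the netCDF reads
and writes entirely. The sizes mirror the CDL above; the bookkeeping
variables nStrings and nextFree, the const qualifier on the argument,
and the error returns are assumptions added for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STRINGS_LEN 1000  /* matches stringsLen in the CDL above */
#define NUM_STRINGS 100   /* matches numStrings in the CDL above */

static char strings[STRINGS_LEN];        /* in-memory copy of the strings table */
static int  stringIndices[NUM_STRINGS];  /* where each string starts in table */
static int  nStrings = 0;                /* assumed: strings stored so far */
static int  nextFree = 0;                /* assumed: first unused byte in table */

/* Append s, including its terminating null byte, to the table.
 * Returns the string number, or -1 if the table is full. */
int putString(const char *s)
{
    assert(s != NULL);
    size_t need = strlen(s) + 1;
    if (nStrings >= NUM_STRINGS || need > (size_t)(STRINGS_LEN - nextFree))
        return -1;
    memcpy(strings + nextFree, s, need);
    stringIndices[nStrings] = nextFree;
    nextFree += (int)need;
    return nStrings++;
}

/* Return a pointer to string number i inside the table, or NULL if i
 * is out of range. */
char *getString(int i)
{
    if (i < 0 || i >= nStrings)
        return NULL;
    return strings + stringIndices[i];
}
```

Because strings are only ever appended, the table never needs
compacting, which is the restriction described above.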

>                      ...  And what to do with an infinite list of
> strings???

If you can't anticipate a maximum for how many strings will be needed
for the metadata or what maximum aggregate storage will be required,
you have a problem for which netCDF may not be appropriate.  An
unlimited list of fixed-size strings can be handled with the unlimited
dimension, but an unlimited list of variable-size strings does not fit
well with the netCDF data model, which is designed to support fast
direct access to array-oriented data, where the seek offset of the
data from the beginning of the file can be computed in a fixed amount
of time.
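
The difference in access cost can be illustrated with a small sketch.
The function names and the begin offset are hypothetical, and this is a
simplified picture rather than the actual netCDF file layout: with
fixed-size strings the position of string i is a single multiplication,
while with packed null-terminated strings every earlier string has to be
scanned.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of fixed-size string number i: constant time, no scanning.
 * `begin` is the assumed offset of the first string. */
size_t fixedOffset(size_t begin, size_t strLen, size_t i)
{
    return begin + i * strLen;
}

/* Byte offset of null-terminated string number i in a packed buffer of
 * bufLen bytes: all earlier strings must be scanned. Returns bufLen if
 * the buffer holds fewer than i + 1 strings. */
size_t packedOffset(const char *buf, size_t bufLen, size_t i)
{
    assert(buf != NULL);
    size_t pos = 0;
    while (i > 0 && pos < bufLen) {
        if (buf[pos] == '\0')
            i--;        /* passed the end of one more string */
        pos++;
    }
    return i == 0 ? pos : bufLen;
}
```

An index variable such as stringIndices above restores constant-time
lookup for packed strings, but only when the number of strings and the
total storage have a known maximum.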

--Russ

_____________________________________________________________________

Russ Rew                                         UCAR Unidata Program
address@hidden                     http://www.unidata.ucar.edu