
Re: 970122: comments on netCDF v3



>To: address@hidden
>From: Robert Fischer <address@hidden>
>Subject: comments on netCDF v3
>Keywords: 199701221547.IAA02331

Hi Bob,

> i use netCDF version 2.4, and am interested in the continued development
> of netCDF.  i have some comments about potential future versions:

Thanks for your thoughtful suggestions.  We will carefully consider your
comments along with others we have received.

>  1) you've eliminated the use of generic void pointers in the ncdf v3. 
> although this is good from a "user's" point of view, it renders impossible
> some things i'm doing now.  i have a generic multi-dimensional array class
> in C++, which i have interfaced with netCDF.  it's very convenient, since
> variables can be simply and easily "copied" (general array section copy)
> from file to memory, and back again.  along with some other routines which
> allow me to "initialize"  memory arrays the same size as netCDF arrays and
> vice versa, as well as to define netCDF arrays without having to tediously
> define every dimension, etc, i have quite a convenient interface. 
> 
> needless to say, eliminating void pointers would put quite a damper on my
> ability to implement my array class.  please retain a void* interface.

Although the generic void* pointers are eliminated from the public
documented interface, they are still available in the library.  The
following declarations excerpted from src/libsrc/nc.h are the generic
interfaces for nc_{put,get}_att, and nc_{put,get}_var{1,a,s,m,}:

 /*
  * These functions are used to support
  * interface version 2 backward compatibility.
  * N.B. these are tested in ../nc_test even though they are
  * not public. So, be careful to change the declarations in
  * ../nc_test/tests.h if you change these.
  */

 extern int
 nc_put_att(int ncid, int varid, const char *name, nc_type datatype,
         size_t len, const void *value);

 extern int
 nc_get_att(int ncid, int varid, const char *name, void *value);

 extern int
 nc_put_var1(int ncid, int varid, const size_t *index, const void *value);

 extern int
 nc_get_var1(int ncid, int varid, const size_t *index, void *value);

 extern int
 nc_put_vara(int ncid, int varid,
          const size_t *start, const size_t *count, const void *value);

 extern int
 nc_get_vara(int ncid, int varid,
          const size_t *start, const size_t *count, void *value);

 extern int
 nc_put_vars(int ncid, int varid,
          const size_t *start, const size_t *count, const ptrdiff_t *stride,
          const void * value);

 extern int
 nc_get_vars(int ncid, int varid,
          const size_t *start, const size_t *count, const ptrdiff_t *stride,
          void * value);

 extern int
 nc_put_varm(int ncid, int varid,
          const size_t *start, const size_t *count, const ptrdiff_t *stride,
          const ptrdiff_t * map, const void *value);

 extern int
 nc_get_varm(int ncid, int varid,
          const size_t *start, const size_t *count, const ptrdiff_t *stride,
          const ptrdiff_t * map, void *value);

While this may solve the immediate problem of permitting you to
implement your array classes over netCDF-3, the problem of what to do in
netCDF-4 and subsequent versions remains.  In netCDF-4, we intend to
support "packed" external data types such as arrays of 11-bit values.
Such an external type has no natural internal type associated with it.
It is always converted to or from a specified unpacked internal type when
it is accessed.  A user who accesses the values as doubles need not know
or care that the values are actually represented externally as packed
11-bit values, but can discover the variable's external type by calling
an nc_inq_var function.  The void* generic interfaces won't make much
sense for such packed data types, since there is no natural in-memory
type for the pointer to refer to.
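
To make the packed case concrete, here is a rough sketch of what
converting between packed 11-bit external values and an unpacked
internal type involves.  This is not netCDF code: the names
packed_size, put11, get11 and unpack_to_double, and the
most-significant-bit-first layout, are assumptions made up for this
illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PACK_BITS = 11 };   /* width of one external value (assumed) */

/* Number of bytes needed to hold n packed values. */
size_t packed_size(size_t n)
{
    return (n * PACK_BITS + 7) / 8;
}

/* Store v (0..2047) as the i-th 11-bit field, most significant bit first.
 * buf must be zero-filled before the first call. */
void put11(uint8_t *buf, size_t i, unsigned v)
{
    assert(v < (1u << PACK_BITS));
    size_t bit = i * PACK_BITS;
    for (int b = PACK_BITS - 1; b >= 0; b--, bit++) {
        if ((v >> b) & 1u)
            buf[bit / 8] |= (uint8_t)(0x80u >> (bit % 8));
    }
}

/* Fetch the i-th 11-bit field. */
unsigned get11(const uint8_t *buf, size_t i)
{
    size_t bit = i * PACK_BITS;
    unsigned v = 0;
    for (int b = 0; b < PACK_BITS; b++, bit++)
        v = (v << 1) | ((buf[bit / 8] >> (7 - bit % 8)) & 1u);
    return v;
}

/* Unpack n values into doubles, i.e. the internal type the caller chose. */
void unpack_to_double(const uint8_t *buf, size_t n, double *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (double)get11(buf, i);
}
```

Note that a value straddles byte boundaries, so a caller holding only a
void* to the packed bytes has no element it could address directly; some
unpacked type has to be named on every access.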

>  2) i would find an ability to remove variables and dimensions from the
> file to be very useful.  i realize that removing dimensions is not always
> possible, if the dimensions are being used; an error in that case would be
> fine.  removing variables should always be possible (i realize it would
> require rewriting the entire file; that's OK).

Permitting removal of variables and dimensions causes a problem in the
C and FORTRAN interfaces: after a variable with variable ID 1 is
removed, are the variable IDs of subsequent variables decremented?  If
so, some programs that assume variable IDs never change break; if not,
programs that assume all variables have consecutive variable IDs
(e.g. ncdump) break.  In C++ and Java, this is not a problem, since
variable IDs are not needed as object identifier surrogates, but I don't
see a way around this in C and FORTRAN ...
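
The dilemma can be sketched with a toy variable table.  This is only an
illustration, not how the library stores variables: struct var_table,
remove_renumber and remove_keep_ids are hypothetical names invented
here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_VARS 8

struct var_table {
    const char *name[MAX_VARS];  /* NULL marks a removed slot (policy 2) */
    int nvars;                   /* slots in use, including any holes */
};

/* Policy 1: compact the table.  IDs above the removed one shift down
 * by one, so any variable ID a program saved earlier may now be stale. */
void remove_renumber(struct var_table *t, int varid)
{
    assert(varid >= 0 && varid < t->nvars);
    memmove(&t->name[varid], &t->name[varid + 1],
            (size_t)(t->nvars - varid - 1) * sizeof t->name[0]);
    t->nvars--;
}

/* Policy 2: leave a hole.  Saved IDs stay valid, but a loop over
 * 0..nvars-1 can no longer assume every ID names a variable. */
void remove_keep_ids(struct var_table *t, int varid)
{
    assert(varid >= 0 && varid < t->nvars);
    t->name[varid] = NULL;
}
```

With variables {lat, lon, temp} and lon (ID 1) removed, policy 1 leaves
temp at ID 1, so a saved ID of 2 is out of range; policy 2 keeps temp at
ID 2 but leaves ID 1 naming nothing.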

>  3) an ability to read & write arrays of arbitrary data types (you just
> give it the size of the type).  i realize that the XDR stuff could not be
> used for such an operation, since netCDF would not know how to interpret
> the bits involved.  however, i think decisions about the non-portability
> inherent in such a use should be left to the application programmer.  as a
> fix for this problem, you could let the programmer supply a subroutine,
> presumably based on XDR, which converts the data type correctly between

You can currently read and write such arrays by just treating them as a
plain array of bytes, in a variable of type NC_BYTE.  Any
application-specific conversion for such types may be implemented in a
layer above netCDF.  I'm not sure what advantage would be gained by
putting this in the netCDF library, since the library would have to
indirectly invoke a user-supplied function or method on every value on
each access, and the user would have to define the type by registering
a function or method pointer as well as a size.  Maybe it would be more
convenient?
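
As a minimal sketch of such a layer above netCDF, an application could
encode each value into a fixed number of bytes before writing them to an
NC_BYTE variable, and decode after reading.  Everything named here
(struct sample, SAMPLE_BYTES, sample_encode, sample_decode) is a
hypothetical application type and not part of netCDF; big-endian byte
order is simply the choice assumed for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical application type; not part of netCDF. */
struct sample {
    uint32_t id;
    int16_t  level;
};

enum { SAMPLE_BYTES = 6 };   /* external size: 4 + 2 bytes, no padding */

/* Encode one record as big-endian bytes, independent of host byte order
 * and of any padding the compiler adds to struct sample. */
void sample_encode(const struct sample *s, unsigned char *out)
{
    assert(s != NULL && out != NULL);
    uint16_t lv = (uint16_t)s->level;
    out[0] = (unsigned char)(s->id >> 24);
    out[1] = (unsigned char)(s->id >> 16);
    out[2] = (unsigned char)(s->id >> 8);
    out[3] = (unsigned char)(s->id);
    out[4] = (unsigned char)(lv >> 8);
    out[5] = (unsigned char)(lv);
}

/* Decode one record from the same 6-byte external form. */
void sample_decode(const unsigned char *in, struct sample *s)
{
    assert(in != NULL && s != NULL);
    uint16_t lv = (uint16_t)(((unsigned)in[4] << 8) | in[5]);
    s->id = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
          | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    /* Restore the sign without relying on implementation-defined casts. */
    s->level = (lv & 0x8000u) ? (int16_t)((int)lv - 65536) : (int16_t)lv;
}
```

An array of n such records would then occupy n * SAMPLE_BYTES elements
of the byte variable, with the record size carried in a dimension or an
attribute of the application's choosing.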

As far as I know, yours is the first request we've had for such a
capability.  It might be a way to implement the packed data types in an
extensible way ...

>  4) allow hierarchical variables.  i have some data structures which are
> stored as a few netCDF arrays.  for example, a compressed row-format (cmr) 
> matrix consists of three arrays, called pntre, indx, and val.  if i want
> to write a cmr matrix called "damp" to a netCDF file, I have to do it
> using three seemingly separate variables, as follows: 
> 
> netcdf t6-mingrad {
> dimensions:
>         damp-nrow = 362 ;
>         damp-ncol = 362 ;
>         damp-nel = 72522 ;
> variables:
>         long damp-pntre(damp-nrow) ;
>         long damp-indx(damp-nel) ;
>         double damp-val(damp-nel) ;
>  
> // global attributes:
>                 :damp-type = "mingrad" ;
>                 :tessa-file = "t6.cdf" ;
>                 :remark = "2-D Damping matrix for spherical spline
> tesselation, degree 6" ;
> }
> 
> i would like to have netCDF take care of this hierarchical stuff for me,
> and be able to write my damping matrix as one sub-file, like:
> 
> netcdf t6-mingrad {
> dimensions:
>         nrow = 362 ;
>         ncol = 362 ;
>         nel = 72522 ;
> variables:
>       netcdf damp {
>       variables:
>               long pntre(nrow) ;
>               long indx(nel) ;
>                       :remark = "attribute on damp.indx"
>               double val(nel) ;
> 
>               // global attributes for damp
>                 :type = "mingrad" ;
>                 :remark = "2-D Damping matrix for spherical spline
>       };
>       int funny(nrow);
>  
>       // global attributes:
>         :tessa-file = "t6.cdf" ;
> }
> 
> standard lexical scoping naming rules could be used to determine what a
> name means at a certain position in the CDL file.

This is a very interesting suggestion, and will require more study ...

Thanks again for your comments.

--Russ

_____________________________________________________________________

Russ Rew                                         UCAR Unidata Program
address@hidden                     http://www.unidata.ucar.edu