
Re: 19980902: NetCDF Java package



>From: John Begenisich <address@hidden>
>Subject: NetCDF Java package
>Organization: Northwest Research Associates, Inc.
>Keywords: 199809022136.PAA01918 netCDF Java attributes
>
> Hello,
>
> I have just began working with the NetCDF Java API, and I'd like to say
> that it appears to be an excellent package.  There is one little thing,
> however, that I would like to have changed, and that is the fact that
> the data in an Attribute cannot be changed after construction.
>
> I have a definite need to do this, and although it does not seem
> Java-like to be able to do this, it is not against NetCDF standards to
> do so.
>
> I have just subscribed to the NetCDF-Java mailing list, and I hope to
> become active in testing (and perhaps contributing to) the NetCDF Java
> package.
>
> Sincerely,
> John Begenisich
>
>
> --
> John Begenisich
> Northwest Research Associates, Inc.
> (425) 644-9660 x328
> 14508 NE 20th St.  Bellevue, WA 98007
> http://www.nwra.com

The discerning reader will note other aspects of a netcdf which are
immutable when using the java interface. For example, once a netcdf
is constructed, you can't add variables, dimensions or attributes.
There is no "define mode".

This is a conscious design decision, which is difficult to justify succinctly,
especially to folks who are used to the C and FORTRAN interface. I'll give it
a shot.

The main reason has to do with multiprocessing and multithreading.
Imagine multiple processes (via the rmi service, perhaps) or multiple
threads accessing the same netcdf. If we are to allow changes to the netcdf
definition, we must ensure that they occur in a consistent fashion. This is
certainly possible. However, the necessary locking and synchronization
have performance impacts. Given that this is a little-used feature,
we would prefer to take the performance hit only when the feature is being
used, rather than in all cases. We do this by making the netcdf definition
(the 'schema') immutable. You can still change the definition, by creating a
new netcdf with a modified schema and copying the data from the old to the new.
This is exactly what happens when one performs the 'redef' 'enddef' operations
using the C interface, except that programmers don't realize how much it
costs unless they read the fine print. In the Java interface, programmers see
the cost.
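
A minimal sketch of the "new netcdf with a modified schema" idea, under
loud assumptions: SketchNetcdf and its methods are hypothetical names made up
for illustration, not classes from the netCDF Java package, and the "file" is
just an attribute map plus an array of values.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a netcdf: an immutable definition plus data.
// None of these names come from the real netCDF Java package.
final class SketchNetcdf {
    private final Map<String, String> attributes;
    private final double[] data;

    SketchNetcdf(Map<String, String> attributes, double[] data) {
        // Defensive copies: nothing the caller holds can change this object later.
        this.attributes = Collections.unmodifiableMap(new LinkedHashMap<>(attributes));
        this.data = data.clone();
    }

    String getAttribute(String name) {
        return attributes.get(name);
    }

    double get(int index) {
        return data[index];
    }

    // "Changing" the definition builds a new object and copies the data.
    // The copy is the cost that redef/enddef hides and that is visible here.
    SketchNetcdf withAttribute(String name, String value) {
        Map<String, String> changed = new LinkedHashMap<>(attributes);
        changed.put(name, value);
        return new SketchNetcdf(changed, data);
    }
}

final class SketchNetcdfDemo {
    public static void main(String[] args) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("units", "feet");
        SketchNetcdf original = new SketchNetcdf(attrs, new double[] {1.0, 2.0});

        SketchNetcdf modified = original.withAttribute("units", "meters");

        System.out.println(original.getAttribute("units")); // prints "feet"
        System.out.println(modified.getAttribute("units")); // prints "meters"
    }
}
```

Because no instance ever changes after construction, readers in other threads
need no locking; only code that wants a different definition pays for the copy.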

It sounds like you agree with the above, but still want to modify the
_values_ of attributes. It is possible in the C interface to modify the
value of an attribute without redefinition _if_ the amount of space required
by the attribute value does not change. To me, this flexibility is too
tightly coupled to a particular netcdf implementation and file format.
Logically, it has pitfalls. For example, consider changing the value of the
"units" attribute from "feet" to "meters". The same sort of consistency and
synchronization issues as above come into play. "_FillValue" is even worse!

This design decision is a big win with the RMI (or any other distributed)
implementation. With the current design, it is possible to safely copy and
cache the header (schema) information into a client. With this information
local to the client, the client can perform many common operations and sanity
checks without incurring the round trip rpc cost. If it were possible to modify
attributes, there would have to be a protocol for notifying clients of the
change, or every attribute lookup would have to go to the server.
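
The caching argument can be sketched roughly as follows. Everything here is
an assumption for illustration: HeaderSource and CachedHeaderClient are
hypothetical names, and a plain method call with a counter stands in for a
real RMI round trip.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical server-side view; each call stands in for one remote round trip.
interface HeaderSource {
    Map<String, String> fetchHeader();
}

// Hypothetical client: copies the header once, then answers lookups locally.
// This is only safe because the header is assumed never to change.
final class CachedHeaderClient {
    private final HeaderSource source;
    private Map<String, String> header; // null until the first lookup

    CachedHeaderClient(HeaderSource source) {
        this.source = source;
    }

    synchronized String getAttribute(String name) {
        if (header == null) {
            header = Collections.unmodifiableMap(new HashMap<>(source.fetchHeader()));
        }
        return header.get(name);
    }
}

final class CachedHeaderDemo {
    public static void main(String[] args) {
        final int[] roundTrips = {0};
        HeaderSource server = () -> {
            roundTrips[0]++; // count simulated trips to the server
            Map<String, String> h = new HashMap<>();
            h.put("units", "meters");
            h.put("title", "demo");
            return h;
        };

        CachedHeaderClient client = new CachedHeaderClient(server);
        client.getAttribute("units");
        client.getAttribute("title");
        client.getAttribute("units");

        System.out.println("round trips: " + roundTrips[0]); // prints "round trips: 1"
    }
}
```

With mutable attributes, the cached copy could go stale, so the client would
need either an invalidation message from the server or a fetch on every lookup.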

In the future, we plan on defining Java classes for scientific data which
have persistent form as netcdf files. A particular Class would
map to a particular netcdf Schema; instances of the Class would all have the
same Schema, but potentially different data. Having immutable Schema fits well
with these plans.

Hope this clarifies things.

-glenn