Georeferencing with Java: An Example of Executable Metadata
Russell K. Rew
Unidata Program Center
University Corporation for Atmospheric Research
Boulder, Colorado
1. INTRODUCTION
Metadata is data about data; it describes data and
facilitates the use of data by others, especially when it is
integrated with the data it describes. Examples of meta-
data include the units in which the data values are repre-
sented and information that provides a way to associate
a time or geospatial location with each data value (geo-
referencing metadata). This paper describes the imple-
mentation of a prototype for executable metadata, one
possible approach to making scientific data more useful
for visualization and analysis applications.
The georeferencing prototype described here pro-
vides information about the location of data in a portable
executable form, as an array of bytes that can be recon-
stituted into a Java object that implements the georefer-
encing functionality needed by applications. The
georeferencing object provides a way to package trans-
formations between geospatial coordinates and data
indices. It supports writing applications that use a simple,
general, and abstract interface for obtaining location
information about data. This approach is not practical
with languages such as C++, C, or Fortran, because
these languages provide no portable representation for
executable content. The proposal falls short of a
complete object-oriented architecture such as the one
described by Vckovski and Bucher (1996), in which all
data is provided as objects rather than bits; instead, this
prototype packages only some useful metadata into object form.
One problem with more traditional georeferencing
standards is lack of complete support for the wide variety
of ways for representing geospatial information in the
geosciences. The ingenuity and creativity of data providers
in inventing new ways to represent geospatial information
compactly tend to confound efforts to anticipate
all needs in a single standard. For example, the
"quasi-regular thinned grids" used by NCEP in packaging AVN
model output data in GRIB form, as described in Dey
(1996), present a challenge to other georeferencing
models.
Georeferencing objects provide a more comprehen-
sive and flexible representation. If a georeferencing
scheme can be implemented in software, it can be
encapsulated in executable metadata representing a
georeferencing object.
"Smart data" interfaces such as the one described
here may improve support for object-oriented applica-
tions. Unidata hopes to ultimately help develop some of
the infrastructure and software to make practical an
effective division of responsibilities between data and
applications, and to improve access to scientific data
used for research and education in the atmospheric sci-
ences. Experiences with a Java prototype for platform-
independent data georeferencing make this approach
appear practical for Java applications.
2. THE PROBLEM
The general problem addressed is how to deal with
the complexity of data from diverse sources in visualiza-
tion and analysis applications by moving much of the
data-specific complexity from applications to the data
itself. The specific problem dealt with by the current pro-
totype is packaging only one aspect of the data-specific
complexity, geospatial information, into a form that is por-
table, general, accurate, secure, compact, efficient, and
extensible. Client applications should be able to deal with
new datasets that use new forms of executable georefer-
encing without requiring changes to the application.
In this proof-of-concept prototype, we have limited
the problem to a simple representation of transforma-
tions between planar grid coordinates (two-dimensional
indices) and surface latitude-longitude coordinates.
Hence, the problem is simplified to:
· determining the (lat, lon) location correspond-
ing to any (i, j) point of a planar grid
· determining the real (i, j) grid indices corre-
sponding to any (lat, lon) location
The first capability makes it possible to plot data
defined on the grid on a map display. The second capa-
bility supports determining grid indices corresponding to
a mouse click on a map display. Together, these transfor-
mations provide a simple and general interface for use in
applications. A few other convenience methods are also
needed to determine the shape of the associated georef-
erenced grid.
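As a minimal sketch of such an interface, the two transformations and the shape methods might be declared as shown here. The method names (toLatLon, toIndex, sizeI, sizeJ) and the RegularLatLonGrid subclass are hypothetical and are not taken from the prototype source; the subclass only illustrates the simplest possible case.

```java
/** Hypothetical sketch of the abstract georeferencing interface; names are illustrative. */
abstract class GeoGrid {
    /** Number of grid points along the i index. */
    abstract int sizeI();

    /** Number of grid points along the j index. */
    abstract int sizeJ();

    /** Returns {lat, lon} in degrees for a (possibly fractional) grid position (i, j). */
    abstract double[] toLatLon(double i, double j);

    /** Returns the real-valued grid indices {i, j} for a location given in degrees. */
    abstract double[] toIndex(double lat, double lon);
}

/** Simplest case: a regular latitude-longitude grid with constant spacing. */
final class RegularLatLonGrid extends GeoGrid {
    private final int ni, nj;
    private final double lat0, lon0, dLat, dLon;

    RegularLatLonGrid(int ni, int nj, double lat0, double lon0, double dLat, double dLon) {
        this.ni = ni;
        this.nj = nj;
        this.lat0 = lat0;   // latitude of grid point (0, 0)
        this.lon0 = lon0;   // longitude of grid point (0, 0)
        this.dLat = dLat;   // latitude step per unit of j
        this.dLon = dLon;   // longitude step per unit of i
    }

    @Override int sizeI() { return ni; }

    @Override int sizeJ() { return nj; }

    @Override double[] toLatLon(double i, double j) {
        return new double[] { lat0 + j * dLat, lon0 + i * dLon };
    }

    @Override double[] toIndex(double lat, double lon) {
        return new double[] { (lon - lon0) / dLon, (lat - lat0) / dLat };
    }
}
```

An application written against GeoGrid alone can plot any grid or map a mouse click back to grid indices without knowing which subclass it holds.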
A more complete interface might include vertical
and time coordinates, handle one-dimensional domains
(e.g. station data or trajectories) and deal with higher
dimensional grids with data-dependent coordinate trans-
formations. The issues encountered in implementing and
evaluating the simpler two-dimensional case, however,
are thought to be representative of the more general
case.
3. IMPLEMENTATION ISSUES
The prototype Java application we implemented,
MapGeoGrid, reads the data and metadata from a spec-
ified dataset and plots the georeferenced grids associ-
ated with variables in the dataset on a world image. For
this purpose, we adapted the MapApplet package
described in Callahan (1997), writing a new MapTool
subclass, and providing a custom class loader. Source
for the MapGeoGrid prototype application is available
from .
Security is an important issue, even in implementing
simple class methods that only perform arithmetic com-
putations to implement coordinate transformations. In
Java, classes loaded with a custom class loader are in a
separate name space from local classes, to prevent
some potential security problems. Running the georefer-
encing methods in an applet security context would be
sufficient to protect the client environment, but applets
cannot use a custom class loader required by this tech-
nique. Thus an applet must download the georeferencing
class from its remote server; an application may use
other Java security mechanisms, for example digital sig-
natures to authenticate the origin or association of exe-
cutable bytes with source code.
Since loaded classes and application classes cannot
share names, the application uses a common predefined
interface that is a superclass of the loaded class.
When the custom class is loaded, an object of that
class is constructed and cast to the common superclass.
Methods of this instance are then invoked to perform
georeferencing operations on the data.
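The load, construct, and cast sequence can be sketched as follows. This is an illustration under stated assumptions, not the prototype's loader: ByteArrayClassLoader, reconstitute, bytesOf, and IdentityGrid are hypothetical names, and the class bytes are read back from the local class path only so that the example is self-contained. In the scenario described in this paper the bytes would come from the dataset.

```java
import java.io.IOException;
import java.io.InputStream;

class ExecutableMetadataDemo {
    /**
     * Common superclass known to the application. It and its methods are public
     * because a class defined by another loader lives in a different runtime package.
     */
    public abstract static class GeoGrid {
        public abstract double[] toLatLon(double i, double j);
    }

    /** Stand-in for a data provider's class (hypothetical). */
    public static class IdentityGrid extends GeoGrid {
        public IdentityGrid() {}

        @Override public double[] toLatLon(double i, double j) {
            return new double[] { j, i };
        }
    }

    /** Custom loader that defines a class directly from an array of bytes. */
    static final class ByteArrayClassLoader extends ClassLoader {
        ByteArrayClassLoader(ClassLoader parent) { super(parent); }

        Class<?> define(String name, byte[] classBytes) {
            return defineClass(name, classBytes, 0, classBytes.length);
        }
    }

    /** Defines the class from bytes, constructs an object, and casts it to GeoGrid. */
    static GeoGrid reconstitute(String className, byte[] classBytes) {
        // The parent loader supplies the shared GeoGrid superclass.
        ByteArrayClassLoader loader =
                new ByteArrayClassLoader(GeoGrid.class.getClassLoader());
        try {
            Class<?> loaded = loader.define(className, classBytes);
            return (GeoGrid) loaded.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + className, e);
        }
    }

    /** Demo helper: obtains the compiled bytes of a local class from the class path. */
    static byte[] bytesOf(Class<?> cls) {
        String resource = cls.getName().replace('.', '/') + ".class";
        try (InputStream in = cls.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) throw new IllegalStateException("no class file for " + cls);
            return in.readAllBytes();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The object returned by reconstitute belongs to a class distinct from the local IdentityGrid, even though both have the same name, yet it can still be used through the shared GeoGrid type.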
Performance was completely adequate in the proto-
type implementation, because expensive operations in
setting up the transformations are only performed once,
when the constructor for the georeferencing class is
invoked. Performance could be further enhanced by
including methods in the abstract interface for transform-
ing one- and two-dimensional arrays of grid locations
and earth locations, instead of only providing point-at-a-
time methods. A default implementation using loops
invoking the point-at-a-time methods would minimize the
burden on data providers who did not choose to take
advantage of these optimizations.
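A default array method of this kind might look like the following sketch. The class name GeoGridWithArrays and the OffsetGrid example are hypothetical; only the forward transformation is shown, and the inverse would follow the same pattern.

```java
/** Hypothetical variant of the abstract interface with a default array method. */
abstract class GeoGridWithArrays {
    /** Point-at-a-time transform that every data provider must implement. */
    abstract double[] toLatLon(double i, double j);

    /**
     * Default array transform built from the point-at-a-time method.
     * Returns {lats, lons}; providers may override it with a faster version.
     */
    double[][] toLatLon(double[] i, double[] j) {
        if (i.length != j.length) {
            throw new IllegalArgumentException("index arrays differ in length");
        }
        double[] lats = new double[i.length];
        double[] lons = new double[i.length];
        for (int k = 0; k < i.length; k++) {
            double[] p = toLatLon(i[k], j[k]);
            lats[k] = p[0];
            lons[k] = p[1];
        }
        return new double[][] { lats, lons };
    }
}

/** A provider class that supplies only the point method and inherits the loop. */
final class OffsetGrid extends GeoGridWithArrays {
    @Override double[] toLatLon(double i, double j) {
        return new double[] { 20.0 + 0.5 * j, -130.0 + 0.5 * i };
    }
}
```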
4. USAGE
In a typical usage scenario, a data provider would
package georeferenced data by providing with the data
the portable byte codes for a Java class that implements
a GeoGrid interface to the data.
Data users would access the data and the executable
metadata together. Mechanisms for sharing georeferencing
metadata among multiple variables or datasets,
and for associating multiple metadata variables with a
dataset, would be the same as for conventional metadata. The
users' applications, written in terms of the abstract Geo-
Grid interface, would read the byte codes for the particu-
lar subclass of GeoGrid associated with the dataset
using the custom class loader developed for this proto-
type, instantiate an object of that class, and use the
methods of the resulting object for georeferencing. The
same uniform GeoGrid class methods would be used for
data from multiple sources, but different data-specific
georeferencing methods would actually be invoked for
different GeoGrid objects.
To evaluate the prototype, we chose as a first
example of geodata some model output
data from NCEP, distributed in GRIB form on the
National Weather Service High Resolution Data Service.
The NCEP "211 grid," a 93 by 65 regional Lambert
Conformal grid over the continental U.S., has non-trivial
transformations between index space and
latitude-longitude. We converted a collection of GRIB products that
use this grid into a single netCDF file in which the
georeferencing information was stored in a byte array variable,
referenced by name by data variables defined on the
grid.
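To illustrate what such non-trivial transformations involve, the following sketch implements the standard tangent-cone Lambert conformal projection on a spherical earth. It is not the prototype's georeferencing class. The class and method names are hypothetical, and the parameter values in ncep211Like (tangent latitude, central longitude, first grid point, grid spacing, and earth radius) are assumed values that should be checked against the GRIB documentation before any real use.

```java
/** Hypothetical sketch: tangent-cone Lambert conformal grid, northern hemisphere, spherical earth. */
final class LambertConformalGrid {
    private final int ni, nj;
    private final double radiusKm, dxKm, lon0, n, f, rho0, x0, y0;

    LambertConformalGrid(int ni, int nj, double tangentLatDeg, double centralLonDeg,
                         double firstLatDeg, double firstLonDeg,
                         double dxKm, double radiusKm) {
        this.ni = ni;
        this.nj = nj;
        this.dxKm = dxKm;
        this.radiusKm = radiusKm;
        double phi0 = Math.toRadians(tangentLatDeg);
        this.lon0 = Math.toRadians(centralLonDeg);
        this.n = Math.sin(phi0);                       // cone constant
        this.f = Math.cos(phi0) * Math.pow(Math.tan(Math.PI / 4 + phi0 / 2), n) / n;
        this.rho0 = rho(phi0);                         // cone radius at the tangent latitude
        double[] xy = project(Math.toRadians(firstLatDeg), Math.toRadians(firstLonDeg));
        this.x0 = xy[0];                               // projected position of grid point (0, 0)
        this.y0 = xy[1];
    }

    /** Assumed, unverified parameters resembling the 93 by 65 grid discussed in the text. */
    static LambertConformalGrid ncep211Like() {
        return new LambertConformalGrid(93, 65, 25.0, -95.0, 12.190, -133.459, 81.2705, 6367.47);
    }

    int sizeI() { return ni; }

    int sizeJ() { return nj; }

    private double rho(double phi) {
        return radiusKm * f / Math.pow(Math.tan(Math.PI / 4 + phi / 2), n);
    }

    /** Projects latitude and longitude in radians to plane coordinates in km. */
    private double[] project(double phi, double lam) {
        // Wrap the longitude difference into [-pi, pi] before scaling by the cone constant.
        double theta = n * Math.IEEEremainder(lam - lon0, 2 * Math.PI);
        double r = rho(phi);
        return new double[] { r * Math.sin(theta), rho0 - r * Math.cos(theta) };
    }

    /** Returns {lat, lon} in degrees for real-valued, zero-based grid indices (i, j). */
    double[] toLatLon(double i, double j) {
        double x = x0 + i * dxKm;
        double y = y0 + j * dxKm;
        double r = Math.hypot(x, rho0 - y);
        double theta = Math.atan2(x, rho0 - y);
        double phi = 2 * Math.atan(Math.pow(radiusKm * f / r, 1 / n)) - Math.PI / 2;
        // The longitude is not renormalized to [-180, 180].
        return new double[] { Math.toDegrees(phi), Math.toDegrees(lon0 + theta / n) };
    }

    /** Returns real-valued, zero-based grid indices {i, j} for a location in degrees. */
    double[] toIndex(double latDeg, double lonDeg) {
        double[] xy = project(Math.toRadians(latDeg), Math.toRadians(lonDeg));
        return new double[] { (xy[0] - x0) / dxKm, (xy[1] - y0) / dxKm };
    }
}
```

All trigonometric setup happens once in the constructor, so each later call costs only a few arithmetic and library operations.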
5. EVALUATION
The prototype was designed only to test the practicality
and usefulness of executable metadata
and to uncover any unanticipated problems or issues
that implementing the idea would reveal. The results
are mixed: we end up with "half-objects" that can be
used as ordinary data files by conventional applications
using traditional data access interfaces, but that also
behave like simple georeferencing objects for Java appli-
cations that activate the executable content. But for the
data to be useful to conventional applications, the geo-
referencing data must also be represented convention-
ally.
5.1 Benefits
Benefits of using this technique for implementing
executable metadata, and georeferencing metadata in
particular, may include:
· simpler application interfaces to georeferenced
data;
· portable executable metadata for Java-
equipped platforms in distributed environments;
· applications immune to changes in georefer-
encing;
· reduced possibility for misinterpreting location
of data, since georeferencing is implemented
once by the data supplier rather than many
times by the data users; and
· compact representation for complex georefer-
encing, because size does not increase with
spatial resolution.
As a concrete example of the compactness achievable
with Java byte codes, the class data needed for
executable metadata for the "211 grid" required about 2000
bytes. In contrast, representing the grid as two-dimensional
arrays of single-precision latitudes and longitudes
requires over 48,000 bytes (93 × 65 points × 2 coordinates
× 4 bytes = 48,360 bytes), and efficiently supports only
one of the two transformations between index space and
latitude-longitude space.
Another potential advantage of this approach in a
transition to object-oriented data access is that the inter-
faces and specific class implementations of the inter-
faces required can later be incorporated into full-blown
data objects.
5.2 Limitations and Remaining Problems
The primary limitation of this technique is its require-
ment that applications that make use of the metadata
must be written in Java. (Something like the CORBA
infrastructure might provide interfaces for applications
written in other languages, but access to a Java Virtual
Machine would still be required to execute the georefer-
encing methods.)
If only Java applications can use the executable
content of such data, why not provide the data as actual
objects of a class that implements the abstract georefer-
encing interface, rather than as bits containing a repre-
sentation of such a class? A pure object-oriented
approach would package all appropriate methods with
the data. The proposed approach may be more practical
in the short run, because data providers may be reluc-
tant to also become code providers (and maintainers) to
the extent necessary to provide the classes that wrap
their data as actual objects. The more gradual approach
of refining a standard applications interface for one
aspect of metadata at a time may be a more realistic way
to eventually achieve the benefits of object-oriented data
and applications.
Another limitation is lack of any simple way to
browse or search metadata represented only in execut-
able form. Information about data coverage and resolu-
tion is only available by invoking the methods of the
metadata.
Traditional ways to represent georeferencing infor-
mation have included:
· simple but rigid standards (e.g. all data must be
on regular latitude-longitude grids);
· conventions specified in standards documents
separate from the data;
· specialized data formats that anticipate and
provide compact representations for some
common ways to represent geodata.
None of these traditional approaches provides the
extensibility and power of executable metadata, but they
have other advantages. Language-independence is an
important characteristic in environments where other
languages are widely used for scientific applications.
Language-independence is probably also necessary for
long-term archives, since useful data may outlast Java
or any other particular language. It is prudent to insist
that the Java sources for executable metadata be stored
with any archives that include such data.
6. CONCLUSIONS
Storing georeferencing functions with the data for
execution when and where the data is accessed avoids
the need for applications to support elaborate conven-
tions for parameterizing many different kinds of georefer-
encing. Instead, applications may assume a simple
interface for accessing georeferenced data, and depend
on the data to realize specific implementations of that
interface. Whether these benefits are enough to out-
weigh the limitation that only Java applications may
make use of such metadata remains to be determined.
If this approach is useful, it may also be applied to
other kinds of complex metadata, such as calibration,
interpolation algorithms, derivative calculations, error
estimates, and the representation of irregular domains.
Object-oriented approaches to data access enhance the
practicality of developing interoperable applications that
can better deal with the complexity of multiple forms of
scientific data. Executable metadata may be useful as an
incremental approach to achieving some of the benefits
of object-oriented architectures for scientific data access,
visualization, and analysis.
7. REFERENCES
Callahan, J., S. Hankin, J. Davison, 1997. "Improving
Web Access To Gridded Data: Java Tools for Cli-
mate Data Servers," 13th International Conference
on Interactive Information and Processing Systems
for Meteorology, Oceanography, and Hydrology,
Anaheim, California, Am. Meteor. Soc., 189-191.
Dey, C. H., 1996. "GRIB (Edition 1): The WMO Format
for the Storage of Weather Product Information
and the Exchange of Weather Product Messages,"
Office Note 388, NCEP.
Vckovski, A., F. Bucher, 1996. "Virtual Data Sets - Smart
Data for Environmental Applications," The Third
International Conference/Workshop on Integrating
GIS and Environmental Modeling, Santa Fe, NM,
January 1996.