© Copyright 1998 American Meteorological Society. To appear in the Proceedings of the Fourteenth International Conference on Interactive Information and Processing Systems for Meteorology, Oceanography, and Hydrology, Phoenix, Arizona, American Meteorological Society, January 1998. The AMS does not guarantee that the copy provided here is an accurate copy of the published work.

GEOREFERENCING WITH JAVA:
AN EXAMPLE OF EXECUTABLE METADATA


Russell K. Rew*
Unidata Program Center
University Corporation for Atmospheric Research
Boulder, Colorado



1. INTRODUCTION


Metadata is data about data; it describes data and facilitates the use of data by others, especially when it is integrated with the data it describes. Examples of metadata include the units in which the data values are represented and information that provides a way to associate a time or geospatial location with each data value (georeferencing metadata). This paper describes the implementation of a prototype for executable metadata, one possible approach to making scientific data more useful for visualization and analysis applications.

The georeferencing prototype described here provides information about the location of data in a portable executable form, as an array of bytes that can be reconstituted into a Java object that implements the georeferencing functionality needed by applications. The georeferencing object provides a way to package transformations between geospatial coordinates and data indices. It supports writing applications that use a simple, general, and abstract interface for obtaining location information about data. This approach is not practical with languages such as C++, C, or Fortran, because these languages provide no portable representation for executable content. What is proposed falls short of a complete object-oriented architecture, as described, for example, in Vckovski (1996), in which all data is provided as objects rather than bits; instead, this prototype packages only some useful metadata into object form.

One problem with more traditional georeferencing standards is lack of complete support for the wide variety of ways for representing geospatial information in the geosciences. The ingenuity and creativity of data providers in inventing new ways to represent geospatial information compactly tends to confound efforts to anticipate all needs in a single standard. For example, the "quasi-regular thinned grids" used by NCEP in packaging AVN model output data in GRIB form, as described in Dey (1996), present a challenge to other georeferencing models.

Georeferencing classes provide a more comprehensive and flexible representation. If a georeferencing scheme can be implemented in software, it can be encapsulated in executable metadata representing a georeferencing class.

"Smart data" interfaces such as the one described here may improve support for object-oriented applications. Unidata hopes to ultimately help develop some of the infrastructure and software to make practical an effective division of responsibilities between data and applications, and to improve access to scientific data used for research and education in the atmospheric sciences. Experiences with a Java prototype for platform-independent data georeferencing make this approach appear practical for Java applications.

2. THE PROBLEM


The general problem addressed is how to deal with the complexity of data from diverse sources in visualization and analysis applications by moving much of the data-specific complexity from applications to the data itself. The specific problem dealt with by the current prototype is packaging only one aspect of the data-specific complexity, geospatial information, into a form that is portable, general, accurate, secure, compact, efficient, and extensible. Client applications should be able to deal with new datasets that use new forms of executable georeferencing without requiring changes to the application.

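In this proof-of-concept prototype, we have limited the problem to a simple representation of transformations between planar grid coordinates (two-dimensional indices) and surface latitude-longitude coordinates. Hence, the problem is simplified to providing two capabilities: (1) given a pair of grid indices, determine the corresponding latitude and longitude; and (2) given a latitude and longitude, determine the corresponding grid indices.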

The first capability makes it possible to plot data defined on the grid on a map display. The second capability supports determining grid indices corresponding to a mouse click on a map display. Together, these transformations provide a simple and general interface for use in applications. A few other convenience methods are also needed to determine the shape of the associated georeferenced grid.
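The two transformations and the shape-related convenience methods could be expressed as a small Java interface. The following is a minimal sketch, not the prototype's actual source: the method names, the use of fractional indices, and the RegularLatLonGrid example class are all assumptions made for illustration.

```java
/** Hypothetical sketch of an abstract georeferencing interface (all names are assumptions). */
interface GeoGrid {
    /** Number of grid points along the first (i) index. */
    int getNx();

    /** Number of grid points along the second (j) index. */
    int getNy();

    /** Returns {latitude, longitude} in degrees for (possibly fractional) grid indices. */
    double[] gridToLatLon(double i, double j);

    /** Returns {i, j} as (possibly fractional) grid indices for a location in degrees. */
    double[] latLonToGrid(double lat, double lon);
}

/** Simplest possible implementation: an evenly spaced latitude-longitude grid. */
final class RegularLatLonGrid implements GeoGrid {
    private final int nx, ny;
    private final double lat0, lon0, dLat, dLon;

    RegularLatLonGrid(int nx, int ny, double lat0, double lon0, double dLat, double dLon) {
        this.nx = nx;
        this.ny = ny;
        this.lat0 = lat0;   // latitude of grid point (0, 0), degrees
        this.lon0 = lon0;   // longitude of grid point (0, 0), degrees
        this.dLat = dLat;   // latitude increment per j index, degrees
        this.dLon = dLon;   // longitude increment per i index, degrees
    }

    public int getNx() { return nx; }
    public int getNy() { return ny; }

    public double[] gridToLatLon(double i, double j) {
        return new double[] { lat0 + j * dLat, lon0 + i * dLon };
    }

    public double[] latLonToGrid(double lat, double lon) {
        return new double[] { (lon - lon0) / dLon, (lat - lat0) / dLat };
    }
}
```

An application written against GeoGrid would call gridToLatLon when plotting grid data on a map and latLonToGrid when translating a mouse click back to grid indices, without knowing which implementation it holds.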

A more complete interface might include vertical and time coordinates, handle one-dimensional domains (e.g. station data or trajectories) and deal with higher dimensional grids with data-dependent coordinate transformations. The issues encountered in implementing and evaluating the simpler two-dimensional case, however, are thought to be representative of the more general case.

3. IMPLEMENTATION ISSUES


The prototype Java application we implemented, MapGeoGrid, reads the data and metadata from a specified dataset and plots the georeferenced grids associated with variables in the dataset on a world image. For this purpose, we adapted the MapApplet package described in Callahan (1997), writing a new MapTool subclass, and providing a custom class loader. Source for the MapGeoGrid prototype application is available from <URL:http://unidata.ucar.edu/staff/russ/MapGeoGrid/>.

Security is an important issue, even in implementing simple class methods that only perform arithmetic computations to implement coordinate transformations. In Java, classes loaded with a custom class loader are in a separate name space from local classes, to prevent some potential security problems. Running the georeferencing methods in an applet security context would be sufficient to protect the client environment, but applets cannot use the custom class loader that this technique requires. Thus an applet must download the georeferencing class from its remote server, whereas an application may use other Java security mechanisms, for example digital signatures that authenticate the origin of the executable bytes or their association with source code.

Since loaded classes and application classes cannot share names, the application uses a common predefined interface that is a superclass of the loaded class. When the custom class is loaded, an object of that class is constructed and cast to the common superclass it implements. Then methods of this instance are invoked to perform georeferencing operations on the data.
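The load, construct, and cast sequence can be sketched with the standard java.lang.ClassLoader API. This is an illustration under stated assumptions, not the prototype's class loader: the names ByteArrayClassLoader and GeoGridLoader are hypothetical, the GeoGrid interface is reduced to two assumed methods, and the loaded class is assumed to be public with a public no-argument constructor.

```java
/** Hypothetical common interface known to both the application and the loaded class. */
interface GeoGrid {
    double[] gridToLatLon(double i, double j);
    double[] latLonToGrid(double lat, double lon);
}

/** Hypothetical class loader that defines a class directly from an array of bytes. */
final class ByteArrayClassLoader extends ClassLoader {
    ByteArrayClassLoader(ClassLoader parent) {
        super(parent); // the parent resolves shared types such as GeoGrid
    }

    Class<?> define(byte[] classBytes) {
        // A null name lets the JVM take the class name from the bytes themselves.
        return defineClass(null, classBytes, 0, classBytes.length);
    }
}

final class GeoGridLoader {
    /** Turns class bytes (for example, read from a dataset) into a GeoGrid object. */
    static GeoGrid instantiate(byte[] classBytes) {
        ByteArrayClassLoader loader =
                new ByteArrayClassLoader(GeoGrid.class.getClassLoader());
        Class<?> loaded = loader.define(classBytes); // throws ClassFormatError on bad bytes
        try {
            // asSubclass performs the checked cast to the common interface.
            return loaded.asSubclass(GeoGrid.class)
                         .getDeclaredConstructor()
                         .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("class cannot be instantiated as a GeoGrid", e);
        }
    }
}
```

Because the application refers only to GeoGrid, which is resolved through the parent loader, the loaded class can live in its own name space while its methods remain callable through the shared type. This sketch adds no security checks of its own; bytes from an untrusted source would still need to be authenticated or sandboxed.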

Performance was completely adequate in the prototype implementation, because expensive operations in setting up the transformations are only performed once, when the constructor for the georeferencing class is invoked. Performance could be further enhanced by including methods in the abstract interface for transforming one- and two-dimensional arrays of grid locations and earth locations, instead of only providing point-at-a-time methods. A default implementation using loops invoking the point-at-a-time methods would minimize the burden on data providers who did not choose to take advantage of these optimizations.
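The idea of array methods with a looping default can be sketched as follows. The sketch uses Java default methods, a language feature that postdates this prototype (an abstract base class would serve the same purpose); the method names and the ScaledGrid example class are assumptions.

```java
/** Hypothetical interface with point-at-a-time methods and default array methods. */
interface GeoGrid {
    double[] gridToLatLon(double i, double j);
    double[] latLonToGrid(double lat, double lon);

    /** Default bulk transform: loops over the point-at-a-time method. */
    default double[][] gridToLatLon(double[] i, double[] j) {
        if (i.length != j.length) {
            throw new IllegalArgumentException("index arrays differ in length");
        }
        double[][] out = new double[i.length][];
        for (int k = 0; k < i.length; k++) {
            out[k] = gridToLatLon(i[k], j[k]);
        }
        return out;
    }

    /** Default bulk inverse transform: loops over the point-at-a-time method. */
    default double[][] latLonToGrid(double[] lat, double[] lon) {
        if (lat.length != lon.length) {
            throw new IllegalArgumentException("coordinate arrays differ in length");
        }
        double[][] out = new double[lat.length][];
        for (int k = 0; k < lat.length; k++) {
            out[k] = latLonToGrid(lat[k], lon[k]);
        }
        return out;
    }
}

/** Toy provider class that implements only the point-at-a-time methods. */
final class ScaledGrid implements GeoGrid {
    private final double step; // degrees per grid index in both directions

    ScaledGrid(double step) { this.step = step; }

    public double[] gridToLatLon(double i, double j) {
        return new double[] { j * step, i * step };
    }

    public double[] latLonToGrid(double lat, double lon) {
        return new double[] { lon / step, lat / step };
    }
}
```

A data provider who wants the optimization overrides the array methods with a vectorized implementation; one who does not, like ScaledGrid here, inherits the loops unchanged.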

4. USAGE


In a typical usage scenario, a data provider would package georeferenced data by providing with the data the portable byte codes for a Java class that implements a GeoGrid interface to the data.

Data users would access the data and the executable metadata together. Mechanisms for sharing georeferencing metadata among multiple variables or datasets, as well as for multiple metadata variables per dataset would be the same as for conventional metadata. The users' applications, written in terms of the abstract GeoGrid interface, would read the byte codes for the particular subclass of GeoGrid associated with the dataset using the custom class loader developed for this prototype, instantiate an object of that class, and use the methods of the resulting object for georeferencing. The same uniform GeoGrid class methods would be used for data from multiple sources, but different data-specific georeferencing methods would actually be invoked for different GeoGrid objects.

For purposes of evaluation of the prototype, we chose as a first example of geodata some model output data from NCEP, distributed in GRIB form on the National Weather Service High Resolution Data Service. The NCEP "211 grid," a 93 by 65 regional Lambert Conformal grid over the continental U.S., has non-trivial transformations between index space and latitude-longitude. We converted a collection of GRIB products that use this grid into a single netCDF file in which the georeferencing information was stored in a byte array variable, referenced by name by data variables defined on the grid.
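To give a sense of what such a georeferencing class has to compute, the following sketch implements both transformations for a grid on a tangent-cone Lambert conformal projection, using the standard spherical formulas. It is not the prototype's class. The class and method names are hypothetical, and the grid 211 parameters in the usage note are assumptions taken from NCEP's published grid tables rather than from this paper, so they should be verified before use.

```java
/** Hypothetical sketch: georeferencing for a tangent-cone Lambert conformal grid. */
final class LambertConformalGrid {
    final int nx, ny;
    private final double n;      // cone constant, sin(tangent latitude)
    private final double rf;     // earth radius times projection constant F, km
    private final double rho0;   // polar radius at the tangent latitude, km
    private final double lon0;   // orientation longitude, radians
    private final double dx;     // grid spacing at the tangent latitude, km
    private final double x0, y0; // projected position of grid point (0, 0), km

    LambertConformalGrid(int nx, int ny, double lat1, double lon1,
                         double orientLon, double tangentLat,
                         double dxKm, double earthRadiusKm) {
        this.nx = nx;
        this.ny = ny;
        this.dx = dxKm;
        double phi1 = Math.toRadians(tangentLat);
        this.n = Math.sin(phi1); // assumes a northern-hemisphere cone (n > 0)
        this.rf = earthRadiusKm * Math.cos(phi1)
                * Math.pow(Math.tan(Math.PI / 4 + phi1 / 2), n) / n;
        this.rho0 = rho(phi1);
        this.lon0 = Math.toRadians(orientLon);
        double[] xy = project(lat1, lon1); // expensive setup is done once, here
        this.x0 = xy[0];
        this.y0 = xy[1];
    }

    /** Distance from the cone apex for a latitude in radians. */
    private double rho(double phi) {
        return rf / Math.pow(Math.tan(Math.PI / 4 + phi / 2), n);
    }

    /** Projects a location in degrees to plane coordinates {x, y} in km. */
    private double[] project(double latDeg, double lonDeg) {
        double r = rho(Math.toRadians(latDeg));
        // IEEEremainder wraps the longitude difference into [-pi, pi].
        double dLon = Math.IEEEremainder(Math.toRadians(lonDeg) - lon0, 2 * Math.PI);
        double theta = n * dLon;
        return new double[] { r * Math.sin(theta), rho0 - r * Math.cos(theta) };
    }

    /** Returns {latitude, longitude} in degrees for (possibly fractional) indices. */
    double[] gridToLatLon(double i, double j) {
        double x = x0 + i * dx;
        double y = y0 + j * dx;
        double r = Math.hypot(x, rho0 - y);
        double theta = Math.atan2(x, rho0 - y);
        double lat = 2 * Math.atan(Math.pow(rf / r, 1 / n)) - Math.PI / 2;
        double lon = lon0 + theta / n;
        return new double[] { Math.toDegrees(lat), Math.toDegrees(lon) };
    }

    /** Returns {i, j} as (possibly fractional) indices for a location in degrees. */
    double[] latLonToGrid(double latDeg, double lonDeg) {
        double[] xy = project(latDeg, lonDeg);
        return new double[] { (xy[0] - x0) / dx, (xy[1] - y0) / dx };
    }
}
```

Under the assumed parameters, a grid 211 object would be constructed as new LambertConformalGrid(93, 65, 12.190, -133.459, -95.0, 25.0, 81.2705, 6367.47): first grid point at 12.190N, 133.459W; orientation longitude 95W; cone tangent at 25N; 81.2705 km spacing; and a spherical earth radius of 6367.47 km. Only the constructor arguments are specific to the grid, which is part of why the compiled class can be so compact.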

5. EVALUATION


The prototype was designed only to test the practicality and usefulness of the idea of executable metadata and to uncover any unanticipated problems or issues that implementation of the idea would reveal. The results are mixed: we end up with "half-objects" that can be used as ordinary data files by conventional applications using traditional data access interfaces, but that also behave like simple georeferencing objects for Java applications that activate the executable content. For the data to be useful to conventional applications, however, the georeferencing data must also be represented conventionally.

5.1 Benefits


Benefits to using this technique for implementing executable metadata, and georeferencing metadata in particular, may include:

As a concrete example of the compactness achievable with Java byte codes, the class data needed for executable metadata for the "211 grid" required about 2000 bytes. In contrast, representing the grid as two-dimensional arrays of single-precision latitudes and longitudes requires over 48,000 bytes (93 × 65 grid points × 2 coordinates × 4 bytes = 48,360 bytes), and efficiently supports only one of the two transformations between index space and latitude-longitude space.

Another potential advantage of this approach in a transition to object-oriented data access is that the interfaces and specific class implementations of the interfaces required can later be incorporated into full-blown data objects.

5.2 Limitations and Remaining Problems


The primary limitation of this technique is its requirement that applications that make use of the metadata must be written in Java. (Something like the CORBA infrastructure might provide interfaces for applications written in other languages, but access to a Java Virtual Machine would still be required to execute the georeferencing methods.)

If only Java applications can use the executable content of such data, why not provide the data as actual objects of a class that implements the abstract georeferencing interface, rather than as bits containing a representation of such a class? A pure object-oriented approach would package all appropriate methods with the data. The proposed approach may be more practical in the short run, because data providers may be reluctant to also become code providers (and maintainers) to the extent necessary to provide the classes that wrap their data as actual objects. The more gradual approach of refining a standard applications interface for one aspect of metadata at a time may be a more realistic way to eventually achieve the benefits of object-oriented data and applications.

Another limitation is lack of any simple way to browse or search metadata represented only in executable form. Information about data coverage and resolution is only available by invoking the methods of the metadata.

Traditional ways to represent georeferencing information have included:

None of these traditional approaches provides the extensibility and power of executable metadata, but they have other advantages. Language-independence is an important characteristic in environments where other languages are widely used for scientific applications. Language-independence is probably also necessary for long-term archives, since useful data may outlast Java or any other particular language. It is prudent to insist that the Java sources for executable metadata be stored with any archives that include such data.

6. CONCLUSIONS


Storing georeferencing functions with the data for execution when and where the data is accessed avoids the need for applications to support elaborate conventions for parameterizing many different kinds of georeferencing. Instead, applications may assume a simple interface for accessing georeferenced data, and depend on the data to realize specific implementations of that interface. Whether these benefits are enough to outweigh the limitation that only Java applications may make use of such metadata remains to be determined.

If this approach is useful, it may also be applied to other kinds of complex metadata, such as calibration, interpolation algorithms, derivative calculations, error estimates, and the representation of irregular domains. Object-oriented approaches to data access enhance the practicality of developing interoperable applications that can better deal with the complexity of multiple forms of scientific data. Executable metadata may be useful as an incremental approach to achieving some of the benefits of object-oriented architectures for scientific data access, visualization, and analysis.

7. REFERENCES


Callahan, J., S. Hankin, J. Davison, 1997. "Improving Web Access To Gridded Data: Java Tools for Climate Data Servers," 13th International Conference on Interactive Information and Processing Systems for Meteorology, Oceanography, and Hydrology, Anaheim, California, Am. Meteor. Soc., 189-191. <http://dread.pmel.noaa.gov/~callahan/Talks/AMS-97/paper.html>

Dey, C. H., 1996. "Office Note 388, GRIB (edition 1): The WMO Format for the Storage of Weather Product Information and the Exchange of Weather Product Messages," NCEP. <ftp://nic.fb4.noaa.gov/pub/nws/nmc/docs/>

Vckovski, A., F. Bucher, 1996. "Virtual Data Sets - Smart Data for Environmental Applications," The Third International Conference/Workshop on Integrating GIS and Environmental Modeling, Santa Fe, NM, January 1996. <http://bbq.ncgia.ucsb.edu:80/conf/SANTA_FE_CD-ROM/sf_papers/vckovski_andrej/vbpaper.html>


* Corresponding author address: Russ Rew, UCAR Unidata, P.O. Box 3000, Boulder, CO 80307-3000; e-mail <russ@unidata.ucar.edu>. The Unidata Program Center is sponsored by the National Science Foundation and managed by the University Corporation for Atmospheric Research.
 
 