Meeting about improving the GRD API.
Ian Barrodale
ian at barrodale.com
Fri Feb 9 11:47:25 MST 2007
Thanks very much for this detailed response, John.
Currently at BCS we're having a very interesting discussion about the
way forward, and I've placed a call to Ted (no return call yet) to
get his opinion on a satellite application that would become feasible
by combining the strengths of the various technologies
involved. More later today....
Regards,
Ian
==================
At 08:13 AM 2/9/2007, John Caron wrote:
>Hello all, comments are in-line:
>
>Ian Barrodale wrote:
>>Hi Ted, John, Russ, and John:
>>Thank you all for taking the time yesterday to both listen to our
>>story and to further enlighten us about your work. It was much appreciated.
>>The note below provides a possible implementation route, and some
>>questions. Please feel free to point out any shortcomings in our
>>proposed approach, and please provide any answers that come to mind
>>regarding our questions.
>>Thanks again,
>>Ian
>>======================
>>
>>Goal
>>-------
>>Based on feedback from BCS Grid DataBlade customers and, in
>>particular, Ted Habermann, we feel that there may be some value in
>>providing alternate ways of accessing data from a Grid DataBlade
>>(GRD) - powered database through existing widely-used protocols and
>>methods. Note that by "accessing", we really mean just the
>>reading part, as we already provide, through the BCS Gridded Data
>>Loader client, a means of conveniently ingesting data from many
>>forms into a GRD-powered database. One method of accessing the
>>data would be to cast it in the form of the Common Data Model
>>(CDM) supported by the Java netCDF API from UCAR. The advantage
>>of this is that:
>> * users would be able to write software using the Java netCDF API
>> (which is fairly straightforward to use and well documented) for
>> accessing GRD data, and
>> * data providers can use a GRD-powered database and provide access
>> to it through OPeNDAP, WCS, netCDF files, etc. using the Java
>> netCDF API (see page 53 attachment, modified from the slide on
>> page 53 of
>> http://www.unidata.ucar.edu/staff/caron/presentations/CDM.ppt).
>>Our understanding of a possible implementation
>>---------------------------------------------------------------------
>>To handle GRD data from the Java netCDF API, we would have to:
>>(i) Create a GRD I/O service provider for the Java netCDF API (see
>>page 38 attachment) that can communicate with the GRD database
>>using a combination of JDBC and the existing Java GRD API. The
>>Java netCDF API uses a service provider architecture to handle
>>reading multiple different file formats and casting them in the
>>form of the CDM.
>>(ii) Create a GRD content manager to handle the georeferencing
>>information in the GRD.
>>One possible method for allowing users to access GRD data without a
>>full THREDDS catalog is to supply some type of unique URL to the database:
>> grd://user:pass@server/database
>>and the service provider would construct a CDM instance that
>>contains a main group of all the grids in the database and allow
>>the user to access those grids through the API.
>>For example:
>> grd://peter:test123@omni.barrodale.com/coastwatch
>>might be a reference to a GRD database running at Barrodale that
>>contains gridded NOAA CoastWatch satellite-derived data for some
>>number of geographic areas and time periods. The resulting netCDF
>>dataset would be one that contains a list of grids under a root
>>group like a directory structure:
>> /
>> /sst/
>> /sst/northeast/
>> /sst/northeast/jan01_2007 <---- a grid
>> /sst/northeast/jan02_2007 <---- another grid
>> ...
>> /chlorophyll/northeast/jan01_2007 <---- a third grid
>> /chlorophyll/northeast/jan02_2007 <---- and so on
>>It depends on the desired complexity of the grids in the database
>>as to whether the user would require a more sophisticated catalog
>>with querying ability such as that which THREDDS could supply.
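As an illustration of the grd://user:pass@server/database form proposed above, here is a minimal sketch of how such a URL could be split into its parts using only the standard java.net.URI class. The class name GrdLocation and its fields are hypothetical; they are not part of the netCDF-Java or GRD APIs, and a real service provider would pass these values on to JDBC rather than just hold them.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not part of netCDF-Java or the GRD API.
final class GrdLocation {
    final String user;      // null when no credentials are embedded in the URL
    final String password;  // null when no password is embedded in the URL
    final String server;
    final String database;

    private GrdLocation(String user, String password, String server, String database) {
        this.user = user;
        this.password = password;
        this.server = server;
        this.database = database;
    }

    // Parses grd://user:pass@server/database; throws IllegalArgumentException otherwise.
    static GrdLocation parse(String url) {
        URI uri = URI.create(url);
        if (!"grd".equalsIgnoreCase(uri.getScheme())) {
            throw new IllegalArgumentException("not a grd:// URL: " + url);
        }
        String host = uri.getHost();
        String path = uri.getPath();
        if (host == null || path == null || path.length() < 2) {
            throw new IllegalArgumentException("expected grd://server/database: " + url);
        }
        String user = null;
        String password = null;
        String userInfo = uri.getUserInfo();
        if (userInfo != null) {
            // Split "user:pass" at the first colon; the password part is optional.
            int colon = userInfo.indexOf(':');
            user = colon < 0 ? userInfo : userInfo.substring(0, colon);
            password = colon < 0 ? null : userInfo.substring(colon + 1);
        }
        // Drop the leading '/' from the path to get the database name.
        return new GrdLocation(user, password, host, path.substring(1));
    }
}
```

Calling GrdLocation.parse("grd://peter:test123@omni.barrodale.com/coastwatch") would yield user "peter", server "omni.barrodale.com" and database "coastwatch". The same parser accepts a URL without the user:pass@ part, which fits the HTTP-based authentication mentioned in the reply below.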
>
>see the last answer below.
>
>BTW, the TDS will soon have the ability to do proper HTTP-based
>authentication, and we are hoping to make that a standard in OPeNDAP
>clients, which can act like browsers and pop up a username/password
>dialog window, instead of embedding the user:pass@ in the URL.
>
>>Questions
>>---------------
>>We have the following questions:
>>1) Where in the netCDF API would the content manager that handles
>>GRD georeferencing information sit?
>>2) How does the I/O SP architecture determine the I/O SP for a given
>>file:// style URL? How would it know to handle a grd:// URL
>>differently?
>
>Very perceptive question; let me start here to explain these 2 questions:
>
>The IOSP architecture is, in fact, file based (it works on a
>RandomAccessFile). Since you will be URL based, we have to fit you
>in at a higher level, namely NetcdfDataset.openFile(). If you look
>there you will see that we look for opendap (http: or dods:) and
>thredds: URLs. It might make sense to generalize this to allow
>plugging in external handlers for your protocol, similar to how
>java.net.ContentHandler works. Otherwise we might put your code in
>the core, which is also a possibility.
>
>Anyway, NetcdfDataset.openFile() would detect your URL scheme and
>call NetcdfFile with your IOSP. We will have to add a new
>constructor for that. (You could alternately just subclass
>NetcdfFile, which is what DODSNetcdfFile does).
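The idea of plugging in external handlers per URL scheme can be sketched as a small registry keyed by scheme. This is only an illustration of the dispatch pattern under stated assumptions: SchemeRegistry is a hypothetical class, not something netCDF-Java provides, and the String returned by each opener is a placeholder for whatever dataset object a real handler would construct.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of scheme-based dispatch; not a netCDF-Java class.
final class SchemeRegistry {
    // Maps a lower-case URL scheme (e.g. "grd") to the code that opens such locations.
    private final Map<String, Function<String, String>> openers = new HashMap<>();
    // Used when no scheme is present or no handler is registered for it.
    private final Function<String, String> defaultOpener;

    SchemeRegistry(Function<String, String> defaultOpener) {
        this.defaultOpener = defaultOpener;
    }

    void register(String scheme, Function<String, String> opener) {
        openers.put(scheme.toLowerCase(Locale.ROOT), opener);
    }

    // Looks at the text before the first ':' and hands the location to the matching opener.
    String open(String location) {
        int colon = location.indexOf(':');
        Function<String, String> opener = null;
        if (colon > 0) {
            opener = openers.get(location.substring(0, colon).toLowerCase(Locale.ROOT));
        }
        return (opener != null ? opener : defaultOpener).apply(location);
    }
}
```

A caller would register a "grd" opener once at startup, and any location without a registered scheme would fall through to the default (file-based) path. Whether such a hook lives inside the library or in application code is exactly the design question raised above.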
>
>As for the "content manager that handles GRD georeferencing
>information": it could be a CoordSysBuilder subclass. However, this
>is actually unnecessary if you use an existing Convention, and we
>would highly recommend using the CF Convention for gridded data.
>Since you are creating the "file", you can add the attributes and
>variables needed by that Convention. This makes your data "CF
>compliant" automatically, which is a real win.
>
>>3) Have we interpreted the slide on page 53 correctly -- is there a
>>server that can serve out data using the CDM (via the Java netCDF
>>API) as an intermediate step?
>
>yes, the THREDDS Data Server
>
>>4) Does a group structure to represent GRD contents map to an
>>OPeNDAP connection, WCS, or netCDF file or do those types of data
>>representations only have netCDF variables and no groups?
>
>In principle you could use Groups, but they really won't be fully
>supported until we get the netcdf-4 file format finished and tested.
>I would advise starting with the simpler case of no groups.
>
>>5) Our understanding of the netCDF Java library is that it has, in
>>particular, the following two entry points:
>> * NetcdfFile : this is the bare netCDF access to files of various
>> types. It doesn't understand anything about coordinate systems.
>> You can add an I/O service provider to handle your favorite file
>> format via a class method. The variables it returns are instances
>> of Variable (which of course don't know anything about coordinate
>> systems).
>> * NetcdfDataset : this is a layer built above the NetcdfFile layer
>> and is the usual interface for applications (e.g., a WCS). It
>> handles converting various attributes into a coordinate system. It
>> has a number of methods relating to adding or getting coordinate
>> systems. These methods seem to be applied to the entire file,
>> rather than to individual variables (or groups).
>
>Coordinate systems are really variable-specific. However, the common
>case is that each dataset has a single coordinate system (or a set
>of closely related ones).
>
>
>>   (from <http://www.unidata.ucar.edu/software/netcdf-java/v2.2.18/javadoc/ucar/nc2/dataset/NetcdfDataset.html>)
>>
>>   CoordinateSystem findCoordinateSystem(java.lang.String name)
>>       // Retrieve the CoordinateSystem with the specified name.
>>   java.util.List getCoordinateAxes()
>>       // Get the list of all CoordinateAxis objects used by this dataset.
>>   java.util.List getCoordinateTransforms()
>>       // Get the list of all CoordinateTransform objects used by this dataset.
>>   boolean getCoordSysWereAdded()
>>       // Has Coordinate System metadata been added.
>>
>>The NetcdfDataset object contains instances of VariableDS. They are
>>like a wrapper for the Variable objects found in the NetcdfFile
>>object. There is a method to ask a VariableDS for the list of
>>coordinate systems associated with it.
>
>exactly
>
>>If we interpret things correctly, when a NetcdfDataset object is
>>built from a NetcdfFile object, the NetcdfDataset object is
>>responsible for figuring out the coordinate system information from
>>attributes in the NetcdfFile, and composing a VariableDS from the
>>coordinate system information and each Variable. In theory, by
>>implementing our own CoordSysBuilder class and registering it, we
>>should be able to add coordinate system information to each
>>VariableDS individually.
>
>Yes, or as I mentioned, use an existing Convention and CoordSysBuilder.
>
>
>>A question then is: do applications like the web coverage server
>>and OPeNDAP server get their coordinate information from VariableDS
>>objects or from the NetcdfDataset object?
>
>
>OPeNDAP is (more or less) at the same level as NetcdfFile, and so
>just faithfully transmits Variables, Attributes, and Dimensions
>across the wire. The coordinate systems are then added by clients
>(like CDM) that understand the convention. We are expecting that
>DAP4, the future OPeNDAP protocol, will add Groups.
>
>WCS, OTOH, works at the coordinate system level, and so uses the
>GridDatatype, which is specialized for "coverage" data, and gets its
>coordinate systems from NetcdfDataset. The client makes requests in
>coordinate space, and we know how to translate that into index
>space. Currently we can send back either geoTiff or netcdf/CF files.
>There are some limitations: the grid spacing must be uniform in WCS
>1.0. We expect to move to WCS 1.1 later this year, which removes
>that limitation. We haven't implemented reprojection/resampling, and
>I'm not sure that we will.
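For a uniformly spaced axis, the translation from coordinate space to index space mentioned above reduces to simple arithmetic. The sketch below illustrates that arithmetic only; UniformAxis is a hypothetical class, not the GridDatatype implementation, and it assumes a one-dimensional axis with positive, constant spacing.

```java
// Hypothetical sketch of coordinate-to-index translation on a uniform 1-D axis.
final class UniformAxis {
    private final double start;    // coordinate value at index 0
    private final double spacing;  // constant distance between neighbouring points (> 0)
    private final int size;        // number of points on the axis

    UniformAxis(double start, double spacing, int size) {
        if (spacing <= 0 || size <= 0) {
            throw new IllegalArgumentException("spacing and size must be positive");
        }
        this.start = start;
        this.spacing = spacing;
        this.size = size;
    }

    // Coordinate value of the grid point at the given index.
    double coordAt(int index) {
        return start + index * spacing;
    }

    // Index of the grid point nearest to coord; values within half a cell of either
    // end snap to the end point, anything farther out is rejected.
    int nearestIndex(double coord) {
        long i = Math.round((coord - start) / spacing);
        if (i < 0 || i >= size) {
            throw new IllegalArgumentException("coordinate outside axis: " + coord);
        }
        return (int) i;
    }

    // First and last indices of the grid points lying inside [lo, hi], clamped to
    // the axis; returns null when no grid point falls in the requested interval.
    int[] indexRange(double lo, double hi) {
        int first = (int) Math.max(0, Math.ceil((lo - start) / spacing));
        int last = (int) Math.min(size - 1, Math.floor((hi - start) / spacing));
        return first <= last ? new int[] {first, last} : null;
    }
}
```

With an axis starting at 40.0 with spacing 0.5 and 21 points, a request for the interval [41.2, 43.6] maps to indices 3 through 7, that is, coordinates 41.5 through 43.5. A non-uniform axis would need a search over the coordinate values instead, which is where the WCS 1.0 uniform-spacing restriction simplifies things.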
>
>>If it is from the NetcdfDataset object, then the strategy of
>>grouping all the grids in a database into a single NetcdfDataset,
>>as outlined above, won't work, and we'd be obliged to use a THREDDS
>>server. Is this correct?
>
>It would likely be a mistake to put a lot of disparate data into the
>same NetcdfDataset. Better to find the right granularity, which is
>typically homogeneous data that shares the same discovery
>metadata. So I would not use the Group mechanism to break the data
>into granules; better to make separate datasets. It's possible that
>such an idiom will develop with Netcdf-4, but better to get
>something working that stays within existing practice, then decide
>if you want to forge ahead. Let me emphasize that it's really
>important to find the right dataset granularity.
>
>This means you want to use THREDDS catalogs to publish the dataset
>URLs and associated metadata, and possibly use TDS to serve your
>data. Once you have an IOSP or equivalent for your data, the main
>work is to develop the catalogs. These can be pretty minimal, but
>automatically populating catalogs with high-quality metadata is a
>huge win in the long run.
>
>I think that would be a powerful value-added product, but of course
>I don't know what your customers really want. As Ted mentioned, it's a
>good time to help influence TDS strategy, and it appears to me that
>your small company with extensive scientific experience would be a
>good fit with Unidata.
>
>John
**********************************************
Ian Barrodale, Ph.D.
President
Barrodale Computing Services Ltd.
Tel: (250) 472-4372 Fax: (250) 472-4373
Web: http://www.barrodale.com
Email: ian at barrodale.com
**********************************************
Mailing Address:
P.O. Box 3075 STN CSC
Victoria BC Canada V8W 3W2
Shipping Address:
Hut R, McKenzie Avenue
University of Victoria
Victoria BC Canada V8W 3W2
**********************************************