Re: Meeting about improving the GRD API.

Ian,

Sounds fine to me, so long as that means I can have access to the L2 (swath)
data.  I am quite happy to project the data myself; the essential thing is
that most geophysical parameters, such as Chla, must be derived from the
various frequencies of water-leaving radiance before the data is projected;
a similar concern applies to SST and cloud masks.

cheers
Dave


----- Original Message -----
From: Ian Barrodale <ian@xxxxxxxxxxxxx>
Date: Friday, February 9, 2007 4:11 pm
Subject: Re: Meeting about improving the GRD API.

> Hi All:
> 
> Based on your feedback (John and Ted), we've been talking and have
> an interesting project idea - brought up today by Peter.
> 
> If we were able to put GRD database support into the Java netCDF
> library and use THREDDS and TDS to serve data, we may
> also be able to provide gridded data reprojection support.
> 
> What do you think of this idea?  Some details follow below.
> 
> If a grd:// URL was allowed to contain a bit of spatial reference text
> so that not only did it specify the database server, login info,
> database name, and variable name, but also the projection and its
> parameters (sort of like an OPeNDAP subset specification), then the
> GRD could reproject any of the gridded data (including satellite swath
> data such as from a POES satellite) to a projection of the user's
> choice.  The spatial reference text could be tacked onto the grd:// URL
> in the records returned by THREDDS according to some number of
> pre-specified projections that we create.
> 
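To make the idea concrete, here is a minimal sketch of how such a URL might be parsed using only java.net.URI. The layout is an assumption made for illustration, not something specified in this thread: the path is taken to be /database[/variable], and the projection and its parameters are taken to be key=value pairs in the query string. The class name GrdUrl, the host db.example.org, and the keys proj and lon_0 are all hypothetical.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder for the parts of a grd:// URL (not part of any real API). */
class GrdUrl {
    final String user;      // null when no login info is embedded
    final String password;  // null when no password is embedded
    final String server;
    final String database;
    final String variable;  // null when the URL names only a database
    final Map<String, String> projection = new LinkedHashMap<>();

    GrdUrl(String text) {
        URI uri = URI.create(text);
        if (!"grd".equals(uri.getScheme()) || uri.getHost() == null) {
            throw new IllegalArgumentException("not a grd:// URL: " + text);
        }
        String[] login = uri.getUserInfo() == null
                ? new String[0]
                : uri.getUserInfo().split(":", 2);
        user = login.length > 0 ? login[0] : null;
        password = login.length > 1 ? login[1] : null;
        server = uri.getHost();

        // Assumed path layout: /database[/variable]
        String[] parts = uri.getPath().replaceFirst("^/", "").split("/", 2);
        database = parts[0];
        variable = parts.length > 1 ? parts[1] : null;

        // Assumed query layout: key=value pairs joined by '&'
        if (uri.getQuery() != null) {
            for (String pair : uri.getQuery().split("&")) {
                String[] kv = pair.split("=", 2);
                projection.put(kv[0], kv.length > 1 ? kv[1] : "");
            }
        }
    }

    public static void main(String[] args) {
        GrdUrl u = new GrdUrl(
            "grd://peter:test123@db.example.org/coastwatch/sst?proj=mercator&lon_0=-125");
        System.out.println(u.server + " " + u.database + " " + u.variable
            + " " + u.projection);
    }
}
```

Keeping the projection in the query string leaves the rest of the URL unchanged for clients that do not ask for reprojection; whether the GRD would accept parameters in this form is an open question.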
> Such a capability would help to support users like Dave Foley and
> Roy Mendelssohn (copied) who have expressed an interest to us in
> storing unmapped satellite data in a spatial database, as well as other
> groups such as the NOAA/NESDIS CLASS archive.  According to Peter,
> the NOAA CoastWatch technical group has been talking for years about
> how nice it would be to be able to pull unmapped satellite data out
> of CLASS in a user-specified projection rather than having to create
> mapped products.
>
> Have a great weekend.
> 
> Regards,
> Ian
> ====================
> 
> 
> 
> 
> At 08:13 AM 2/9/2007, John Caron wrote:
> >Hello all, comments are in-line:
> >
> >Ian Barrodale wrote:
> >>Hi Ted, John, Russ, and John:
> >>Thank you all for taking the time yesterday to both listen to our
> >>story and to further enlighten us about your work.  It was much
> >>appreciated.
> >>The note below provides a possible implementation route, and some
> >>questions.  Please feel free to point out any shortcomings in our
> >>proposed approach, and please provide any answers that come to mind
> >>regarding our questions.
> >>Thanks again,
> >>Ian
> >>======================
> >>
> >>Goal
> >>-------
> >>Based on feedback from BCS Grid DataBlade customers and, in
> >>particular, Ted Habermann, we feel that there may be some value in
> >>providing alternate ways of accessing data from a Grid DataBlade
> >>(GRD)-powered database through existing widely used protocols and
> >>methods.  Note that by "accessing", we really mean just the
> >>reading part, as we already provide, through the BCS Gridded Data
> >>Loader client, a means of conveniently ingesting data from many
> >>forms into a GRD-powered database.  One method of accessing the
> >>data would be to cast it in the form of the Common Data Model
> >>(CDM) supported by the Java netCDF API from UCAR.  The advantages
> >>of this are that:
> >>     * users would be able to write software using the Java netCDF API
> >>       (which is fairly straightforward to use and well documented) for
> >>       accessing GRD data, and
> >>     * data providers can use a GRD-powered database and provide access
> >>       to it through OPeNDAP, WCS, netCDF files, etc. using the Java
> >>       netCDF API (see page 53 attachment, modified from the slide on
> >>       page 53 of
> >>       http://www.unidata.ucar.edu/staff/caron/presentations/CDM.ppt).
> >>
> >>Our understanding of a possible implementation
> >>---------------------------------------------------------------------
> >>To handle GRD data from the Java netCDF API, we would have to:
> >>(i) Create a GRD I/O service provider for the Java netCDF API (see
> >>page 38 attachment) that can communicate with the GRD database
> >>using a combination of JDBC and the existing Java GRD API.  The
> >>Java netCDF API uses a service provider architecture to handle 
> >>reading multiple different file formats and casting them in the 
> >>form of the CDM.
> >>(ii) Create a GRD content manager to handle the georeferencing
> >>information in the GRD.
> >>One possible method for allowing users to access GRD data without a
> >>full THREDDS catalog is to supply some type of unique URL to the
> >>database:
> >>   grd://user:pass@server/database
> >>and the service provider would construct a CDM instance that 
> >>contains a main group of all the grids in the database and allow 
> >>the user to access those grids through the API.
> >>For example:
> >>   grd://peter:test123@xxxxxxxxxxxxxxxxxx/coastwatch
> >>might be a reference to a GRD database running at Barrodale that 
> >>contains gridded NOAA CoastWatch satellite-derived data for some 
> >>number of geographic areas and time periods.  The resulting netCDF
> >>dataset would be one that contains a list of grids under a root
> >>group like a directory structure:
> >>   /
> >>   /sst/
> >>   /sst/northeast/
> >>   /sst/northeast/jan01_2007    <---- a grid
> >>   /sst/northeast/jan02_2007    <---- another grid
> >>   ...
> >>   /chlorophyll/northeast/jan01_2007   <---- a third grid
> >>   /chlorophyll/northeast/jan02_2007   <---- and so on
> >>Whether the user would require a more sophisticated catalog with
> >>querying ability, such as that which THREDDS could supply, depends
> >>on the desired complexity of the grids in the database.
> >
> >see the last answer below.
> >
> >BTW, the TDS will soon have the ability to do proper HTTP-based
> >authentication, and we are hoping to make that a standard in OPeNDAP
> >clients, which can act like browsers and pop up a username/password
> >dialog window, instead of embedding the user:pass@ in the URL.
> >
> >>Questions
> >>---------------
> >>We have the following questions:
> >>1) Where in the netCDF API would the content manager that handles 
> >>GRD georeferencing information sit?
> >>2) How does the I/O SP architecture determine the I/O SP for a given
> >>file:// style URL?  How would it know to handle a grd:// URL
> >>differently?
> >
> >Very perceptive question; let me start here to answer these 2 questions:
> >
> >The IOSP architecture is, in fact, (RandomAccessFile) file based.
> >Since you will be URL based, we have to fit you in at a higher
> >level, namely NetcdfDataset.openFile(). If you look there you will
> >see that we look for opendap (http: or dods:) and thredds: URLs. It
> >might make sense to generalize this to allow plugging in external
> >handlers for your protocol, similar to how java.net.ContentHandler
> >works. Otherwise we might put your code in the core, which is also a
> >possibility.
> >
> >Anyway, NetcdfDataset.openFile() would detect your URL scheme and 
> >call NetcdfFile with your IOSP. We will have to add a new 
> >constructor for that. (You could alternately just subclass 
> >NetcdfFile, which is what DODSNetcdfFile does).
> >
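The pluggable-handler idea described above could look roughly like the following sketch. It is not the netCDF-Java API: SchemeRegistry, register, and open are hypothetical names, and a handler here returns a String instead of a dataset object purely so the example stays self-contained. It only illustrates dispatching on the URL scheme and falling back to the file-based path for everything else.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical registry that picks a dataset opener by URL scheme. */
class SchemeRegistry {
    private final Map<String, Function<URI, String>> handlers = new HashMap<>();
    private final Function<URI, String> fallback;  // stands in for the file-based IOSP path

    SchemeRegistry(Function<URI, String> fallback) {
        this.fallback = fallback;
    }

    /** Registers an external handler for one scheme, e.g. "grd". */
    void register(String scheme, Function<URI, String> handler) {
        handlers.put(scheme.toLowerCase(Locale.ROOT), handler);
    }

    /** Dispatches on the scheme; locations without a scheme are treated as files. */
    String open(String location) {
        URI uri = URI.create(location);
        String scheme = uri.getScheme() == null
                ? "file"
                : uri.getScheme().toLowerCase(Locale.ROOT);
        return handlers.getOrDefault(scheme, fallback).apply(uri);
    }

    public static void main(String[] args) {
        SchemeRegistry registry = new SchemeRegistry(u -> "file-based open of " + u.getPath());
        registry.register("grd", u -> "GRD open on host " + u.getHost());
        System.out.println(registry.open("grd://db.example.org/coastwatch"));
        System.out.println(registry.open("/data/sst.nc"));
    }
}
```

With a registry of this kind, a grd:// handler could live outside the core library and be registered at startup, which is the alternative to putting the GRD code in the core.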
> >As for the "content manager that handles GRD georeferencing
> >information", it could be a CoordSysBuilder subclass. However, this
> >is actually unnecessary if you use an existing Convention, and we
> >would highly recommend using the CF Convention for gridded data.
> >Since you are creating the "file", you can add the attributes and
> >variables needed by that Convention. This makes your data "CF
> >compliant" automatically, which is a real win.
> >
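As a rough illustration of the kind of attributes the CF Convention expects on a simple latitude/longitude grid, the sketch below collects them in plain Java maps. The attribute names and values (Conventions, units of degrees_north and degrees_east, standard_name) come from the CF Convention; the class CfAttributes, the variable names lat and lon, and the use of maps instead of the netCDF-Java attribute classes are assumptions made to keep the example self-contained.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper listing CF attributes for a lat/lon grid variable. */
class CfAttributes {
    /**
     * Returns attributes keyed by variable name; the empty key holds the
     * global attributes. Coordinate variables are assumed to be named lat and lon.
     */
    static Map<String, Map<String, String>> forLatLonGrid(
            String dataVar, String standardName, String units) {
        Map<String, Map<String, String>> atts = new LinkedHashMap<>();
        atts.put("", Map.of("Conventions", "CF-1.0"));
        atts.put("lat", Map.of("units", "degrees_north", "standard_name", "latitude"));
        atts.put("lon", Map.of("units", "degrees_east", "standard_name", "longitude"));
        atts.put(dataVar, Map.of("units", units, "standard_name", standardName));
        return atts;
    }

    public static void main(String[] args) {
        forLatLonGrid("sst", "sea_surface_temperature", "K")
            .forEach((name, a) -> System.out.println(
                (name.isEmpty() ? "(global)" : name) + " " + a));
    }
}
```

A projected grid would need more than this (a grid mapping variable and projection coordinates), so this covers only the simplest case.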
> >>3) Have we interpreted the slide on page 53 correctly -- is there a
> >>server that can serve out data using the CDM (via the Java netCDF
> >>API) as an intermediate step?
> >
> >yes, the THREDDS Data Server
> >
> >>4) Does a group structure to represent GRD contents map to an 
> >>OPeNDAP connection, WCS, or netCDF file or do those types of data 
> >>representations only have netCDF variables and no groups?
> >
> >In principle you could use Groups, but they really won't be fully
> >supported until we get the netCDF-4 file format finished and tested.
> >I would advise starting with the simpler case of no groups.
> >
> >>5) Our understanding of the netCDF Java library is that it has, in
> >>particular, the following two entry points:
> >>     * NetcdfFile : this is the bare netCDF access to files of various
> >>       types. It doesn't understand anything about coordinate systems.
> >>       You can add an I/O service provider to handle your favorite file
> >>       format via a class method. The variables it returns are instances
> >>       of Variable (which of course don't know anything about coordinate
> >>       systems).
> >>     * NetcdfDataset : this is a layer built above the NetcdfFile layer
> >>       and is the usual interface for applications (e.g., a WCS). It
> >>       handles converting various attributes into a coordinate system. It
> >>       has a number of methods relating to adding or getting coordinate
> >>       systems. These methods seem to be applied to the entire file,
> >>       rather than to individual variables (or groups).
> >
> >Coordinate systems are really variable-specific. However, the common
> >case is that each dataset has a single coordinate system (or a set
> >of closely related ones).
> >
> >
> >>     (NetcdfDataset methods; javadoc at
> >>     http://www.unidata.ucar.edu/software/netcdf-java/v2.2.18/javadoc/ucar/nc2/dataset/NetcdfDataset.html)
> >>
> >>     CoordinateSystem findCoordinateSystem(java.lang.String name)
> >>           // Retrieve the CoordinateSystem with the specified name.
> >>     java.util.List getCoordinateAxes()
> >>           // Get the list of all CoordinateAxis objects used by
> >>           // this dataset.
> >>     java.util.List getCoordinateTransforms()
> >>           // Get the list of all CoordinateTransform objects used
> >>           // by this dataset.
> >>     boolean getCoordSysWereAdded()
> >>           // Has Coordinate System metadata been added.
> >>The NetcdfDataset object contains instances of VariableDS. They are
> >>like a wrapper for the Variable objects found in the NetcdfFile
> >>object. There is a method to ask a VariableDS for the list of
> >>coordinate systems associated with it.
> >
> >exactly
> >
> >>If we interpret things correctly, when a NetcdfDataset object is
> >>built from a NetcdfFile object, the NetcdfDataset object is
> >>responsible for figuring out the coordinate system information from
> >>attributes in the NetcdfFile, and composing a VariableDS from the
> >>coordinate system information and each Variable. In theory, by
> >>implementing our own CoordSysBuilder class and registering it, we
> >>should be able to add coordinate system information to each
> >>VariableDS individually.
> >
> >Yes, or as I mentioned, use an existing Convention and CoordSysBuilder.
> >
> >>A question then is: do applications like the web coverage server
> >>and OPeNDAP server get their coordinate information from VariableDS
> >>objects or from the NetcdfDataset object?
> >
> >
> >OPeNDAP is (more or less) at the same level as NetcdfFile, and so
> >just faithfully transmits Variables, Attributes, and Dimensions
> >across the wire. The coordinate systems are then added by clients
> >(like CDM) that understand the convention. We are expecting that
> >DAP4, the future OPeNDAP protocol, will add Groups.
> >
> >WCS, OTOH, works at the coordinate system level, and so uses the
> >GridDatatype, which is specialized for "coverage" data, and gets its
> >coordinate systems from NetcdfDataset. The client makes requests in
> >coordinate space, and we know how to translate that into index
> >space. Currently we can send back either GeoTIFF or netCDF/CF files.
> >There are some limitations: the grid spacing must be uniform in WCS
> >1.0. We expect to move to WCS 1.1 later this year, which removes
> >that limitation. We haven't implemented reprojection/resampling, and
> >I'm not sure that we will.
> >
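For a uniformly spaced axis, the translation from coordinate space to index space mentioned above reduces to simple arithmetic. The sketch below shows one way it might be done; it is not the TDS or GridDatatype implementation, and the class UniformAxis and its method names are hypothetical. It handles axes that run in either direction (for example, latitude stored north to south).

```java
/** Hypothetical uniformly spaced coordinate axis: coord(i) = start + i * spacing. */
class UniformAxis {
    final double start;    // coordinate of grid point 0
    final double spacing;  // constant step; may be negative for a descending axis
    final int size;        // number of grid points

    UniformAxis(double start, double spacing, int size) {
        if (spacing == 0 || size <= 0) {
            throw new IllegalArgumentException("spacing must be non-zero and size positive");
        }
        this.start = start;
        this.spacing = spacing;
        this.size = size;
    }

    /** Index of the grid point nearest to coord, or -1 if that point is off the axis. */
    int indexOf(double coord) {
        long i = Math.round((coord - start) / spacing);
        return (i < 0 || i >= size) ? -1 : (int) i;
    }

    /** Inclusive {first, last} indices of grid points lying in [lo, hi], or null if none. */
    int[] indexRange(double lo, double hi) {
        double a = (lo - start) / spacing;
        double b = (hi - start) / spacing;
        int first = (int) Math.max(0, Math.ceil(Math.min(a, b)));
        int last = (int) Math.min(size - 1, Math.floor(Math.max(a, b)));
        return first > last ? null : new int[] {first, last};
    }

    public static void main(String[] args) {
        UniformAxis lat = new UniformAxis(20.0, 0.5, 41);  // 20.0 to 40.0 in 0.5 degree steps
        int[] r = lat.indexRange(25.2, 27.9);
        System.out.println(r[0] + ".." + r[1]);
    }
}
```

A bounding-box request would apply indexRange once per horizontal axis. A non-uniform axis needs a search over the coordinate values instead of this closed-form arithmetic.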
> >>If it is from the NetcdfDataset object, then the strategy of
> >>grouping all the grids in a database into a single NetcdfDataset,
> >>as outlined above, won't work, and we'd be obliged to use a THREDDS
> >>server. Is this correct?
> >
> >It would likely be a mistake to put a lot of disparate data into the
> >same NetcdfDataset. Better to find the right granularity, which is
> >typically homogeneous data that shares the same discovery
> >metadata.  So I would not use the Group mechanism to break the data
> >into granules; better to make separate datasets. It's possible that
> >such an idiom will develop with netCDF-4, but better to get
> >something working that stays within existing practice, then decide
> >if you want to forge ahead. Let me emphasize that it's really
> >important to find the right dataset granularity.
> >
> >This means you want to use THREDDS catalogs to publish the dataset
> >URLs and associated metadata, and possibly use TDS to serve your
> >data. Once you have an IOSP or equivalent for your data, the main
> >work is to develop the catalogs. These can be pretty minimal, but
> >automatically populating catalogs with high-quality metadata is a
> >huge win in the long run.
> >
> >I think that would be a powerful value-added product, but of course
> >I don't know what your customers really want. As Ted mentioned, it's a
> >good time to help influence TDS strategy, and it appears to me that
> >your small company with extensive scientific experience would be a
> >good fit with Unidata.
> >
> >John
> 
> **********************************************
> Ian Barrodale, Ph.D.
> President
> Barrodale Computing Services Ltd.
> Tel: (250) 472-4372 Fax: (250) 472-4373
> Web: http://www.barrodale.com
> Email: ian@xxxxxxxxxxxxx
> **********************************************
> Mailing Address:
> P.O. Box 3075 STN CSC
> Victoria BC Canada V8W 3W2
> 
> Shipping Address:
> Hut R, McKenzie Avenue
> University of Victoria
> Victoria BC Canada V8W 3W2
> **********************************************
> 
>