Re: [galeon] [WCS-2.0.swg] CF-netCDF standards initiatives

NOTE: The galeon mailing list is no longer active. The list archives are made available for historical reasons.

Hi Folks

blimey, I need to go on holiday more often if discussions like these happen 
when I'm away ... good stuff!

There have been lots of things in this thread.

Anyway, writing a parser for the syntax of netcdf files is harder than writing 
a parser for the syntax of XML files - but neither is hard, and that's not the 
point, and not really what the discussion is about. Writing a parser for the 
semantics of GML cannot be done without understanding the semantics of the 
application schema to which it conforms ... the situation for *some* of the 
semantics of a netcdf file is easier, but when you move to CF it gets harder, 
and CF isn't the end of the game either. IMHO SWE common doesn't make the game 
easier either, it's just another data model ...  albeit one which I have yet to 
understand how to use in a meaningful way in other than very simple cases  
(don't anyone mention Simple Features, it's just another GML application 
schema). I don't think it's worth getting drawn (here) into the merits of the 
various approaches.

The bottom line in this argument is that the difficulty of writing a parser for 
semantics depends on the complexity of the semantics of the data model, not of 
the underlying bucket syntax ... (whether it has data or metadata in it), and 
on whether you can understand those semantics! 
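
A minimal Python sketch of that distinction, added for illustration and not part of the original thread. The XML fragment, the element and attribute names, and the lookup table are all invented assumptions; the only library used is the standard xml.etree.ElementTree module. Recovering names and numbers (the syntax) is a few lines, whereas attaching meaning only works for names that an agreed data model defines.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML fragment; element and attribute names are invented.
DOC = '<obs><value name="ta" units="K">285.4</value></obs>'

# Hypothetical stand-in for an agreed data model: variable name -> meaning.
KNOWN_SEMANTICS = {"ta": "air_temperature"}


def parse_syntax(text):
    """Syntactic parse: recover names, units strings and numbers, nothing more."""
    root = ET.fromstring(text)
    return [
        (v.get("name"), v.get("units"), float(v.text))
        for v in root.iter("value")
    ]


def interpret(records, semantics):
    """Semantic step: meaning is None for any name the data model does not define."""
    return [
        (semantics.get(name), units, number)
        for name, units, number in records
    ]


records = parse_syntax(DOC)
interpreted = interpret(records, KNOWN_SEMANTICS)
```

Here parse_syntax works on any well-formed input of this shape, while interpret is only as good as the lookup table it is handed, which is roughly where the real effort goes in any of the encodings under discussion.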

The key thing folks need to remember about interoperability is that it really 
means "I want to do what I've always done with my data ... but with YOUR data 
(and services)" so it's all about devising interfaces. Again IMHO, this can 
only be done if both sides of the interface are well enough defined (aka 
"standardised", and that's all the word really means: well defined) that you 
can actually do it ...

Folks will know I've blown hot and cold on "standardising" CF-netCDF ... but 
only insofar as I worry that the act of documenting it within the OGC fold 
might break some existing functionality. I'm more than happy to have a body 
which *helps* document existing practice, and provides a methodology for 
*extending* current practice in a well *documented* manner (and provides an 
umbrella for the work, which is often not recognised, if not done in a formal 
process). I also want to ensure that Unidata retains enough flexibility to move 
NetCDF fast enough for the community ... (and slow enough as well :-) ... which 
means I do think it's right to decouple the management of CF from netcdf ...

Bryan 


On Friday 21 August 2009 21:25:14 Ron Lake wrote:
> How so?
> 
> To write a parser for a binary file one needs to know the detailed structure 
> of the file - and if that changes your code is now broken.  With GML or any 
> XML-based encoding the file structure is not known to your application.  
> Where do you see the problems in parsing and understanding GML?
> 
> R
> 
> -----Original Message-----
> From: Gerry Creager [mailto:gerry.creager@xxxxxxxx] 
> Sent: August 21, 2009 1:17 PM
> To: Ron Lake
> Cc: John Caron; Unidata GALEON; wcs-2.0.swg
> Subject: Re: [galeon] [WCS-2.0.swg] CF-netCDF standards initiatives
> 
> I'm not so sure that's true.
> 
> Ron Lake wrote:
> > Hi John:
> > 
> > Surely the GML encoding is going to be simpler to parse and "understand" 
> > than any equivalent binary encoding.
> > 
> > R
> > 
> > -----Original Message-----
> > From: galeon-bounces@xxxxxxxxxxxxxxxx 
> > [mailto:galeon-bounces@xxxxxxxxxxxxxxxx] On Behalf Of John Caron
> > Sent: August 21, 2009 12:52 PM
> > Cc: Unidata GALEON; wcs-2.0.swg
> > Subject: Re: [galeon] [WCS-2.0.swg] CF-netCDF standards initiatives
> > 
> > A few thoughts on these issues:
> > 
> > The WCS/SWE/OGC are making major contributions in elaborating conceptual 
> > models and defining remote access protocols based on them. More 
> > problematic is managing complexity and adoption paths of existing 
> > communities. One of the main weaknesses of WCS is the likely 
> > proliferation of binary encoding formats. WMS has been lucky to be able 
> > to use existing, universal formats like JPEG, and partly because of 
> > that, has been more widely adopted.
> > 
> > A central issue is the relation of the WCS data encoding format to its 
> > GML envelope. Should the data be "just bits" and let the GML encode all 
> > the semantics, or should a binary format also encode some/all of the 
> > semantics? In the first case, any client must understand GML, which is a 
> > big barrier to use by existing applications, plus the possibility exists 
> > that some essential information is offline or unavailable, as Tom W 
> > points out. In the second case, we have a proliferation of data encoding 
> > formats and the possibility that the semantics are encoded differently 
> > between the GML and the data.
> > 
> >  From Unidata's POV, namely that of supporting a large community of 
> > users of netcdf and other scientific data formats, using netCDF-CF to 
> > encode the data is obvious. However, this is not just a matter of one 
> > more group advocating for their favorite bespoke format. IMHO, netCDF is 
> > close to being a "maximally simple implementation" of a binary data 
> > encoding that allows data to be more than "just numbers". Its BNF 
> > grammar fits on one side of a page of paper. If one chooses to encode 
> > semantics into the data format, and not require a client to know GML, 
> > then you need something like netCDF, and you won't find an encoding 
> > significantly simpler. The CF Conventions group has been slowly adding 
> > the necessary semantics for 10 years or so. From that motivation we 
> > decided to recommend netCDF as a WCS encoding format.
> > 
> > The hard part is describing in a formal way the relationship of the 
> > netCDF-CF binary response to the actual request and to any GML part of 
> > the response. Stefano et al are doing the heroic work of describing the 
> > mapping between the two models, but there is and perhaps should be a 
> > loose coupling between the two. There are always edge cases: what 
> > happens when the requested lat/lon bounding box crosses a discontinuity 
> > of the projection plane? One might get different, but reasonable, files 
> > back from different servers of the same dataset. So what can we 
> > standardize? None of this is unique to netcdf-CF or binary encodings.
> > 
> > Taking the first path of letting GML carry all the semantics has many 
> > advantages also, especially for new development and GIS-centric clients, 
> > and as new libraries are developed that can mitigate some of the 
> > complexity of GML. In (our) scientific community, these are new 
> > possibilities, to be experimented with and prototypes tested. Once there 
> > are real services delivering "gotta have" data we will do whatever we 
> > need to do to "get me some of that" ;^)
> > 
> > John Caron
> > Unidata
> > 
> > Steve Hankin wrote:
> >>
> >> Robin, Alexandre wrote:
> >>> Hi Steve,
> >>>
> >>> Just to clarify when I said NetCDF was a "NEW standard" I meant a new 
> >>> standard in OGC.
> >>>
> >>> As I was telling Ben in an offline email, I am totally conscious of 
> >>> its penetration and usefulness in certain communities.
> >>>
> >>> However, _I am not convinced that /having two standards doing the 
> >>> same thing in OGC /is sending the right message and is the best way 
> >>> to go for a standardization organization_.
> >>>
> >> Hi Robin,
> >>
> >> I confess that I was aware of using a cheap rhetorical device when I 
> >> twisted your intended meaning of "NEW". (Begging your tolerance.) It 
> >> was helpful in order to raise more fundamental questions. You have 
> >> alluded to a key question just above. Is it really best to think of 
> >> the target of OGC as the development of a single, definitive 
> >> standard? one that is more general and more powerful than all existing 
> >> standards? Or is it better to think of OGC as a process, through which 
> >> the forces of divergence in geospatial IT systems can be weakened 
> >> leading towards convergence over time? The notion that there can be a 
> >> single OGC solution is already patently an illusion. Which one would 
> >> you pick? WFS? WCS? SOS with SWE Common? SOS with its many other XML 
> >> schema? (Let's not even look into the profusion of WFS application 
> >> schema.) I trust that we are not pinning our hopes on a future 
> >> consolidation of all of these. There is little evidence to indicate 
> >> that we can sustain the focus necessary to traverse that path. The 
> >> underlying technology is not standing still.
> >>
> >> What Ben (and David Arctur and others) have proposed through seeking 
> >> to put an OGC stamp of approval on netCDF-CF technology is similar to 
> >> what OGC has achieved through putting its stamp on KML ("There are 
> >> sound business and policy reasons for doing so.") It is to create a 
> >> process -- a technical conversation if you will -- which will lead to 
> >> interoperability pathways that bridge technologies and communities. 
> >> Real-world interoperability.
> >>> There has been a lot of experimentation with SWE technologies as well 
> >>> that you may not know about and in many communities, especially in 
> >>> earth science.
> >>>
> >>> What I'm saying is that perhaps it is worth testing bridging NetCDF 
> >>> to SWE before we go the way of stamping two 100% overlapping 
> >>> standards as OGC compliant.
> >>>
> >> Complete agreement that this sort of testing ought to occur. And 
> >> interest to hear more about what has been achieved. But great 
> >> skepticism that there is this degree of overlap between the 
> >> approaches. And disagreement that this testing ought to be a 
> >> precondition to OGC recognition of a significant, community-proven 
> >> interoperability mechanism like netCDF. OGC standardization of netCDF 
> >> will provide a forum for testing and experimentation to occur much 
> >> more rapidly and for a 2-way transfer of the best ideas between 
> >> approaches. NetCDF & co. (its API, data model, CF, DAP) have a great 
> >> deal to offer to OGC.
> >>
> >> - Steve
> >>> Regards,
> >>>
> >>> -------------------------------------------------
> >>>
> >>> **Alexandre Robin**
> >>>
> >>> Spot Image, Web and E-Business
> >>>
> >>> Tel: +33 (0)5 62 19 43 62
> >>>
> >>> Fax: +33 (0)5 62 19 43 43
> >>>
> >>> http://www.spotimage.com
> >>>
> >>> Before printing, think about the environment
> >>>
> >>> ------------------------------------------------------------------------
> >>>
> >>> *De :* Steve Hankin [mailto:Steven.C.Hankin@xxxxxxxx]
> >>> *Envoyé :* jeudi 20 août 2009 20:58
> >>> *À :* Tom Whittaker
> >>> *Cc :* Robin, Alexandre; Ben Domenico; Unidata GALEON; wcs-2.0.swg
> >>> *Objet :* Re: [galeon] [WCS-2.0.swg] CF-netCDF standards initiatives
> >>>
> >>> Hi Tom,
> >>>
> >>> I am grateful to you for opening the door to comments "from 10 
> >>> thousand feet" -- fundamental truths that we know from many years of 
> >>> experience, but that we fear may be getting short shrift in 
> >>> discussions of a new technology. I'd like to offer a comment of that 
> >>> sort regarding the interplay of ideas today between Robin ("/I hope 
> >>> we don't have to define a NEW standard .../") and Carl Reed ("/there 
> >>> are other organizations interested in bringing legacy spatial 
> >>> encodings into the OGC. There are sound business and policy reasons 
> >>> for doing so./").
> >>>
> >>> The NEW standard in this discussion is arguably SWE, rather than 
> >>> netCDF. NetCDF has decades of practice behind it; huge bodies of data 
> >>> based upon it; a wide range of applications capable of accessing it 
> >>> (both locally and remotely); and communities that depend vitally upon 
> >>> it. As Ben points out, netCDF also has its own de jure pedigree.
> >>>
> >>> A key peril shared by most IT standards committees -- a lesson that 
> >>> has been learned, forgotten, relearned and forgotten again so many 
> >>> times that it is clearly an issue of basic human behavior -- is that 
> >>> they will try to innovate. Too-common committee behavior is to 
> >>> propose, discuss and document new and intriguing technologies, and 
> >>> then advance those documents through a de jure standards process, 
> >>> despite an insufficient level of testing. The OGC testbed process 
> >>> exists to address this, but we see continually how large the gap is 
> >>> between the testbed process and the pace and complexity of 
> >>> innovations emerging from committees.
> >>>
> >>> Excellent reading on this subject is the essay by Michi Henning, /The 
> >>> Rise and Fall of CORBA/ (2006 -- 
> >>> http://queue.acm.org/detail.cfm?id=1142044). Among the many insights 
> >>> he offers is
> >>>
> >>> **'Standards consortia need iron-clad rules to ensure that they 
> >>> standardize existing best practice.** There is no room for innovation 
> > >>> in standards. Throwing in "just that extra little feature" 
> >>> inevitably causes unforeseen technical problems, despite the best 
> >>> intentions.'
> >>>
> >>> While it adds weight to an argument to be able to quote from an 
> >>> in-print source, this is a self-evident truth. We need only reflect 
> >>> on the recent history of IT. What we need is to work together to find 
> >>> ways to prevent ourselves from continually forgetting it.
> >>>
> >>> There is little question in my mind that putting an OGC stamp of 
> >>> approval on netCDF is a win-win process -- for the met/ocean/climate 
> >>> community and for the broader geospatial community. It will be a path 
> >>> to greater interoperability in the long run and it deserves to go 
> >>> forward. The merits of SWE (or GML) as an alternative approach to the 
> >>> same functionality also deserve to be explored and tested in 
> >>> situations of realistic complexity. But this exploration should be 
> >>> understood initially as a process of R&D -- a required step before a 
> >>> "standards process" is considered. If that exploration has already 
> >>> been done it should be widely disseminated, discussed and evaluated.
> >>>
> >>> - Steve
> >>>
> >>> ==================================
> >>>
> >>> Tom Whittaker wrote:
> >>>
> >>> I may be ignorant about these issues, so please forgive me if I am
> >>> completely out-of-line....but when I looked at the examples, I got
> >>> very concerned since the metadata needed to interpret the data values
> >>> in the "data files" is apparently not actually in the file, but
> >>> somewhere else.  We've been here before:  One of the single biggest
> >>> mistakes that the meteorological community made in defining a
> >>> distribution format for realtime, streaming data was BUFR -- because
> >>> the "tables" needed to interpret the contents of the files are
> >>> somewhere else....and sometimes, end users cannot find them!
> >>>  
> >>> NetCDF and ncML maintain the essential metadata within the files:
> >>> types, units, coordinates -- and I strongly urge you (or whomever) not
> >>> to make the  "BUFR mistake" again -- put the metadata into the files!
> >>> Do not require the end user to have to have an internet connection to
> >>> simply "read" the data....many people download the files and then
> >>> "take them along" when traveling, for example.
> >>>  
> >>> If I simply downloaded the file at
> >>> <http://schemas.opengis.net/om/1.0.0/examples/weatherObservation.xml>
> >>> I would not be able to read it.  In fact, it looks like even if I also
> >>> got the "metadata" file at:
> >>> <http://schemas.opengis.net/om/1.0.0/examples/weatherRecord1.xml>
> >>> I would still not be able to read it, since it also refers to other
> >>> servers in the universe to obtain essential metadata.
> >>>  
> >>> That is my 2 cents worth....and I hope I am wrong about what I saw in
> >>> the examples....
> >>>  
> >>> tom
> >>>  
> >>>   
> >> ------------------------------------------------------------------------
> >>
> >> _______________________________________________
> >> galeon mailing list
> >> galeon@xxxxxxxxxxxxxxxx
> >> For list information, to unsubscribe, visit: 
> >> http://www.unidata.ucar.edu/mailing_lists/ 
> > 
> 



-- 
Bryan Lawrence
Director of Environmental Archival and Associated Research
(NCAS/British Atmospheric Data Centre and NCEO/NERC NEODC)
STFC, Rutherford Appleton Laboratory
Phone +44 1235 445012; Fax ... 5848; 
Web: home.badc.rl.ac.uk/lawrence


