Re: [galeon] [WCS-2.0.swg] CF-netCDF standards initiatives


  • To: Tom Whittaker <whittaker@xxxxxxxx>
  • Subject: Re: [galeon] [WCS-2.0.swg] CF-netCDF standards initiatives
  • From: John Graybeal <graybeal@xxxxxxxxx>
  • Date: Fri, 21 Aug 2009 08:27:48 -0700
I'm in the odd position of agreeing in principle with several writers (keep metadata with data, support non-networked computing, the values are more than the numbers), and then disagreeing with many details. A few examples are below.

On reading Steve Hankin's post, though, I must ask: What exactly is being proposed? A binary data format for files? A set of such binary data formats? Or a protocol for exchanging information? Is this simply a recapture of 'everything netCDF and CF' so that OGC can put a stamp of approval on it?

Ben wrote "This approach will result in a binary encoding which can be used with different access protocols, e.g., WFS or SOS as well as WCS." I don't really know what it means to 'use a binary encoding with SOS'; can we be more precise about that?

In short, having read through the referenced 'core standard' proposal [1], I can't tell what we're trying to do yet.

Other comments on this thread, for those needing distraction:

On Aug 20, 2009, at 10:00 AM, Ron Lake wrote:

I would argue that we should stop this idea that data are just numbers and strings and everything else is "metadata". <snip> Let's start by defining the objects of interest and THEN we can have metadata about them.

After watching thoughtful communities try to carefully describe 'the object of interest', I am sure the proposed 'start' will be a long slow one. I'd rather stick with "one person's data is another person's metadata", and try to avoid getting too excited about the precise distinction between data and metadata, except when it is very narrowly defined on a specific project (not the case in this thread, IMHO).

On Aug 20, 2009, at 9:54 AM, Tom Whittaker wrote:

One of the single biggest mistakes that the meteorological community made in defining a distribution format for realtime, streaming data was BUFR -- because the "tables" needed to interpret the contents of the files are somewhere else....and sometimes, end users cannot find them!

Perhaps this is a problem with the way the tables are made available, and not simply the fact they are separate from the data stream? After all, many image files (for example) are not described internally at all, but no one seems to have trouble working with those images.... (I know that's oversimplifying the difference, but it's instructive nonetheless.)

NetCDF and ncML maintain the essential metadata within the files:
types, units, coordinates -- and I strongly urge you (or whomever) not
to make the "BUFR mistake" again -- put the metadata into the files!

Maybe you think all the essential metadata is within the netCDF file, but in my opinion it isn't. I often find the essential metadata, particularly of the semantic variety, to be absent. And I know of communities that have had significant difficulty with the provenance (for example) within CF/netCDF files.

The generalization (point) of this observation is that different people require different metadata, sometimes arbitrarily complex or peripheral metadata. And I don't think you want ALL that metadata in the same file as the data -- especially when the data may be coming not in a file, but in a stream of records.

Do not require the end user to have to have an internet connection to simply "read" the data.... many people download the files and then "take them along" when traveling, for example.

Ah, in the era of linked data, or LinkedData [2] -- which will be our era 5 years from now, if not already -- this problem will be solved, because all will insist on having the internet connection when they are traveling. Witness the trajectory of internet availability at scientific conferences.

If I simply downloaded the file at
<http://schemas.opengis.net/om/1.0.0/examples/weatherObservation.xml>
I would not be able to read it. In fact, it looks like even if I also got the "metadata" file at:
<http://schemas.opengis.net/om/1.0.0/examples/weatherRecord1.xml>
I would still not be able to read it, since it also refers to other servers in the universe to obtain essential metadata.

Uh... I think you may be a bit wrong about what you saw in the examples. The first file is crudely readable if not comprehensively described (to say the least), but by the designer's choice this file references more detailed metadata in a second file. (The file creator didn't have to do that per the spec, but in some observing systems I would say it makes sense.) Nothing in the second file appears to refer to 'essential metadata' in other files... depending on what you think of as essential of course. (The .xsd for example is more of a format specification, not a bit of central metadata. By analogy, I can't find the reference in a netCDF file to any specification of its format, so I guess it wouldn't qualify as containing all the essential metadata in that sense either.)

John

[1] Core standard OGC draft: http://sites.google.com/site/galeonteam/Home/cf-netcdf-candidate-standard
[2] Linked Data: linkeddata.org



--------------
NOTE NEW EMAIL ADDRESS
--------------
John Graybeal   <mailto:jbgraybeal@xxxxxxxxxxxxxx>
Marine Metadata Interoperability Project: http://marinemetadata.org


