I guess we're thinking along the same lines, because after reading Jon's email my first thought was that I would return a string like "F-TDS - OPeNDAP - netCDF". This would say a lot to me: I would know that the data flowing out might be the result of some sort of server-side calculation, delivered via OPeNDAP and originally read from a netCDF file. But without some sort of agreed-upon convention, it's just a String.

I think loading up these IDs with a bunch of semantics is a bad idea: an invitation for trouble and confusion.

Don't we really want an API where we can ask the relevant questions, like: Are you compressed? Are you coming over the wire as OPeNDAP? What is your underlying storage?
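
As a rough sketch of what such an API might look like (every name here is hypothetical and not part of the CDM or netCDF-Java; treating "unknown" as "compressed" is my own conservative assumption):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DatasetTraits:
    """Hypothetical answers to the three questions; not an existing CDM class."""

    compressed: Optional[bool]          # None means "unknown"
    remote_protocol: Optional[str]      # e.g. "OPeNDAP"; None means local access
    underlying_storage: Optional[str]   # e.g. "netCDF"; None means unknown

    def is_compressed(self) -> bool:
        # Assumption: anything not known to be uncompressed is treated as compressed.
        return self.compressed is not False

    def is_remote(self) -> bool:
        return self.remote_protocol is not None


# An F-TDS virtual dataset: served over OPeNDAP, backed by netCDF files.
ftds = DatasetTraits(compressed=None, remote_protocol="OPeNDAP",
                     underlying_storage="netCDF")
```

A client could then ask ftds.is_remote() and ftds.underlying_storage separately instead of decoding one overloaded string.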


On 04/29/2011 11:20 AM, Ethan Davis wrote:
Hi Roland,

The fileTypeId wasn't really designed with virtual datasets in mind or
with any particular semantics in mind. The IDs are similar to using
software version numbers to indicate capabilities: you just have to know
what they mean.

Since virtual datasets may change the characteristics expected from a
dataset with a given fileTypeId, perhaps we should extend fileTypeIds to
allow for multi-layer names. Maybe something like "ncAggregation - GRIB2".
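
A minimal sketch of how a client might read such a multi-layer name, assuming the layers are separated by " - " and listed outermost first (neither is an agreed convention, and the function name is hypothetical):

```python
from typing import List


def split_file_type_id(file_type_id: str) -> List[str]:
    """Split a multi-layer fileTypeId into its layers, outermost first.

    Splitting on " - " (with spaces) keeps hyphenated names such as
    "F-TDS" intact.
    """
    return [layer.strip() for layer in file_type_id.split(" - ") if layer.strip()]


layers = split_file_type_id("ncAggregation - GRIB2")
```

Under these assumptions the last element would be the underlying storage type and the first the outermost wrapper.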


On 4/29/2011 7:41 AM, Roland Schweitzer wrote:
Hi Jon,

Thanks for the code link and information.  Unfortunately I'm still

It seems that you want to distinguish between reading local data files
and OPeNDAP data sources, and between uncompressed and compressed local
files. It makes sense to want to know this and to code accordingly.
However, Ethan said that TDS promotes the underlying FileTypeId from the
data files when returning the FileTypeId for a TDS aggregation, and you
suggest I do the same for my virtual data sets. It seems to me that this
will give you exactly the wrong information for how you want to classify
the data. An F-TDS data source is by definition an OPeNDAP data source,
but the underlying data will most of the time be a local netCDF file,
i.e. of type netCDF.

So if I follow the suggestion and return the underlying file type, won't
you get the wrong optimization by looking at the FileTypeId?


On 04/29/2011 03:45 AM, Jon Blower wrote:
Hi Roland,

The fileTypeId is used by ncWMS to decide on what algorithm to use to
extract data.  Compressed data (e.g. NetCDF4) and data read over
OPeNDAP have very different performance characteristics to
uncompressed, local data (e.g. NetCDF3, HDF4).

See the code here:

So I guess the fileTypeId of your virtual dataset should match the
underlying file type.  If this isn't easy, then from the WMS point of
view you can put any old string as the fileTypeId and the WMS will be
conservative and won't assume that data-reading is "cheap".
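
The conservative fallback described here can be sketched roughly as follows. This is an illustration only, not the actual ncWMS code: the type labels are taken from the examples in this message, and the function and strategy names are hypothetical.

```python
# File types assumed cheap to read in many small pieces. These labels come
# from the examples in this message, not from the actual ncWMS source.
CHEAP_TYPES = {"NetCDF3", "HDF4"}


def choose_read_strategy(file_type_id: str) -> str:
    """Return a hypothetical strategy name for the given fileTypeId."""
    if file_type_id in CHEAP_TYPES:
        # Uncompressed local data: many small reads are acceptable.
        return "many-small-reads"
    # Compressed data, OPeNDAP sources, and any unrecognised string land
    # here, so data reading is never assumed to be cheap by default.
    return "few-large-reads"
```

With this shape, an ID the WMS has never seen, such as one unique to a new IOSP, simply gets the cautious strategy.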

This is an example of the adage "all abstractions are leaky"...
performance concerns are notorious for messing up nice clean



Jon and Ethan,

Help me understand the best way forward with these FileTypeId in the
IOSP.  Questions below...

On 04/15/2011 03:37 PM, Ethan Davis wrote:
Hi Roland,

What should these [getFileTypeId] values be, Ethan?  Is there an
official enumeration I can reference for known values for these?
I just grabbed them off the Web page:

It makes sense to me to use netCDF since it is the intent of the IOSP
to act like netCDF OPeNDAP in every case.
The ID values should uniquely identify the "file type". The web page
enumerates the values we know. We encourage everyone that implements
an IOSP to select an ID not on the list and let us know so we can
update the list.
So, I think rather than use "netCDF" you should decide on an ID unique
to your IOSP. Or, if all the datasets behind one of your virtual
datasets are always going to be the same type, you could use the type
of the backing datasets. (That is what the CDM Aggregation class does:
it uses the "file type" of the aggregation's "typical dataset".)
Jon, what does ncWMS do with the FileTypeId?  The best decision
from my point of view for what to return seems to depend on how the
value is being used by clients.  I thought the point of the CDM
was that everything looks like netCDF.  If folks are making optimizations
based on the FileTypeId, then for some cases like F-TDS it seems like
they might miss out if the FileTypeId was something other than netCDF.

Let us know if you decide on a new unique ID and we'll add it to the
list.

