netCDF Attribute Convention for Dataset Discovery - Issues and ToDo List

Ethan Davis

last updated 26 September 2005


To Do List



Issues



Comparing Proposed Attributes to CF Attributes


CF Attribute | Proposed Attribute | Discussion
comment
comment (add to proposal?)
This seems to be a very general slot for comments on the data, the project, or the processing. I'm not sure how this would fit into the data discovery arena. It could simply be used as text to feed into a free-text search (an extension to summary?). That is, how would this be mapped into THREDDS metadata (maybe documentation)?
summary
More general than a summary.

acknowledgement
I don't see a place for this in CF. Maybe comment.
history
history
Pretty direct mapping from CF to this proposal. How does it compare with processing_level?
institution
creator_name
creator_url
creator_email
Good semantic mapping to/from CF (in proposal, creator can be individual or institution). However, the more structured nature of the creator_* attributes might cause problems with an actual mapping to/from the more free text nature of the institution attribute.
contributor_name
contributor_role
Another possible mapping (contributor can also be individual or institution).
id
naming_authority
Not a good match. The id/naming_authority pair is intended to provide a "globally" unique ID for a dataset; it doesn't have to be related to the creation of the dataset.
project
Kind of one level above creator/institution. More of a "why was this dataset created" rather than "where was it created".
summary
The summary is intended as a human-readable description of the dataset that can be used in free-text searches. It should probably contain creator/institution information.
references
references (add to proposal?)
Certainly good information to have but I'm not sure how this would be used in data discovery.
source
source (add to proposal?)
Seems like this would be a good addition to the proposed attributes. This information should probably also be in the summary attribute.
processing_level
As Jonathan said, this is a bit vague. However, some places have specific processing level terminology. Do we want to allow for specifying controlled lists of values?
project
I think project fits better in the creator/institution area than source.
summary
The summary is intended as a human-readable description of the dataset that can be used in free-text searches. It should probably contain source information.
standard_name
standard_name
Direct mapping between CF and this proposal. The only change is to allow use of non-CF standard name values (which should only be done if the CF convention is not being followed). This is done by naming, in the standard_name_vocabulary attribute, the controlled vocabulary of variable names that is being used.

???: For a CF file, values must be from the CF standard name table. Do we want to allow CF-compliant files to have alternate "standard names"? If so, we need to use an attribute name other than "standard_name".
title
title
Direct mapping between CF and this proposal.

time_coverage_*
geospatial_*
Some points from Jonathan: 1) this info can be deduced from the coordinate variables; 2) these attributes need to be rewritten if a subselection is made.

We do need some way to bubble this information up to tools that harvest dataset discovery information that won't be CF aware (some digital libraries won't even be all that data aware). We're also looking (in THREDDS) at containing this info at the catalog level. So, maybe that is a better solution.
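
Both of Jonathan's points about the time_coverage_* and geospatial_* attributes can be illustrated with a small sketch. It is not part of the proposal. It assumes the coordinate values have already been read into plain Python lists (reading them from a netCDF file is omitted), the function name coverage_attributes is hypothetical, and the specific attribute suffixes (_lat_min, _start, and so on) are assumptions layered on the time_coverage_* and geospatial_* patterns named above.

```python
from datetime import datetime, timedelta


def coverage_attributes(lats, lons, time_offsets_hours, time_origin):
    """Derive discovery-style coverage attributes from coordinate values.

    lats, lons: sequences of latitude/longitude values in degrees.
    time_offsets_hours: sequence of offsets, in hours, from time_origin.
    time_origin: datetime that the time offsets are relative to.

    Note: taking a plain min/max of longitude is a simplification; it does
    not handle a region that crosses the dateline.
    """
    start = time_origin + timedelta(hours=min(time_offsets_hours))
    end = time_origin + timedelta(hours=max(time_offsets_hours))
    return {
        # Attribute suffixes below are assumed for illustration.
        "geospatial_lat_min": min(lats),
        "geospatial_lat_max": max(lats),
        "geospatial_lon_min": min(lons),
        "geospatial_lon_max": max(lons),
        "time_coverage_start": start.isoformat(),
        "time_coverage_end": end.isoformat(),
    }


# Made-up coordinate values, for illustration only.
lats = [30.0, 35.0, 40.0, 45.0]
lons = [-110.0, -105.0, -100.0]
hours = [0, 6, 12, 18]
origin = datetime(2005, 9, 26)

# Point 1: the coverage can be deduced from the coordinate variables.
full = coverage_attributes(lats, lons, hours, origin)

# Point 2: a subselection (northern half, first two time steps) changes the
# coverage, so attributes copied from the full dataset would be stale.
subset = coverage_attributes(lats[2:], lons, hours[:2], origin)

print(full["geospatial_lat_min"], subset["geospatial_lat_min"])
print(full["time_coverage_end"], subset["time_coverage_end"])
```

A harvesting tool that is not CF aware could read a dictionary like this directly, whether it is stored as global attributes in the file or held at the catalog level. Either way, the values have to be regenerated whenever a subset is produced.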