
Re: NetCDF CF "Profile" Conventions for NetCDF



On 11/29/2010 7:04 AM, Reiner Schlitzer wrote:

Hi Rich:

 

Thanks for involving us in the discussion.

 

Presently ODV reads and writes multiple profile data from/to files using your "multidimensional representation" (9.5.1). I assume that we can also deal with the single profile version, but haven't verified this. Should you decide to also recommend the ragged array representations 9.5.3 and 9.5.4, we will support these as well.

 

To facilitate detection of the particular representation used in a netcdf file, I recommend adding a global attribute "featureRepresentation", similar to the "featureType" attribute.

 

I understand that the "multidimensional representation" wastes storage space if stations have very different numbers of observations, but thought that netcdf 4 would solve this problem. Could you please explain why your specifications are still targeted towards the "classic netcdf model", as stated on page 1, and are not building on netcdf 4?

 

Concerning the ragged array representations, I wonder whether these forms allow adding stations without having to re-write the entire file. Extensibility always seemed to be an important property, but having the value of "profiles" fixed seems to preclude this.

 

Another question: are the names of dimensions and meta variables standardized, and if so, could you please point me to a table with these names?

 

Best wishes,

Reiner

 


Hi Reiner:

The CF "discrete sampling" (aka "point") spec has unfortunately changed in a few places since the draft you saw. Hopefully we can say definitively what those changes are very soon.

I agree that adding the global attribute "featureRepresentation" would make things easier to understand. I will suggest that to the committee.

By "netcdf4 solving the problem", I assume that you mean that compression would make this option less wasteful of disk space? That is true (if you do it right), and users may enable compression if they want without affecting CF compliance.
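
To put rough numbers on that trade-off, here is a small sketch in Python. It is not netCDF code: it uses the standard-library zlib module as a stand-in for netcdf-4's deflate compression, and the profile lengths, the fill value, and the function names are all made up for illustration.

```python
import struct
import zlib

FILL = -9999.0  # assumed fill value used for padding; not prescribed by the draft

# Hypothetical number of observations in each profile (deliberately uneven).
obs_per_profile = [5, 1200, 40, 3, 800]


def padded_values(counts, fill=FILL):
    """Lay profiles out as a rectangular (profile, z) array padded with fill."""
    width = max(counts)
    values = []
    for p, n in enumerate(counts):
        values.extend(float(p * 10000 + i) for i in range(n))  # fake measurements
        values.extend([fill] * (width - n))                    # padding
    return values


def ragged_values(counts):
    """Lay profiles out back to back with no padding (ragged layout)."""
    return [float(p * 10000 + i) for p, n in enumerate(counts) for i in range(n)]


def as_bytes(values):
    """Pack the values as 4-byte little-endian floats."""
    return struct.pack("<%df" % len(values), *values)


padded = as_bytes(padded_values(obs_per_profile))
ragged = as_bytes(ragged_values(obs_per_profile))
padded_deflated = zlib.compress(padded, 6)

print("padded, uncompressed:", len(padded), "bytes")
print("ragged, uncompressed:", len(ragged), "bytes")
print("padded, deflated:    ", len(padded_deflated), "bytes")
```

The long runs of identical fill values are what deflate removes cheaply. How close the compressed padded array gets to the ragged layout depends on the real data and on the chunking chosen, so treat the printed sizes as an illustration only.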

Generally, CF is still oriented towards netcdf-3; we haven't started to create alternatives for the netcdf-4 extended model. This is because most of the CF effort is in the modeling community, in particular in supporting the IPCC AR5. By staying with the "classic model", one can use netcdf-4 compression and tiling without making any changes to software. Many, probably most, people are adopting this for now.

In non-grid data types, e.g. "discrete sampling", there's probably a lot more to be gained by using the extended data model, especially structures and multiple unlimited dimensions. This is definitely worth exploring. The CF process is rather laborious, however, so it should probably be explored and "best practice" discovered before proposing anything.

With regard to adding stations to ragged arrays, the best one can do is preallocate a maximum number of stations, due to the limitation of a single unlimited dimension in netcdf-3. This is a good reason why the extended model should be used in this case. It's probably a little obscure, but the proposal allows this; see the 9.3 introduction: "The station_id variable may use missing values. This allows one to reserve more space than is needed for stations."
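
The preallocation idea can be sketched as follows, in plain Python rather than with a netCDF library. The names MAX_STATIONS, MISSING_ID, add_station and used_stations are made up for illustration; only the idea of a station_id variable padded with missing values comes from the proposal.

```python
MISSING_ID = -1    # assumed missing value marking an unused station slot
MAX_STATIONS = 8   # hypothetical preallocated size of the fixed station dimension

# Fixed-length station variable, as it would be with a non-unlimited dimension.
station_id = [MISSING_ID] * MAX_STATIONS


def add_station(ids, new_id):
    """Claim the first unused slot in place; the list length never changes."""
    for slot, value in enumerate(ids):
        if value == MISSING_ID:
            ids[slot] = new_id
            return slot
    raise RuntimeError("no reserved slots left; the file would need rewriting")


def used_stations(ids):
    """Return the ids of the slots that actually hold a station."""
    return [value for value in ids if value != MISSING_ID]


first = add_station(station_id, 1001)
second = add_station(station_id, 1002)
print(first, second, used_stations(station_id), len(station_id))
```

A reader skips the slots that still hold the missing value, and a writer can add stations until the reserved slots run out. After that, the only option in the classic model is to rewrite the file with a larger dimension.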

The actual names of dimensions and variables are not standardized, a long-standing CF "rule".

I have it on my TODO list to look over ODV and see how we can complement your work. Hopefully, I'll have more to say about that sometime soon.

Regards,
John