I think that the next really important feature of Scientific Data Format/Model design is addding good API support for georeferencing coordinate systems. Arguably, HDF, netCDF, and OpenDAP are the best/widely used general purpose scientific data format/access protocols. All have made some steps towards specifying coordinate systems (netCDF coordinate variables, HDF dimension scales, OpenDAP Grid map variables) but I think none have done a complete job.
Most well-written datasets specify the coordinate systems in the dataset, but without explicit library support, they do so in different ways, often documented by a Convention. We are working closely with the HDF5 developers, who are interested in extending dimensions scales, and we are considering extensions of the netCDF model to comprise netCDF-4. With OpenDAP 4.0 under consideration, it seems like a unique and fortuitous time for all 3 data models to consider new features in ways that are compatible with each other.
I have done some prototyping of the functionality I think we need for coordinate systems in the netcdf-Java library, see the user manual, chapter 3.2 for some of the details, and implemented the prototype in the CoordinateSystem, CoordinateAxis, and CoordinateTransform classes of the ucar.nc2.dataset package (see the javadoc).
To summarize some definitions:
Some of the things we may need that are not currenly expressible by the OPeNDAP Grid datatype:
There is a good argument whether HDF/netCDF/OpenDAP should remain concerned only with the syntax of datasets, and build layers on top to handle the semantics. I think the layered approach is correct, but I think each of those libraries should ship with a standard API for handling semantics. This will then encourage dataset providers and server writers to write the datasets in ways that get correctly interpreted by the semantic handler(s). Further, to make dataset entries in DL/GCMD-type search facilities useful, we need at least spatial and temporal bounding boxes, and viewers need full coordinate systems to correctly visualize the data. This is why I think a standard way of specifying georeferencing coordinates systems is the most importand thing we could add to our Data Models and APIs.