Showing entries tagged [hdf5]

HDF5 Dimension Scales - Part 3

The gritty details of what dimension scales look like at the file object level.

[Read More]

HDF5 Dimension Scales - Part 2

In which we travel to alternate universes and cavort with dolphins.

[Read More]

HDF5 Dimension Scales

You really have to use the rich, chocolaty netCDF-4 instead of vanilla HDF5 for earth science data. Here's why.

[Read More]

netCDF Identifiers and Character Escape Mechanisms (sigh!)

Ideally, netCDF should allow any printable UTF-8 character to be used in an identifier. Currently, that is almost the case, with forward slash being the exception because of the syntax of HDF5 identifiers.

More and more, the netCDF API is being used as a wrapper for a wide variety of other formats: HDF5, HDF4, GRIB, BUFR, DAP2, DAP4, etc. In defining translations between netCDF and these other formats, it is necessary to implicitly or explicitly define netCDF identifiers from the schemas of those formats.

The canonical example is HDF5. In HDF5, many API functions take a path, which is a sequence of identifiers separated by '/'. A path may be absolute ("/g1/g2/x") or relative ("y"). There appears to be no way in HDF5 to specify an identifier containing '/'; such names are always interpreted as paths. So, if one naively defined, through the netCDF-4 API, a variable named "/x/y", there is no apparent way to get it defined properly in HDF5. It is this fact that has led to the current, IMO undesirable, restriction that netCDF identifiers may not contain '/'.
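
To make the ambiguity concrete, here is a minimal sketch in Python. It is a simplified model of '/'-separated path parsing, not HDF5's actual path resolution, and the helper name split_path is hypothetical.

```python
def split_path(path):
    """Split a '/'-separated path into (is_absolute, components).

    Simplified model for illustration only; it is not the HDF5 parser.
    """
    is_absolute = path.startswith("/")
    # Drop empty pieces produced by a leading or doubled separator.
    components = [part for part in path.split("/") if part]
    return is_absolute, components


# An absolute path and a relative path, as in the text.
print(split_path("/g1/g2/x"))
print(split_path("y"))

# A variable that was meant to be named "/x/y" is indistinguishable
# from "the object y inside group x under the root".
print(split_path("/x/y"))
```

Under this model the intended single name "/x/y" comes back as two components, which is why a '/' inside an identifier cannot survive the trip into a path-based API.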

Super Escapes

This situation is going to recur as the netCDF API is used to wrap other data formats. What we will need is a mechanism by which we can convert an identifier containing arbitrary UTF-8 characters into another identifier drawn from some rather restricted set of legal identifier characters. In addition, I would impose the rule that the conversion is invertible.

This kind of "super-escaping" is very hard because in the worst case, we are likely to encounter the situation where legal identifier characters are restricted to something like the alphanumerics plus underscore.
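
One possible shape for such an invertible conversion is sketched below in Python. This is an illustration under stated assumptions, not a scheme netCDF has adopted: the target alphabet is assumed to be ASCII alphanumerics plus underscore, underscore is reserved as the escape character, and the function names escape_identifier and unescape_identifier are hypothetical. Every byte of the UTF-8 encoding that is not an ASCII letter or digit is written as '_' followed by two hex digits, so underscore itself becomes "_5F".

```python
import string

# Bytes that pass through unchanged: ASCII letters and digits only.
# Underscore is deliberately excluded because it is the escape character.
_SAFE_BYTES = set((string.ascii_letters + string.digits).encode("ascii"))


def escape_identifier(name):
    """Map any Unicode string to a string over [A-Za-z0-9_]."""
    pieces = []
    for byte in name.encode("utf-8"):
        if byte in _SAFE_BYTES:
            pieces.append(chr(byte))
        else:
            # Escape everything else, including '_' itself, as _XX.
            pieces.append("_%02X" % byte)
    return "".join(pieces)


def unescape_identifier(escaped):
    """Invert escape_identifier; raises ValueError on malformed input."""
    raw = bytearray()
    i = 0
    while i < len(escaped):
        if escaped[i] == "_":
            hex_digits = escaped[i + 1:i + 3]
            if len(hex_digits) != 2:
                raise ValueError("truncated escape at offset %d" % i)
            raw.append(int(hex_digits, 16))
            i += 3
        else:
            raw.append(ord(escaped[i]))
            i += 1
    return raw.decode("utf-8")


print(escape_identifier("/x/y"))
print(unescape_identifier(escape_identifier("/x/y")))
```

The sketch does not handle further restrictions a target format might impose, such as a maximum identifier length or a rule that identifiers must start with a letter; those would need an additional prefix or length convention.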

Data Format Summit Meeting

Last week, on Wednesday, the Unidata netCDF team spent the day with Quincey and Larry of the HDF5 team. This was great because we usually don't get to spend this much time with Quincey, and we worked out a lot of issues relating to netCDF/HDF5 interoperability.

I came away with the following action items:

  • switch to WEAK file close
  • enable write access for HDF5 files without creation ordering
  • deferred metadata read
  • show multi-dimensional atts as 1D, like Java
  • ignore reference types
  • try to allow attributes on user defined types
  • forget about stored property lists
  • throw away extra links to groups and objects (like Java does)
  • work with Kent/Elena on docs for NASA/GIP
  • hdf4 netCDF v2 API writes as well as reads HDF4. How should this be handled?
  • John suggests not using EOS libraries but just recoding that functionality.
  • HDF5 team will release tool for those in big-endian wasteland. It will rewrite the file.
  • should store software version in netcdf-4 file somewhere in hidden att.
  • use HDF5 function to find file type, this supports user block
  • read gip article
  • update netCDF wikipedia page with format compatibility info
  • data models document for GIP?

I have been assured that this blog is write-only, so I don't have to explain any of the above, because no one is reading this! ;-)

The tasks above, when complete, will together add up to a lot more interoperability between netCDF-4 and existing HDF5 data files, allowing netCDF tools to be used on HDF5 files.

Unidata Developer's Blog
A weblog about software development by Unidata developers*