Re: [netcdfgroup] NetCDF external links

Dear NetCDF-Group,

about half a year ago we discussed the integration of external links in NetCDF.
Motivation:
At our institution, people already work with multiple data files (grid and data separated) to avoid replicating the grid when a file contains only one timestep.

Here is a short summary of the last discussion:
1. Our implementation of external links is based on HDF5 Virtual Datasets (VDS). It allows a variable defined in another file to be used as one of the dimensions.
2. Possible application fields are data deduplication and I/O optimization.
- When data and grid are stored in separate files, the grid can be reused. No duplication of the grid is necessary.
- I/O optimization is achieved by saving storage space and network bandwidth.
3. Until now, there was an implicit assumption that NetCDF files must be self-contained, i.e., all data must be stored in a single file.
4. This feature is not mandatory, nor does it change anything in the regular NetCDF4 file format. It can be used when necessary.
5. Storage of data in multiple files has been discussed:
- What happens if one file is missing?
The conclusion was that the file is still valid, because in that case the default values will be used, but the data file is useless for the application, because the data cannot be interpreted.
- Are all files (data and grid files) valid NetCDF4 files?
The files using links are not backwards compatible.
6. We believe the single-file semantics must go away in the long term; this approach is an intermediate step.
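
For readers unfamiliar with VDS, the mechanism behind points 1 and 5 can be sketched at the HDF5 level with h5py. This is not the NetCDF patch itself, only a minimal illustration under stated assumptions: the file names (grid.h5, timestep_0001.h5), the dataset names (lat, temperature), and the fill value -9999.0 are all hypothetical. The data file contains a virtual dataset that maps onto the grid stored in a separate file. When the grid file is removed, the data file should still open, and reading the linked dataset should return the fill value.

```python
import os
import tempfile

import h5py
import numpy as np

FILL = -9999.0  # hypothetical fill value, chosen for illustration


def build_linked_files(workdir):
    """Write a grid file and a data file whose 'lat' dataset is a VDS over the grid file."""
    grid_path = os.path.join(workdir, "grid.h5")
    data_path = os.path.join(workdir, "timestep_0001.h5")

    # The grid is written once and could be shared by many per-timestep files.
    with h5py.File(grid_path, "w") as grid:
        grid.create_dataset("lat", data=np.linspace(-90.0, 90.0, 5))

    with h5py.File(data_path, "w", libver="latest") as data:
        data.create_dataset("temperature", data=np.full(5, 280.0))
        # Map all of grid.h5:/lat into a virtual dataset of the same shape.
        layout = h5py.VirtualLayout(shape=(5,), dtype="f8")
        layout[:] = h5py.VirtualSource(grid_path, "lat", shape=(5,))
        data.create_virtual_dataset("lat", layout, fillvalue=FILL)

    return grid_path, data_path


def read_lat(data_path):
    """Read the linked grid variable through the data file."""
    with h5py.File(data_path, "r") as data:
        return data["lat"][:]


with tempfile.TemporaryDirectory() as workdir:
    grid_path, data_path = build_linked_files(workdir)
    with_grid = read_lat(data_path)     # values come from grid.h5
    os.remove(grid_path)                # simulate the missing grid file
    without_grid = read_lat(data_path)  # the source is gone, fill values are expected

print(with_grid)
print(without_grid)
```

The second read illustrates the concern raised in point 5: the data file remains readable, but the grid values it returns carry no information.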

We would like to see this feature added to the NetCDF standard.
We can provide a patch for configure to include support only when the required HDF5 version is available.
Is there anything else necessary to help integrate this feature into NetCDF?
- Do we need better understanding of saving data in multiple files?
- Shall we provide a well tested and documented implementation?
- How large must the number of interested people be in order to justify the integration of this feature?

You can find a patch on our website:
http://wr.informatik.uni-hamburg.de/research/projects/bullio/netcdf_external_links/start

We would like to reopen the discussion.
Please provide a clear rejection if, for some reason, this feature can never be a part of NetCDF.

Regards,
Eugen


