Re: Schedule decision

NOTE: The netcdf-hdf mailing list is no longer active. The list archives are made available for historical reasons.


>     I've been working on revising groups in HDF5 files (to allow for creating
> groups which track creation order, among other things) and it's become obvious
> to me that my first attempt at implementing the indices required will not be
> a good long-term solution.  Switching horses midstream could delay getting the
> HDF5 1.8.0 beta release by ~6 weeks if I change the indexing implementation
> right now.  I can, however, continue with the flawed index implementation I
> currently have and build up to most of the API changes that would be required
> and then go back and revise the guts of the library to use a better data
> structure on disk for storing the indices required.
>     This would allow outside applications/libraries (like netCDF-4) to mostly
> stabilize their code on the new API while I went back and reworked internal
> things.  This has several trade-offs that I can think of:
>       A - It gets a [reasonably] stable API to testers somewhat sooner.
>       B - It's going to take longer, because I'll have to re-do some work.
>       C - Files created during the "transition period" will _never_ be able to
>             be read by any other version of the HDF5 library - they must be
>             discarded by testers.  
>     If we've got enough flexibility in our schedules, I would prefer to avoid
> doing the re-work and just get things right first.  But, since there is an
> alternate plan that could work, I thought I would raise the issue.
>     What does everyone think?

We think you should "switch horses in midstream" (being careful not to
slip into the current :-) and implement the index using the better
data structure you've discovered.

Speaking of horses and mangled metaphors, allow me to try beating a
dead horse to see if it bears fruit :-).

If you are reconsidering the implementation of creation order
tracking, we would also suggest reconsidering whether timestamps are
the right way to store information about order of creation.  It seems
entirely plausible that two Datasets could be created in the same
Group within a very short time interval, get the same timestamp, and
then information about their creation order would be lost.  A simple
sequence number that is incremented for each object would preserve the
creation order no matter how fast creation occurs, and would represent
all the information netCDF-4 needs.
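
To make the concern concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not the HDF5 or netCDF-4 API: the Group class, its methods, and the fixed clock are all hypothetical. The clock is assumed to have coarse resolution and is frozen so that three objects land in the same tick.

```python
from itertools import count


class Group:
    """Toy container (hypothetical; not the HDF5 or netCDF-4 API)."""

    def __init__(self, clock):
        self._clock = clock        # callable returning a timestamp
        self._next_seq = count()   # per-group creation counter: 0, 1, 2, ...
        self.members = {}          # name -> (timestamp, sequence number)

    def create(self, name):
        # Record both a timestamp and a sequence number for comparison.
        self.members[name] = (self._clock(), next(self._next_seq))

    def names_by_timestamp(self):
        # Equal timestamps leave only the name as a tie-breaker,
        # so the true creation order cannot be recovered.
        return sorted(self.members, key=lambda n: (self.members[n][0], n))

    def names_by_sequence(self):
        # The sequence number is unique per object within the group.
        return sorted(self.members, key=lambda n: self.members[n][1])


# Assumed clock: one-second resolution, not advancing between calls,
# standing in for several objects created within the same tick.
group = Group(clock=lambda: 1100000000)
for name in ("temperature", "pressure", "humidity"):
    group.create(name)

by_time = group.names_by_timestamp()
by_seq = group.names_by_sequence()
```

With identical timestamps, by_time falls back to alphabetical order, while by_seq returns the names in the order they were created. A counter also needs no agreement about clock resolution between the machine that writes a file and the one that reads it.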

