Re: Updated group API changes draft

Hi Ed,

> "Robert E. McGrath" <mcgrath@xxxxxxxxxxxxx> writes:
> 
> >  From talking with Quincey, I would like to point out a couple of areas
> > where people might want to look carefully.
> >
> > 1. H5Gmove, etc.
> >
> > Technically, time stamps are set when an object is inserted in a group.
> > And technically, H5Gmove is an atomic (delete, then insert).  So it
> > should set the time stamp to the time of the move.
> >
> > Quincey proposes to not update the time.
> >
> > There are arguments for either behavior, so it would be good for
> > people to look at this carefully.

    What I am proposing, essentially, is that the H5Gmove2() operation changes
only the "name" field of the link (potentially moving the entire link to
another group in the process) and no other information in the link record.
(More details on the "link record", etc. below)
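
    As a rough illustration of that behavior (a Python sketch, not HDF5 code;
the Link class, the move() helper and the dictionary-of-groups layout are all
made up for this example), a move would rewrite the name and carry the other
fields over untouched:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Link:
    """Hypothetical model of one link record (not an HDF5 structure)."""
    name: str
    creation_time: int  # e.g. seconds since the epoch
    oid: int            # offset of the object header


def move(groups, src_group, src_name, dst_group, dst_name):
    """Move a link, possibly into another group, changing only its name."""
    link = groups[src_group].pop(src_name)
    # Only the name field is rewritten; creation_time and oid are preserved.
    groups[dst_group][dst_name] = replace(link, name=dst_name)


groups = {
    "/a": {"x": Link("x", 1000, 4096)},
    "/b": {},
}
move(groups, "/a", "x", "/b", "y")
moved = groups["/b"]["y"]
```

    After the move, the record lives in "/b" under its new name, but its
creation time and OID are the ones it had in "/a".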

> I will leave this to you HDF5 programmers. NetCDF doesn't allow moves,
> so I don't use H5Gmove.

    Ok, cool.

> > 2. H5Gset_creation_time
> >
> > This is a funny function that lets you manually set the timestamp on
> > a given link.  I'm not sure we need this at this time.
> >
> > As far as I know, we don't have any use cases for this feature.
> >
> > To me, it is philosophically questionable, because the timestamp isn't
> > supposed to be "whatever I want it to be", it is supposed to be "the
> > time I really did it".
> >
> > Also, I can think of at least 2 pernicious uses of this feature:
> >
> > a) somebody may use this timestamp as a back door way to create an
> > arbitrary numeric index.  E.g., compute some statistic in the dataset,
> > and set the timestamp to that value.  Voila! I have an index on that
> > statistic!
> >
> > b) somebody might do something horrid like store every record in a
> > separate dataset, setting the time stamp from some source such as a
> > real time clock or network time stamp. They are trying to get time
> > sorted records (think of Boeing, merging multiple real time streams.)
> > This would kind of work, but I think it would be a really bad use of
> > HDF5.
> >
> > Again, others may well have different opinions, so it would be good for
> > people to look at this one.
> 
> I think you should store an index, not a time-stamp. Do not let the
> user change the index under any circumstances.

    What we are doing in groups after these revisions (at a higher level of
abstraction) is creating a table of "link" records for the group, with
(currently) three fields: name, creation time, and offset of object header
(i.e. "OID").  Then, the user is allowed to create indices on these fields (and
any others that we create later) and look up links (i.e. objects) in an index
with either a field value or the offset in the index.
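
    To make the abstraction concrete, here is a small Python sketch of such a
table.  It is only a model: the Group class and its index(), by_value() and
by_offset() methods are names invented for this example and say nothing about
how the library would store or build its indices.

```python
from bisect import bisect_left
from collections import namedtuple

# Hypothetical link record with the three fields described above.
Link = namedtuple("Link", ["name", "creation_time", "oid"])


class Group:
    """Toy model of a group as a table of link records."""

    def __init__(self):
        self.links = []

    def insert(self, name, creation_time, oid):
        self.links.append(Link(name, creation_time, oid))

    def index(self, field):
        """Return the links ordered by the given field."""
        return sorted(self.links, key=lambda link: getattr(link, field))

    def by_value(self, field, value):
        """Look up a link by the value of an indexed field."""
        idx = self.index(field)
        keys = [getattr(link, field) for link in idx]
        pos = bisect_left(keys, value)
        if pos < len(keys) and keys[pos] == value:
            return idx[pos]
        return None

    def by_offset(self, field, n):
        """Look up the n-th link in the index on the given field."""
        return self.index(field)[n]


g = Group()
g.insert("temperature", 300, 2048)
g.insert("pressure", 100, 1024)
g.insert("humidity", 200, 4096)

first_by_name = g.by_offset("name", 0)           # "humidity"
first_by_time = g.by_offset("creation_time", 0)  # "pressure"
found_by_oid = g.by_value("oid", 4096)           # "humidity"
```

    The same three records come back in a different order depending on which
field's index is used, which is the whole point of treating the fields
uniformly.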

    Since these are just "fields" in a record about the link and we already
allow the user to change the name field (which changes the order in which the
link will be found according to the "name index"), I don't see any reason not
to allow users to change the "creation time" field also.  Yes, it means that
the field may no longer actually represent the "real" creation time in the
group, but maybe the user is trying to order the objects according to their
actual order of creation before their insertion into the HDF5 group.  As Bob
mentions, this will allow very clever users to twist things around if they'd
like, but they can already do that by changing link names.  Generally, giving
users more power and flexibility has been the model we've tried to provide with
HDF5.
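
    A sketch of the use case I have in mind (Python, purely illustrative; the
set_creation_time() helper here is a stand-in written for this example and is
not the proposed H5Gset_creation_time signature):

```python
# Hypothetical link table: name -> creation time recorded at insertion.
# The objects were inserted into the group in the order run3, run1, run2.
links = {"run3": 50, "run1": 51, "run2": 52}


def set_creation_time(links, name, t):
    """Stand-in for manually overriding a link's creation time field."""
    links[name] = t


# Times at which the data was really produced, known only to the application.
real_times = {"run1": 10, "run2": 20, "run3": 30}
for name, t in real_times.items():
    set_creation_time(links, name, t)

# Walking the "creation time index" now yields the original order.
ordered = sorted(links, key=links.get)
```

    Without the override, the creation time index would return run3, run1,
run2, i.e. the order of insertion rather than the order the user cares about.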

    Additionally, storing the creation time and indexing on it allows for the
possibility of creating a "global" creation time index of objects in more than
one group, because the objects have an absolute measure of their time of
creation instead of a relative one, per group.
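
    For example (again a Python sketch with made-up data, not library code),
two per-group indices sorted on an absolute creation time can simply be merged,
whereas two per-group creation order counters cannot be compared with each
other:

```python
import heapq

# Hypothetical per-group creation time indices: (creation_time, path),
# each already sorted by creation time.
group_a = [(100, "/a/first"), (400, "/a/fourth")]
group_b = [(200, "/b/second"), (300, "/b/third")]

# Absolute times are comparable across groups, so a plain merge works.
global_index = list(heapq.merge(group_a, group_b))

# With relative, per-group counters both groups start at 0, so the counters
# alone say nothing about how /a and /b interleave.
order_a = [(0, "/a/first"), (1, "/a/fourth")]
order_b = [(0, "/b/second"), (1, "/b/third")]
ambiguous = order_a[0][0] == order_b[0][0]
```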

    Those are generally my reasons for suggesting that we store creation time
and index on it.  I do also understand the reasoning for storing a creation
order field and indexing on it, but I think that users will want the ability to
re-order things according to that ordering too, and it would be better to
go with the creation time from the beginning.

    Quincey