Re: question about HDF5 parallel use of unlimited dimensions...

> "Robert E. McGrath" <mcgrath@xxxxxxxxxxxxx> writes:
> > On Thu, 30 Jun 2005, Ed Hartnett wrote:
> >
> >> As I understand it, using fixed-length data, my different processes
> >> could write their data and go on their merry way, without waiting for
> >> anything.
> >
> > This can be done only if you know that the different processors are
> > not updating the same chunk.
> Surely this is easy to arrange.

    Actually, the HDF5 library correctly handles the case where several
processes update the same chunk, even for independent writes.

> > Fixed vs extended is not relevant to the basic issue: you must
> > coordinate all writes in parallel.
> But aren't writes that don't call extend independent? Doesn't that
> mean processes writing to them don't have to coordinate?
> Or did you mean they have to coordinate in that they each have to know
> what chunks they can write, but they don't have to wait for the other
> processes to do their writing.

    They do have to coordinate in the sense that the processes shouldn't
expect any ordering among their independent writes, which is only a problem
if the elements they write overlap.  Otherwise, as I said above, two
processes can write to the same (non-compressed) chunk, either independently
or collectively.
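
    One way to make sure the writes never overlap is to give each process
its own contiguous block of rows.  The following is only a rough sketch in
plain Python, with no HDF5 or MPI calls; row_range and ranges_overlap are
hypothetical helper names, and the even split by rank is an assumption made
for illustration, not something the library requires.

```python
def row_range(rank, nprocs, nrows):
    """Return the half-open [start, stop) row range owned by `rank`.

    Rows are split as evenly as possible; the first `nrows % nprocs`
    ranks each get one extra row.
    """
    base, extra = divmod(nrows, nprocs)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def ranges_overlap(a, b):
    """True if two half-open ranges share at least one row."""
    return max(a[0], b[0]) < min(a[1], b[1])


# Hypothetical layout: 4 processes sharing a 10-row dataset.
ranges = [row_range(r, 4, 10) for r in range(4)]
```

    With a layout like this, no two processes touch the same element, so
the lack of ordering between independent writes does not matter, even when
two blocks happen to fall in the same chunk.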

> >> As currently implemented, H5Dextend is called when needed as you write
> >> the data in netcdf. That is, if you are writing a record at a time,
> >> H5Dextend is called for each record.
> >
> > This will be quite slow in parallel. But it will work.
> >
> > Presumably, users can control this by batching the writes.
> >
> Well, if we batched the extends, that would help, right?
> For example, instead of extending it one record at a time, I could
> extend it 10 records at a time, and then for 9 out of 10 writes, I
> wouldn't have to call extend.
> Or am I barking up the wrong tree?

    No, that's a reasonable thing to try.  It will require some tuning,
and the netCDF4 layer will have to keep some state.
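
    To illustrate the kind of state I mean, here is a minimal sketch in
plain Python.  Nothing in it is real netCDF4 or HDF5 code: BatchedExtender
is a hypothetical name, and the `extend` callback merely stands in for
whatever would end up calling H5Dextend.  The batch size of 10 is the figure
from your example and is exactly the part that would need tuning.

```python
class BatchedExtender:
    """Track allocated vs. written records and extend in batches.

    `extend` is a hypothetical callback taking the new total number of
    records; in a real layer it would wrap the actual extend call.
    """

    def __init__(self, extend, batch=10, allocated=0):
        self._extend = extend
        self.batch = batch
        self.allocated = allocated  # records the dataset has room for
        self.written = 0            # highest record index written, plus one

    def ensure(self, record):
        """Make room for 0-based `record`; return True if an extend ran."""
        self.written = max(self.written, record + 1)
        if record < self.allocated:
            return False  # already covered by an earlier batch
        # Round the size up to the next multiple of the batch size.
        new_size = (record // self.batch + 1) * self.batch
        self._extend(new_size)
        self.allocated = new_size
        return True


# Writing 25 records one at a time triggers only three extends.
calls = []
ext = BatchedExtender(calls.append, batch=10)
for rec in range(25):
    ext.ensure(rec)
```

    Note that the dataset ends up larger than the number of records
actually written (30 allocated against 25 written here), so the layer has to
remember the true record count separately and deal with the slack somehow,
for example when the file is closed.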


> Thanks!
> Ed
> -- 
> Ed Hartnett  -- ed@xxxxxxxxxxxxxxxx