Re: [netcdf-hdf] collective I/O ?

Hi Rob,

Perhaps you can share your test program, machine, and compiler information with both Ed and me. I may find some time to reproduce this here.

Thanks,

Kent
At 04:32 PM 9/18/2007, Ed Hartnett wrote:
MuQun Yang <ymuqun@xxxxxxxxxxxx> writes:

> Hi Rob,
>
> I did some parallel NetCDF-4 performance tests with the ROMS model
> almost a year ago, using a NetCDF-4 alpha release. At that time I was
> pretty sure NetCDF-4 could successfully issue collective I/O calls
> (MPI_File_write_all with set_file_view) in the HDF5 layer. But I
> remember there were some parameters set wrong inside NetCDF-4 that I
> had to change so that collective I/O calls could be passed through to
> HDF5. I think Ed may have already fixed that, but I may be wrong.

The issue that Kent found was related to the size of unlimited
dimensions. I was storing this information in a special
attribute. But, as Kent pointed out, that meant that every time the
data were appended along that dimension, the attribute had to change,
and that was not good.

That has been fixed (for many moons) so that the attribute is no
longer involved when a record is written to the file.

> Another possibility is that HDF5 "magically" figures out that your
> case is not suitable (or not possible) for collective I/O and silently
> falls back to an independent I/O call instead. To verify that, we need
> your program, the platform, and the MPI-I/O compiler information.
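[A sketch of the HDF5-level knob involved: when collective access is requested, the data transfer property list is set roughly as below. Even then, HDF5 may internally fall back to independent I/O for a given selection; later HDF5 releases (1.8.8 and newer) let you query what actually happened, as noted in the trailing comment.]

```c
#include <hdf5.h>

/* Build a data transfer property list that asks HDF5 for collective
 * I/O (MPI_File_*_all underneath) when passed to H5Dwrite/H5Dread. */
hid_t make_collective_dxpl(void)
{
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
    return dxpl;
}

/* After an H5Dwrite with this dxpl, on HDF5 >= 1.8.8 one can check
 * whether the request was actually carried out collectively:
 *
 *   H5D_mpio_actual_io_mode_t mode;
 *   H5Pget_mpio_actual_io_mode(dxpl, &mode);
 *
 * mode == H5D_MPIO_NO_COLLECTIVE indicates HDF5 fell back to
 * independent I/O for that transfer. */
```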

>>
>>I can verify in Jumpshot that all (in this case) 4 processors are
>>calling independent I/O routines (MPI_File_write_at), even though I
>>was expecting to see collective I/O (MPI_File_write_all or
>>MPI_File_write_at_all).

It would be best if we could reproduce this, but at the moment I don't
quite know how to tell whether independent or collective I/O is actually
being used. What is Jumpshot - your debugger?

Thanks!

Ed

--
Ed Hartnett  -- ed@xxxxxxxxxxxxxxxx