[netcdf-hdf] Fortran parallel IO in netcdf-beta2/hdf5
Edward (Ted) Mansell
ted.mansell at noaa.gov
Fri May 30 13:00:08 MDT 2008
Howdy,
I've been trying to use parallel IO through the Fortran90 interface with some success. Since there was no Fortran test program, I created one based on tst_parallel.c and tst_parallel3.c. It is available here:
<http://cimms.ou.edu/~mansell/f90tst4.F90>
It tests MPIIO and MPIPOSIX with independent and collective IO (i.e., 4 combinations) and reports timings for each combination. It has been tested on 4 systems so far: 2 shared-memory machines (an Intel Mac Pro with OS 10.5.2/ifort 10.1/gcc/mpich2, and an SGI Altix with BX2?/SUSE linux/ifort 10.1/icc) and 2 clusters (one with Itanium2 processors, one with Xeon, both with mvapich and Intel9 compilers over infiniband).
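For readers who have not used the parallel calls, here is a minimal sketch of where the two choices enter in the Fortran90 interface. It is not the contents of f90tst4.F90. The file name, variable name, and array size are made up for illustration, and it assumes an MPI library plus a netCDF-4 build with parallel HDF5. The driver is selected by a flag in the create mode and the access mode is set per variable.

```fortran
! Sketch only: needs an MPI library and a netCDF-4 build with parallel HDF5.
program par_write_sketch
  use mpi
  use netcdf
  implicit none
  integer, parameter :: nx = 4            ! values per rank (hypothetical size)
  integer :: ierr, rank, nprocs, ncid, dimid, varid, i
  integer :: buf(nx)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  ! The IO driver is part of the create mode. nf90_mpiio is used here;
  ! nf90_mpiposix was the alternative flag in the netCDF-4 releases of this era.
  call check( nf90_create("par_sketch.nc", ior(nf90_netcdf4, nf90_mpiio), ncid, &
                          comm=MPI_COMM_WORLD, info=MPI_INFO_NULL) )
  call check( nf90_def_dim(ncid, "x", nx*nprocs, dimid) )
  call check( nf90_def_var(ncid, "data", nf90_int, (/ dimid /), varid) )
  call check( nf90_enddef(ncid) )

  ! The access mode is set per variable: nf90_collective or nf90_independent.
  call check( nf90_var_par_access(ncid, varid, nf90_collective) )

  ! Each rank writes its own contiguous slab of the shared variable.
  buf = (/ (rank*nx + i, i = 1, nx) /)
  call check( nf90_put_var(ncid, varid, buf, &
                           start=(/ rank*nx + 1 /), count=(/ nx /)) )
  call check( nf90_close(ncid) )
  call MPI_Finalize(ierr)

contains

  ! Abort all ranks with the netCDF error message if a call fails.
  subroutine check(status)
    integer, intent(in) :: status
    if (status /= nf90_noerr) then
      print *, trim(nf90_strerror(status))
      call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
    end if
  end subroutine check

end program par_write_sketch
```

Swapping the create-mode flag and the access constant gives the other three combinations.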
In general, COLLECTIVE with MPIPOSIX has given the best performance. On the Itanium2 cluster, however, it is still much slower than having each processor write separately (open/write/close) in round-robin fashion to a netcdf3 or netcdf4 file, at least for a small number of processors. I still need to test how it scales. The MPIPOSIX tests failed when I increased from 3 to 6 processors (2 processors did not write their data for the second time level). I tried adding an MPI_BARRIER after each file close, but that did not seem to help.
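For comparison, the round-robin approach can be sketched roughly as follows. This is one reading of "open/write/close in round-robin fashion", not code from an actual model. The file name, variable name, and sizes are hypothetical, and it assumes every rank sees the same file system. Only one rank has the file open at a time, so the ordinary serial netCDF calls are enough.

```fortran
! Sketch only: serial netCDF calls, with MPI used just to take turns.
program round_robin_sketch
  use mpi
  use netcdf
  implicit none
  integer, parameter :: nx = 4            ! values per rank (hypothetical size)
  character(len=*), parameter :: fname = "round_robin_sketch.nc"
  integer :: ierr, rank, nprocs, turn, ncid, dimid, varid, i
  integer :: buf(nx)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  ! Rank 0 creates the file and defines the variable, then closes it.
  if (rank == 0) then
    call check( nf90_create(fname, nf90_clobber, ncid) )
    call check( nf90_def_dim(ncid, "x", nx*nprocs, dimid) )
    call check( nf90_def_var(ncid, "data", nf90_int, (/ dimid /), varid) )
    call check( nf90_enddef(ncid) )
    call check( nf90_close(ncid) )
  end if
  call MPI_Barrier(MPI_COMM_WORLD, ierr)

  buf = (/ (rank*nx + i, i = 1, nx) /)

  ! Each rank in turn opens the file, writes its slab, and closes it.
  do turn = 0, nprocs - 1
    if (rank == turn) then
      call check( nf90_open(fname, nf90_write, ncid) )
      call check( nf90_inq_varid(ncid, "data", varid) )
      call check( nf90_put_var(ncid, varid, buf, &
                               start=(/ rank*nx + 1 /), count=(/ nx /)) )
      call check( nf90_close(ncid) )
    end if
    ! The next rank must not open the file until this one has closed it.
    call MPI_Barrier(MPI_COMM_WORLD, ierr)
  end do

  call MPI_Finalize(ierr)

contains

  ! Abort all ranks with the netCDF error message if a call fails.
  subroutine check(status)
    integer, intent(in) :: status
    if (status /= nf90_noerr) then
      print *, trim(nf90_strerror(status))
      call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
    end if
  end subroutine check

end program round_robin_sketch
```

The writes are fully serialized, so the cost grows with the number of ranks. That is why the scaling comparison against the collective mode matters.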
On the Mac I get errors from COLLECTIVE with MPIIO:
[cli_1]: aborting job:
Fatal error in MPI_Type_free: Invalid datatype, error stack:
MPI_Type_free(145): MPI_Type_free(datatype_p=0x100840a70) failed
MPI_Type_free(96).: Cannot free permanent data type
HDF5: infinite loop closing library
R,D,G,A,S,T,D,G,S,F,D,G,A,S,T,F,AC,FD,P,FD,P,FD,P,E,E,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL
The HDF5 test program "testphdf5" does not generate any obvious errors, however, and I think it also tests all 4 file IO combinations. tst_parallel3 also runs without errors. So I wonder if there could be an issue with the netcdf4 Fortran interface? (I've seen this error in earlier versions of hdf5 v1.8 with various netcdf4 snapshots.)
If anybody finds problems with the test program, please let me know!
-- Ted
--
__________________________________________________________
| Edward Mansell
| NOAA/National Severe Storms Laboratory
| Norman, OK 73072
|----------------------------------------------------------------------------
|
| "The contents of this message are mine personally and
| do not necessarily reflect any position of the U.S. Government or NOAA."
|
|----------------------------------------------------------------------------