
[netCDF #ZXI-494839]: netcdf-4.0.1 OpenMPI-1.3.2 FORTRAN Parallel IO bug fix [SEC=UNCLASSIFIED]



> Hi netCDF developers,
> 
> Please find attached a suggested bug fix for the FORTRAN parallel
> interface to netcdf-4.0.1. The bug arises when using OpenMPI (v1.3.2)
> as the MPI library.
> 
> The FORTRAN parallel read/write netcdf4 example:
> netcdf-4.0.1/nf_test/f90tst_parallel.f90
> fails when using netcdf-4.0.1 and OpenMPI-1.3.2 with the error message:
> 
> *** An error occurred in MPI_Comm_dup
> *** on communicator MPI_COMM_WORLD
> *** MPI_ERR_COMM: invalid communicator
> 
> This is due to passing an MPI_Comm type and MPI_Info type between
> FORTRAN and C without calling the MPI_Comm_f2c() and MPI_Info_f2c()
> functions (see
> http://www.mpi-forum.org/docs/mpi21-report-bw/node355.htm#Node355 for
> more info).
> 
> My suggested fix is to create new C wrapper functions
> (nc_create_par_fortran() and nc_open_par_fortran()) that are only
> called via the FORTRAN interface:
> 
> In libsrc4/netcdf.h:
> EXTERNL int
> nc_create_par_fortran(const char *path, int cmode, MPI_Comm comm,
>           MPI_Info info, int *ncidp);
> ...
> EXTERNL int
> nc_open_par_fortran(const char *path, int mode, MPI_Comm comm,
>           MPI_Info info, int *ncidp);
> 
> In fortran/fort-nc4.c:
> FCALLSCFUN5(NF_INT, nc_create_par_fortran, NF_CREATE_PAR, nf_create_par,
>         STRING, FINT2CINT, FINT2CINT, FINT2CINT, PCINT2FINT)
> 
> FCALLSCFUN5(NF_INT, nc_open_par_fortran, NF_OPEN_PAR, nf_open_par,
>         STRING, FINT2CINT, FINT2CINT, FINT2CINT, PCINT2FINT)
> 
> 
> In libsrc4/nc4file.c:
> /*#ifdef USE_PARALLEL*/
> int
> nc_create_par_fortran(const char *path, int cmode, MPI_Comm comm,
>           MPI_Info info, int *ncidp)
> {
>    MPI_Comm local_comm;
>    MPI_Info local_info;
> 
>    /* Map the Fortran communicator handle to a C MPI_Comm. */
>    local_comm = MPI_Comm_f2c(comm);
> 
>    /* Map the Fortran info handle to a C MPI_Info. */
>    local_info = MPI_Info_f2c(info);
> 
>    /* Only netcdf-4 files can be parallel. */
>    if (!(cmode & NC_NETCDF4))
>       return NC_ENOTNC4;
> 
>    /* Must use either MPIIO or MPIPOSIX. Default to the former. */
>    if (!(cmode & NC_MPIIO || cmode & NC_MPIPOSIX))
>       cmode |= NC_MPIIO;
> 
>    return nc_create_file(path, cmode, 0, 0, NULL, local_comm, local_info, ncidp);
> }
> /*#endif*/ /* USE_PARALLEL */
> 
> ...
> 
> /*#ifdef USE_PARALLEL*/
> int
> nc_open_par_fortran(const char *path, int mode, MPI_Comm comm,
>         MPI_Info info, int *ncidp)
> {
>    MPI_Comm local_comm;
>    MPI_Info local_info;
> 
>    /* Map the Fortran communicator handle to a C MPI_Comm. */
>    local_comm = MPI_Comm_f2c(comm);
> 
>    /* Map the Fortran info handle to a C MPI_Info. */
>    local_info = MPI_Info_f2c(info);
> 
>    /* Only netcdf-4 files can be parallel. */
>    if (!(mode & NC_NETCDF4))
>       return NC_ENOTNC4;
> 
>    /* Must use either MPIIO or MPIPOSIX. Default to the former. */
>    if (!(mode & NC_MPIIO || mode & NC_MPIPOSIX))
>       mode |= NC_MPIIO;
> 
>    return nc_open_file(path, mode, 0, NULL, 1, local_comm, local_info, ncidp);
> }
> /*#endif*/ /* USE_PARALLEL */
> 
> 
> In the two functions above, nc_create_par_fortran() and nc_open_par_fortran(),
> I call MPI_Comm_f2c() and MPI_Info_f2c() to map the FORTRAN communicator
> and info handles to the C types. After applying this fix to the netcdf-4.0.1
> source, the FORTRAN parallel test passes successfully.
> 
> There could be other FORTRAN functions that pass MPI_Comm and MPI_Info
> types to C functions (nf_open() for example), so this bug fix will need to be
> applied in those functions as well.
> 
> Cheers,
> 
> j
> 
> 
> Dr Justin Freeman
> Centre for Australian Weather and Climate Research
> Bureau of Meteorology
> GPO Box 1289
> Melbourne VIC 3001                 +61 3 9669 4487
> Australia                     address@hidden
> 
> 
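
The NC_NETCDF4 checks in the two wrappers depend on how `!` and `&` combine in C. The sketch below is only an illustration of that point, not netCDF code: DEMO_NETCDF4 and both function names are made up here, and the flag value is an assumption (the real NC_NETCDF4 is defined in netcdf.h).

```c
/* Hypothetical flag value for illustration only. */
#define DEMO_NETCDF4 0x1000

/* `!` binds tighter than `&`, so this evaluates (!cmode) & DEMO_NETCDF4.
 * (!cmode) is 0 or 1, and neither value has bit 0x1000 set,
 * so the result is 0 for every cmode and the check never rejects. */
int rejects_unparenthesized(int cmode)
{
    return !cmode & DEMO_NETCDF4;
}

/* Tests the flag first, then negates:
 * nonzero exactly when the flag is missing from cmode. */
int rejects_parenthesized(int cmode)
{
    return !(cmode & DEMO_NETCDF4);
}
```

With these definitions, rejects_parenthesized(0) is nonzero, while rejects_unparenthesized(0) is zero, so only the parenthesized form would return an error for a mode that lacks the flag.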

Howdy Justin!

But what happens to users of other MPI libraries, like MPICH2?

For them, the MPI_Comm_f2c() will cause an error, right?

Thanks,

Ed
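
On the question of other MPI libraries: MPI_Comm_f2c() and MPI_Info_f2c() are defined by the MPI-2 standard, so a conforming library should provide them, and where the C handle is already an integer the conversion is typically trivial. The sketch below uses mock types only, not the real MPI definitions, to show why a conversion step matters when the C handle is an opaque pointer and the Fortran handle is an INTEGER. Every name in it (mock_fint, mock_comm, mock_comm_f2c, mock_comm_c2f) is hypothetical.

```c
#include <stddef.h>

/* Mock types for illustration only; these are NOT the real MPI definitions. */
typedef int mock_fint;                      /* Fortran side: handle is an INTEGER */
struct mock_comm_s { const char *name; };
typedef struct mock_comm_s *mock_comm;      /* C side: opaque pointer handle */

static struct mock_comm_s mock_world = { "WORLD" };
static struct mock_comm_s mock_self  = { "SELF" };

/* Table indexed by the Fortran integer handle. */
static mock_comm mock_table[] = { &mock_world, &mock_self };
#define MOCK_TABLE_LEN (sizeof mock_table / sizeof mock_table[0])

/* Fortran -> C: look the integer up; NULL marks an invalid handle. */
mock_comm mock_comm_f2c(mock_fint f)
{
    if (f < 0 || (size_t)f >= MOCK_TABLE_LEN)
        return NULL;
    return mock_table[f];
}

/* C -> Fortran: find the pointer's index; -1 marks an invalid handle. */
mock_fint mock_comm_c2f(mock_comm c)
{
    size_t i;
    for (i = 0; i < MOCK_TABLE_LEN; i++)
        if (mock_table[i] == c)
            return (mock_fint)i;
    return -1;
}
```

In this mock, treating the integer 0 directly as a pointer would not yield &mock_world, whereas mock_comm_f2c(0) does. If mock_comm were itself an int, both functions could simply return their argument.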

Ticket Details
===================
Ticket ID: ZXI-494839
Department: Support netCDF
Priority: Normal
Status: Open