[netcdfgroup] netcdf-fortran issues with intel/openmpi

I'm trying to build netcdf-fortran with Open MPI and the Intel compilers. With 4.2 I get the following errors when running the tests:

Testing netCDF parallel I/O through the F90 API.

 *** Testing netCDF-4 parallel I/O from Fortran 90.
 Error: No such file or directory
2
 Error: No such file or directory
2
 Error: Parallel operation on file opened for non-parallel access
forrtl: error (78): process killed (SIGTERM)
Image              PC                Routine               Line     Source
libhdf5.so.6       000000305B2F9D73  Unknown               Unknown  Unknown
libhdf5.so.6       000000305B36F5DA  Unknown               Unknown  Unknown
libhdf5.so.6       000000305B544EED  Unknown               Unknown  Unknown
libhdf5.so.6       000000305B439923  Unknown               Unknown  Unknown
libhdf5.so.6       000000305B438F1A  Unknown               Unknown  Unknown
libhdf5.so.6       000000305B234AB7  Unknown               Unknown  Unknown
libhdf5.so.6       000000305B236680  Unknown               Unknown  Unknown
libnetcdf.so.6     0000003ACE44C56A  Unknown               Unknown  Unknown
libnetcdf.so.6     0000003ACE4996A8  Unknown               Unknown  Unknown
libnetcdf.so.6     0000003ACE449D6B  Unknown               Unknown  Unknown
libnetcdff.so.5    00007F8324CFD06B  Unknown               Unknown  Unknown
libnetcdff.so.5    00007F8324D00A9E  Unknown               Unknown  Unknown
lt-f90tst_paralle  0000000000409494  Unknown               Unknown  Unknown
lt-f90tst_paralle  000000000040892C  Unknown               Unknown  Unknown
libc.so.6          000000305121ECDD  Unknown               Unknown  Unknown
lt-f90tst_paralle  0000000000408829  Unknown               Unknown  Unknown
2
--------------------------------------------------------------------------
mpiexec has exited due to process rank 2 with PID 28157 on
node vulcan.cora.nwra.com exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).
--------------------------------------------------------------------------
FAIL: run_f90_par_test.sh
Testing netCDF parallel I/O through the F77 API...
7

 *** Testing netCDF-4 parallel I/O from Fortran 77.
7
7
7
forrtl: error (78): process killed (SIGTERM)
Image              PC                Routine               Line     Source
lt-ftst_parallel   000000000047AE3A  Unknown               Unknown  Unknown
lt-ftst_parallel   0000000000479936  Unknown               Unknown  Unknown
lt-ftst_parallel   0000000000431C40  Unknown               Unknown  Unknown
lt-ftst_parallel   00000000004118DE  Unknown               Unknown  Unknown
lt-ftst_parallel   0000000000415073  Unknown               Unknown  Unknown
libpthread.so.0    0000003051E0F500  Unknown               Unknown  Unknown
libhdf5.so.6       000000305B2861D2  Unknown               Unknown  Unknown
libhdf5.so.6       000000305B27BD6A  Unknown               Unknown  Unknown
libhdf5.so.6       000000305B2577FA  Unknown               Unknown  Unknown
libhdf5.so.6       000000305B2CE7DD  Unknown               Unknown  Unknown
libhdf5.so.6       000000305B2D193B  Unknown               Unknown  Unknown
libhdf5.so.6       000000305B2D1395  Unknown               Unknown  Unknown
libhdf5.so.6       000000305B36EC5D  Unknown               Unknown  Unknown
libhdf5.so.6       000000305B2CB2DE  Unknown               Unknown  Unknown
libhdf5.so.6       000000305B2352DC  Unknown               Unknown  Unknown
libc.so.6          0000003051235DB2  Unknown               Unknown  Unknown
lt-ftst_parallel   0000000000423037  Unknown               Unknown  Unknown
lt-ftst_parallel   0000000000408C0E  Unknown               Unknown  Unknown
lt-ftst_parallel   00000000004087EC  Unknown               Unknown  Unknown
libc.so.6          000000305121ECDD  Unknown               Unknown  Unknown
lt-ftst_parallel   00000000004086E9  Unknown               Unknown  Unknown
--------------------------------------------------------------------------
mpiexec has exited due to process rank 3 with PID 28595 on
node vulcan.cora.nwra.com exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).
--------------------------------------------------------------------------
FAIL: run_f77_par_test.sh


Any ideas?

In netcdf-fortran-4.2/f90/netcdf4_file.f90:

     if (present(comm)) then
        nf90_create = nf_create_par(path, cmode, comm, info, ncid)
     else
        nf90_create = nf_create(path, cmode, ncid)
     end if

It looks like the file is created in parallel mode whenever comm is present, and the test does pass comm:

call handle_err(nf90_create(FILE_NAME, mode_flag, ncid, comm = MPI_COMM_WORLD, &
       info = MPI_INFO_NULL))
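
To make the dispatch above concrete, here is a minimal sketch of how an optional argument selects a branch through present(). The names demo_dispatch and demo_create and the handle value 42 are hypothetical stand-ins for illustration only. Nothing here calls the real netCDF or MPI libraries.

```fortran
! Hypothetical stand-in for the nf90_create dispatch quoted above.
! It only reports which branch would be taken; it does no I/O.
module demo_dispatch
  implicit none
contains
  integer function demo_create(path, comm, info) result(branch)
    character(len=*), intent(in) :: path
    integer, optional, intent(in) :: comm, info
    if (present(comm)) then
      branch = 1   ! the parallel branch (nf_create_par in the real code)
    else
      branch = 0   ! the serial branch (nf_create in the real code)
    end if
  end function demo_create
end module demo_dispatch

program demo_present
  use demo_dispatch
  implicit none
  ! Keyword arguments, as in the test's call; 42 is a made-up handle.
  print *, demo_create("demo.nc", comm=42, info=0)
  ! Without comm, the serial branch is selected.
  print *, demo_create("demo.nc")
end program demo_present
```

Under these assumptions, a call that supplies comm by keyword, as the test does, should always reach the parallel branch.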

So I'm at a loss.  I don't see this failure with gcc/gfortran.



I also tried to compile 4.4-beta1 and got:

mpif90 -I../../fortran -I../fortran -g -c -o f90tst_parallel.o ../../nf_test/f90tst_parallel.f90
../../nf_test/f90tst_parallel.f90(85): error #6404: This name does not have a type, and must have an explicit type. [NF90_MPIIO]
  mode_flag = IOR(mode_flag, nf90_mpiio)
-----------------------------^
../../nf_test/f90tst_parallel.f90(85): error #6363: The intrinsic data types of the arguments must be the same. [IOR]
  mode_flag = IOR(mode_flag, nf90_mpiio)
-----------------------------^
../../nf_test/f90tst_parallel.f90(120): error #6363: The intrinsic data types of the arguments must be the same. [IOR]
  call handle_err(nf90_open(FILE_NAME, IOR(nf90_nowrite, nf90_mpiio), ncid, &
---------------------------------------------------------^
compilation aborted for ../../nf_test/f90tst_parallel.f90 (code 1)

It looks like the definition of nf90_mpiio is missing from fortran/netcdf_constants.f90.

In 4.2 it's defined here:

./netcdf-fortran-4.2/f90/netcdf4_constants.f90:integer, parameter, public :: nf90_mpiio = 8192, nf90_mpiposix = 16384, &
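
A minimal sketch of why one missing constant would account for all three errors: with implicit typing off (which error #6404 suggests), an undeclared nf90_mpiio has no type, so each IOR call that uses it cannot match its argument types. Once the name is declared as an integer parameter, the same expressions compile. The module demo_nc_constants is a hypothetical stand-in, not the real netcdf module. The values of nf90_mpiio and nf90_mpiposix are the ones quoted from 4.2 above, while nf90_nowrite = 0 and nf90_netcdf4 = 4096 are assumed from the netCDF headers.

```fortran
! Stand-in constants module for illustration; NOT the real netcdf module.
module demo_nc_constants
  implicit none
  private
  ! Assumed from the netCDF headers (not shown in this message).
  integer, parameter, public :: nf90_nowrite = 0, nf90_netcdf4 = 4096
  ! Values as quoted from netcdf-fortran-4.2/f90/netcdf4_constants.f90.
  integer, parameter, public :: nf90_mpiio = 8192, nf90_mpiposix = 16384
end module demo_nc_constants

program demo_mode_flags
  use demo_nc_constants
  implicit none
  integer :: mode_flag

  ! Same shape as line 85 of f90tst_parallel.f90.
  mode_flag = nf90_netcdf4
  mode_flag = ior(mode_flag, nf90_mpiio)
  print *, mode_flag                       ! 4096 + 8192 = 12288

  ! Same shape as the open-mode expression on line 120.
  print *, ior(nf90_nowrite, nf90_mpiio)   ! 0 + 8192 = 8192
end program demo_mode_flags
```

Removing the nf90_mpiio declaration from the stand-in module should reproduce the same kind of "does not have a type" error with a compiler that enforces implicit none.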


--
Orion Poplawski
Technical Manager                     303-415-9701 x222
NWRA, Boulder Office                  FAX: 303-415-9702
3380 Mitchell Lane                       orion@xxxxxxxx
Boulder, CO 80301                   http://www.nwra.com


