Re: [netcdf-hdf] bug in MPI cleanup

Hi Rob,

On Sep 13, 2007, at 1:44 PM, Robert Latham wrote:

Hi

I've got a small netcdf4 program that, when run, gives the error
"Attempting to use an MPI routine after finalizing MPICH".  This
initially had me puzzled, because the only thing I do after calling
MPI_Finalize() is 'return 0'.

Turns out, HDF5 hooks a routine into atexit(3) that cleans up MPI
structures.  Parts of this cleanup routine should not be run after the
MPI_Finalize:  here's the backtrace at the moment where the error
about using MPI routines after finalizing MPICH is printed:

#1  0x084bdcea in PMPI_Comm_free (comm=0x8711c20) at /home/robl/work/mpich2/src/mpi/comm/comm_free.c:73
#2  0x081b687f in H5FD_mpi_comm_info_free (comm=0x8711c20, info=0x8711c24) at ../../src/H5FDmpi.c:326
#3  0x081ba2f0 in H5FD_mpio_fapl_free (_fa=0x8711c20) at ../../src/H5FDmpio.c:870
#4  0x0819936b in H5FD_pl_close (driver_id=134217729, free_func=0x81ba113 <H5FD_mpio_fapl_free>, pl=0x8711c20) at ../../src/H5FD.c:625
#5  0x08199f4c in H5FD_fapl_close (driver_id=134217729, fapl=0x8711c20) at ../../src/H5FD.c:791
#6  0x0829154c in H5P_facc_close (fapl_id=167772177, close_data=0x0) at ../../src/H5Pfapl.c:431
#7  0x08279aca in H5P_close (_plist=0x87119c0) at ../../src/H5P.c:5370
#8  0x08202738 in H5I_clear_type (type=H5I_GENPROP_LST, force=0) at ../../src/H5I.c:604
#9  0x0826949d in H5P_term_interface () at ../../src/H5P.c:488
#10 0x080c416b in H5_term_library () at ../../src/H5.c:266
#11 0xb7d979d9 in exit () from /lib/tls/i686/cmov/libc.so.6
#12 0xb7d80ec4 in __libc_start_main () from /lib/tls/i686/cmov/libc.so.6
#13 0x08085171 in _start ()


Since my test program called MPI_Finalize, it's incorrect for the HDF5
library to also call MPI_Comm_free.

I would advise not hooking MPI-IO cleanup into atexit(3): you already
correctly make the user call MPI_Init and MPI_Finalize.  Can hdf5
cleanup instead occur as part of nc_close?

The way we normally suggest taking care of this sort of issue in MPI programs is for the application program to call H5close() immediately before calling MPI_Finalize(). Will that work for you?
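
To make the ordering concrete, here is a minimal sketch in plain C that only models the shutdown sequence. Nothing in it calls real MPI or HDF5: fake_comm_free, fake_lib_close, fake_mpi_finalize and run_shutdown are hypothetical stand-ins for MPI_Comm_free(), H5close() / H5_term_library(), MPI_Finalize() and the program's exit path. It also assumes that a second library close is a no-op once the first one has released the communicator.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins: nothing here calls real MPI or HDF5. */
static bool mpi_finalized;  /* set by the stand-in for MPI_Finalize() */
static bool lib_open;       /* the library still holds a communicator */
static int  late_calls;     /* "MPI" calls made after finalize */

/* Stand-in for MPI_Comm_free(): counts calls that arrive too late. */
static void fake_comm_free(void)
{
    if (mpi_finalized)
        late_calls++;
}

/* Stand-in for H5close() / H5_term_library(): frees the communicator
 * once; assumed to be a no-op if the library is already closed. */
static void fake_lib_close(void)
{
    if (!lib_open)
        return;
    fake_comm_free();
    lib_open = false;
}

/* Stand-in for MPI_Finalize(). */
static void fake_mpi_finalize(void)
{
    mpi_finalized = true;
}

/* Runs one simulated shutdown and returns the number of "MPI" calls
 * that happened after finalize. */
int run_shutdown(bool close_before_finalize)
{
    mpi_finalized = false;
    lib_open = true;
    late_calls = 0;

    if (close_before_finalize)
        fake_lib_close();   /* the explicit close suggested above */
    fake_mpi_finalize();
    fake_lib_close();       /* models the cleanup hooked into atexit(3) */

    assert(!lib_open);      /* the library is closed by exit either way */
    return late_calls;
}
```

In this model, run_shutdown(false) returns 1 (the communicator is freed after finalize, which is the error in the backtrace), and run_shutdown(true) returns 0. In a real program the equivalent change would be a call to H5close() on the line before MPI_Finalize().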

        Quincey


Thanks
==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B

_______________________________________________
netcdf-hdf mailing list
netcdf-hdf@xxxxxxxxxxxxxxxx
For list information or to unsubscribe, visit: http://www.unidata.ucar.edu/mailing_lists/

