Re: [netcdf-hdf] bug in MPI cleanup

Hi Rob,

On Sep 14, 2007, at 11:22 AM, Robert Latham wrote:

On Thu, Sep 13, 2007 at 02:05:07PM -0500, Quincey Koziol wrote:
On Sep 13, 2007, at 2:00 PM, Robert Latham wrote:
Really?  Call H5close inside netCDF-4 code?  Well, I can do that,
sure.  That seems to lack a certain symmetry, no?

I agree with you, but I don't think there's a corresponding "shut
the netCDF-4 library down" API routine.  :-)

Here's a trick that we do in ROMIO: we attach an attribute to the
communicator.  This attribute has a hook for a function to run when
it's deleted. We hook in a ROMIO cleanup routine there.  Then when
MPI_Finalize runs, the MPI implementation deletes attributes on all
communicators before freeing them, and ROMIO's cleanup routine fires
off.

The code sort of looks like this:

/* ADIO_Init_keyval: a global variable */
if (ADIO_Init_keyval == MPI_KEYVAL_INVALID) {
        MPI_Keyval_create(MPI_NULL_COPY_FN,
                ADIOI_End_call, &ADIO_Init_keyval, (void *)0);
        MPI_Attr_put(MPI_COMM_WORLD, ADIO_Init_keyval, (void *)0);
        ADIO_Init(&status);
}

ADIOI_End_call just wraps around ADIO_End, and ADIO_End deallocates
memory, cleans up data structures, and shuts down any other interfaces
ROMIO fired up.  Note that we put the attribute on COMM_WORLD: we
don't care what communicator the end-user fed ROMIO; we just want a
cleanup routine to fire when MPI_Finalize is invoked.

ROMIO puts this in the open and delete paths.  For NetCDF, you could
put this in nc_open_par and nc_create_par.

I don't know if this is a perfect fit for NetCDF-4, but at least it's
one way to hide the H5close call from NetCDF-4 end-users.

That's a clever way to work around the problem. It's a bit "weird" in the sense that nc_open_par/nc_create_par are per-file, while H5close is per-library/per-application (more like MPI_Init/MPI_Finalize). It should work fine if the library is opened and closed repeatedly, but that's not something we stress much in our tests.
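
To make the per-file versus per-library distinction concrete, here is a rough sketch of the "register once in the open path, tear down at shutdown" pattern. It is an assumption-laden stand-in, not netCDF-4, HDF5, or ROMIO code: it uses the C library's atexit in place of the MPI attribute delete callback so that it runs without MPI, and every name in it (lib_open, lib_close, lib_shutdown and the counters) is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* All names below are hypothetical, for illustration only. */

static int cleanup_registered = 0;  /* plays the role of ADIO_Init_keyval */
static int library_live = 0;        /* 1 while library-wide state exists  */
static int open_files = 0;          /* per-file bookkeeping               */
static int shutdown_calls = 0;      /* how many real teardowns have run   */

/* Library-wide teardown: where a real library might call H5close(). */
static void lib_shutdown(void)
{
    if (!library_live)
        return;                     /* nothing to tear down */
    library_live = 0;
    shutdown_calls++;
}

/* Per-file open: only the first call registers the process-wide hook.
 * atexit() stands in for attaching an attribute to MPI_COMM_WORLD. */
static int lib_open(void)
{
    if (!cleanup_registered) {
        if (atexit(lib_shutdown) != 0)
            return -1;              /* could not register the hook */
        cleanup_registered = 1;
    }
    library_live = 1;
    open_files++;
    return 0;
}

/* Per-file close: releases the file only, never the library. */
static void lib_close(void)
{
    assert(open_files > 0);
    open_files--;
}
```

Opening several files registers the hook once, closing them all leaves the library state alone, and a later lib_open after an explicit lib_shutdown simply brings the library state back, which is the "opened and closed repeatedly" case mentioned above.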

        Quincey
