
[netCDF #UCS-450316]: NetCDF 4 on Mac OS X Leopard



> Hi Ed,
>
> I grabbed the latest daily snapshot and built that. 'make test' hangs
> on program 'tst_h_par' in the 'libsrc4' directory. I modified the
> 'libsrc4/run_par_tests.sh' shell script, changing 'mpiexec -n' to
> 'mpirun -np':
>
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> #!/bin/sh
>
> # This shell runs some parallel tests.
>
> # $Id: run_par_tests.sh,v 1.2 2007/12/20 16:25:26 ed Exp $
>
> # Even for successful runs, mpiexec seems to set a non-zero return
> # code!
> #set -e
> echo ""
> echo "Testing parallel I/O with HDF5..."
>
> echo "Running tst_h_par on 1 processor..."
> mpirun -np 1 ./tst_h_par
>
> echo "Running tst_h_par on 2 processors..."
> mpirun -np 2 ./tst_h_par
>
> echo "Running tst_h_par on 4 processors..."
> mpirun -np 4 ./tst_h_par
>
> echo "SUCCESS!!!"
>
> exit 0
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>
> then ran the script by hand. It seems to be running the tests in
> 'tst_h_par.c' correctly but hangs at the end, either when closing the
> files or finalizing. I can ^C the process to make it continue:
>

Howdy Craig!

Hmmm. I suspect this is not a netcdf problem, but a problem with mpirun, with your
environment, or with my test scripts in your environment.

Some background: the tst_h_par.c test is pure HDF5 code, no netcdf-4 calls. So
you can take this problem to the HDF5 help desk and see if they have any
input.

The message: "*** Tests successful!" is only printed after all files have
successfully closed and the return codes all checked. In fact, printing that is
just about the last thing that is done by the program before exiting, and it is
only printed if all the tests pass and all the files close cleanly. The only
thing done after that is MPI_Finalize.
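
For reference, the tail end of the test looks roughly like the sketch below. This
is a minimal stand-in, not the actual tst_h_par.c source (the file name, error
handling, and build command are made up), but it shows the same shape: open an
HDF5 file through MPI-IO, close it, check the return codes, print the success
message, and only then call MPI_Finalize. So if your run prints the message and
then hangs, the hang is inside MPI_Finalize, not in the HDF5 or netCDF code.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
/* Minimal parallel HDF5 sketch, not the real tst_h_par.c.  Build with
 * something like: h5pcc -o tst_h_par_sketch tst_h_par_sketch.c
 * (or mpicc with the HDF5 include/library flags). */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <hdf5.h>

#define FILE_NAME "tst_h_par_sketch.h5"   /* made-up file name */

int
main(int argc, char **argv)
{
   hid_t fapl_id, file_id;

   MPI_Init(&argc, &argv);

   /* Tell HDF5 to do its I/O through MPI-IO on MPI_COMM_WORLD. */
   if ((fapl_id = H5Pcreate(H5P_FILE_ACCESS)) < 0) exit(1);
   if (H5Pset_fapl_mpio(fapl_id, MPI_COMM_WORLD, MPI_INFO_NULL) < 0) exit(1);

   /* All processes create and then close the file collectively. */
   if ((file_id = H5Fcreate(FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT,
                            fapl_id)) < 0) exit(1);
   if (H5Fclose(file_id) < 0) exit(1);
   if (H5Pclose(fapl_id) < 0) exit(1);

   /* By this point every file is closed and every return code checked. */
   printf("*** Tests successful!\n");

   /* The only thing left: if the run hangs after the message above,
    * it is hanging here, inside MPI. */
   MPI_Finalize();
   return 0;
}
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -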

So I believe your netcdf-4.0 install is working correctly, and if you want to
start using it, go ahead.

But I cannot understand what the heck is going on with your scripts. Try
taking out the "set -e" command and rerunning the script to see if it still
hangs.

Do you have a known-working MPI application? Can you write your own script to
call it and see if it works OK?
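
If you don't have one handy, a bare-bones MPI program like the one below is
enough for that check (just a sketch; the program name and build command are my
own suggestion). Build it with mpicc and run it under your mpirun with -np 1, 2,
and 4, the same way run_par_tests.sh does, and see whether it gets past
MPI_Finalize.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
/* mpi_check.c: smallest possible MPI program, no HDF5 or netCDF at all.
 * Build with: mpicc -o mpi_check mpi_check.c
 * Run with:   mpirun -np 2 ./mpi_check */
#include <stdio.h>
#include <mpi.h>

int
main(int argc, char **argv)
{
   int rank, size;

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
   MPI_Comm_size(MPI_COMM_WORLD, &size);
   printf("rank %d of %d: calling MPI_Finalize...\n", rank, size);
   MPI_Finalize();
   printf("rank %d: MPI_Finalize returned\n", rank);
   return 0;
}
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

If every rank prints the second line and the program exits cleanly, your MPI
installation can finalize on its own; if it hangs the same way tst_h_par does,
the problem is below netCDF and HDF5.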

Do you have a parallel debugger? If so, can you run it with tst_h_par and see
whether the MPI_Finalize call completes?

Thanks,

Ed


Ticket Details
===================
Ticket ID: UCS-450316
Department: Support netCDF
Priority: Normal
Status: Open