
[netCDF #XMQ-403517]: NetCDF-4.0-beta1 make check error



> Hi Ed,
> 
> I have modified run_par_test.sh to use the mpich2 commands. The problem
> appears to be with tst_parallel3. If I take out the line in
> run_par_test.sh that refers to tst_parallel3 then run_par_test.sh
> finishes successfully. tst_parallel3 fails when run alone and when in
> parallel from run_par_test.sh. My machine only has 4 CPUs, so in the
> run_par_test.sh file both tst_parallel and tst_parallel3 are run on 4
> processors:
> 
> mpiexec -n 4 ./tst_parallel
> mpiexec -n 4 ./tst_parallel3

tst_parallel3 will not work on a machine with only 4 CPUs; it needs
16, as you can see in the C code for the test.

So it's not a problem if that test doesn't work for you on your 4 CPU
system.
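
As a rough sketch of how a script like run_par_test.sh could skip the
test on smaller machines: the helper name run_if_enough_cpus is
hypothetical (it is not part of netCDF), and the sketch assumes a POSIX
shell where `getconf _NPROCESSORS_ONLN` reports the online CPU count,
which holds on Linux and macOS but is not guaranteed elsewhere.

```shell
#!/bin/sh
# Hypothetical helper, not part of the netCDF test scripts.
# Usage: run_if_enough_cpus REQUIRED command [args...]
# Runs the command only when at least REQUIRED CPUs are online.
run_if_enough_cpus() {
    required=$1
    shift
    # Assumption: getconf knows _NPROCESSORS_ONLN; fall back to 1 if not.
    available=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    if [ "$available" -ge "$required" ]; then
        "$@"
    else
        echo "skipping: '$*' needs $required CPUs, found $available"
    fi
}

# In the test script this might be called as (16 is the count given above):
#   run_if_enough_cpus 16 mpiexec -n 16 ./tst_parallel3
```

On a 4 CPU machine the call in the comment would print the "skipping"
message and not start mpiexec at all.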

Does parallel I/O seem to work well otherwise? I recently made some
changes to try to make netCDF-4 parallel I/O performance match that of
the underlying HDF5 layer...


> 
> What should the normal output from the parallel tests be? Both
> tst_parallel and tst_parallel3 have a bunch of mpi output that is not
> controlled by tst_parallel tests themselves, at least not that I can see
> from the source code. I am attaching a file with the make_check output
> from the parallel test section (this includes stdout and stderr output
> with some added line breaks for easier reading). The output says
> the tests are successful, but the final result is still failure. The
> only line being printed to stderr in that section is "Attempting to use
> an MPI routine after finalizing MPICH". Any help would be greatly
> appreciated.

Your MPICH output is indeed not part of netCDF-4.

> 
> I configured and ran make check on the latest snapshot without the
> parallel tests and it finished without errors. It is still only the
> parallel tests that fail. However, even when I configure with
> --enable-parallel-tests it does not run the parallel tests. I have to
> add --enable-parallel which is not documented when you execute
> ./configure --help. It is only mentioned in the --enable-parallel-tests
> section. What if you made --enable-parallel-tests the default when
> --enable-parallel is included and made a --disable-parallel-tests?

Sorry, a bit of confusion. I forgot that I had taken away
--enable-parallel and now turn parallel support on automatically if I
detect a parallel system, so the option should no longer be needed. I
have modified the build so that you don't need to pass
--enable-parallel to get --enable-parallel-tests.

That fix will be in the next snapshot release, but there is no need for
you to get it, since it contains no other changes relevant to you...

> 
> On a side note: Is there a rhyme or reason as to why some of the make
> check output goes to stdout and some goes to stderr?
> 
> Dave
> 

I'm not sure about that, since our makefiles are generated by
automake. I suspect there is some rhyme and reason to it, but I don't
know the answer. ;-)
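
If it helps when reporting results, the two streams can at least be
captured separately. A minimal sketch, assuming a POSIX shell: the
fake_check function below is only a stand-in for `make check`, and its
"PASS" line is illustrative rather than real netCDF output.

```shell
#!/bin/sh
# Stand-in for `make check` (hypothetical): writes one line to each stream.
fake_check() {
    echo "PASS: tst_parallel"
    echo "Attempting to use an MPI routine after finalizing MPICH" >&2
}

outdir=$(mktemp -d)

# Send stdout and stderr to separate files. With the real build the
# equivalent would be:
#   make check >check.out 2>check.err
fake_check >"$outdir/check.out" 2>"$outdir/check.err"

echo "--- stdout ---"
cat "$outdir/check.out"
echo "--- stderr ---"
cat "$outdir/check.err"
```

Keeping the files separate makes it easy to see which messages come
from the test harness and which come from the MPI library.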

Thanks,

Ed

Ticket Details
===================
Ticket ID: XMQ-403517
Department: Support netCDF
Priority: Critical
Status: Closed