
[netCDF #IKM-527849]: netcdf parallel compilation: nc_test fail



Michael,

> I actually tried and failed building hdf5 first, I think I've tried it
> in 10-20 different ways, some more subtle than others. The last attempt
> was after reinstalling the system and adding:
> 
> mpich2 libmpich2-1.2 libmpich2-dev libmpich1.0-dev libmpich1.0gf
> gfortran g++ libstdc++6-4.4-dev zlib1g zlib1g-dev
> 
> then unpacking hdf5-1.8.11 and running: ./configure
> --prefix=/local/opt/hdf5 --enable-fortran --disable-shared --enable-parallel

If you're just building HDF5 for netCDF-4, there's no need for --enable-fortran,
as neither the netCDF C library nor the netCDF Fortran library depends on the
HDF5 Fortran API.  However, I don't think that's relevant to the problem you
encountered.

> I've attached the config.log and make check install output. I know hdf5
> is not yours to support, but maybe you can have a look at it anyhow.

The error you encountered is in testing HDF5 parallel I/O using the mpich
library:

  Testing  -- test lower dim size comp in span tree to mpi derived type (tldsc) 
  *** glibc detected *** ./testphdf5: double free or corruption (!prev): 0x00000000016f9450 ***
  *** glibc detected *** ./testphdf5: double free or corruption (!prev): 0x000000000174ae30 ***
  ======= Backtrace: =========
  /lib/libc.so.6(+0x78bb6)[0x7f207733fbb6]
  /lib/libc.so.6(cfree+0x73)[0x7f2077346483]
  /usr/lib/libmpich.so.1.2(MPID_Dataloop_create_struct+0x8e2)[0x7f2077b1f632]
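If you do take this to HDF5 support, it may help to tell them exactly which MPI library the failing test binary was linked against. A minimal sketch, assuming a Linux system with ldd available; the helper name mpi_libs is hypothetical and only filters text:

```shell
# Hypothetical helper: reads `ldd` output on stdin and prints only the
# lines that name an MPI library (case-insensitive match on "mpi").
mpi_libs() {
    grep -i 'mpi'
}

# Intended use, from the HDF5 directory that contains the test binary:
#   ldd ./testphdf5 | mpi_libs
```

The matching lines show the path each MPI shared library resolves to at run time, which can be compared with the library named in the backtrace.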

First, do you really need parallel I/O for what you intend to do with netCDF-4?
If so, you'll have to send your question to HDF5 support, as we have little
expertise in that area.  If not, you could omit the --enable-parallel from the
HDF5 configure invocation.
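
If you don't need parallel I/O, a serial build could look roughly like the sketch below. It reuses the prefix from your own configure line; the function name build_serial_hdf5 is just for illustration, and I haven't run this exact script on your system:

```shell
# Sketch of a serial (non-parallel) HDF5 build for use with netCDF-4.
# The prefix is taken from the configure line quoted above; change as needed.
HDF5_PREFIX=/local/opt/hdf5

# Hypothetical wrapper; call it from the top of the unpacked hdf5-1.8.11 tree.
build_serial_hdf5() {
    # No --enable-parallel and no --enable-fortran: neither is needed here.
    ./configure --prefix="$HDF5_PREFIX" --disable-shared &&
        make &&
        make check &&
        make install
}
```

Nothing is built until you call build_serial_hdf5 yourself.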

Here's the way I invoked configure the last time I built parallel HDF5-1.8.11
successfully on a Linux Fedora platform, in a debug configuration:

  env CC=mpicc ../configure --disable-shared --enable-debug \
      --disable-production --enable-parallel --enable-build-all \
      --prefix=/machine/russ/installs/h5_1811_db \
    && make all && make check && make install

Note that CC=mpicc is necessary, but you could omit --enable-debug,
--disable-production, and --enable-build-all, and use your own --prefix= ...
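
With those optional flags dropped, the parallel build reduces to something like the following sketch. The prefix is the one from your message, the function name build_parallel_hdf5 is hypothetical, and it assumes an in-tree build (./configure from the source directory) rather than the separate build directory I used:

```shell
# Sketch of a minimal parallel HDF5 build.
HDF5_PREFIX=/local/opt/hdf5
HDF5_CONFIGURE_ARGS="--disable-shared --enable-parallel --prefix=$HDF5_PREFIX"

# Hypothetical wrapper; call it from the top of the unpacked hdf5-1.8.11 tree.
build_parallel_hdf5() {
    # CC=mpicc is required, so stop early if the MPI compiler wrapper is missing.
    command -v mpicc >/dev/null 2>&1 || {
        echo "mpicc not found on PATH" >&2
        return 1
    }
    # $HDF5_CONFIGURE_ARGS is left unquoted on purpose so it splits into
    # separate options; this assumes the prefix contains no spaces.
    env CC=mpicc ./configure $HDF5_CONFIGURE_ARGS &&
        make all &&
        make check &&
        make install
}
```

As above, defining the function does nothing on its own; run build_parallel_hdf5 when you are ready to build.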

--Russ

> > Hi Michael,
> >
> >> I read in another thread that you wanted config.log and output from the
> >> make check, so here you have that as well.
> > I think the problem is the old version of HDF5 (1.8.4) you are using with
> > a new version of netCDF (4.3.0).  We didn't test netCDF 4.3.0 with versions
> > of HDF5 before 1.8.9, and recommend using it with HDF5 1.8.11.  Earlier
> > versions of HDF5 had bugs that affected netCDF, and "make check" tests
> > that those bugs have been fixed. I just tried building
> > netCDF-4.3.0 with HDF5-1.8.4p1, and it got an error from "make check"
> > similar to what you are seeing.
> >
> > I recommend building HDF5-1.8.11 from source and installing that before
> > trying to build netCDF-4.3.0.  The HDF5 build is very robust and builds
> > on most systems with no problems.  You can just use all the defaults for
> > configure except you should add --enable-parallel and use CC=mpicc if you
> > want parallel HDF5, required for parallel netCDF-4.
> >
> > Here are the instructions:
> >
> >    http://www.unidata.ucar.edu/netcdf/docs/build_default.html
> >    http://www.unidata.ucar.edu/netcdf/docs/build_parallel.html
> >
> > --Russ
> >
> >> -------- Original Message --------
> >> Subject:   Fwd: netcdf parallel compilation: nc_test fail
> >> Date:      Wed, 18 Sep 2013 11:58:42 +0200
> >> From:      Michael Burger <address@hidden>
> >> Reply-To:  address@hidden
> >> To:        address@hidden
> >>
> >>
> >>
> >> I forgot to mention that I also set:
> >>
> >> declare -x H5DIR=/usr
> >> declare -x CPPFLAGS="-I/usr/inlude"
> >> declare -x CC=mpicc
> >> declare -x LDFLAGS=-L/usr/lib
> >> declare -x LIBS=-ldl
> >>
> >>
> >> -------- Original Message --------
> >> Subject:   netcdf parallel compilation: nc_test fail
> >> Date:      Wed, 18 Sep 2013 11:53:35 +0200
> >> From:      Michael Burger <address@hidden>
> >> Reply-To:  address@hidden
> >> To:        address@hidden
> >>
> >>
> >>
> >> I installed Ubuntu 10.04 LTS (amd64) and added:
> >>
> >> mpich2
> >> mpich-bin
> >> libmpich2-1.2
> >> libmpich2-dev
> >> zlib1g
> >> zlib1g-dev
> >> gfortran
> >> g++
> >> libstdc++6-4.4-dev
> >> hdf5-tools
> >> libhdf5-mpich-1.8.4
> >> libhdf5-mpich-dev
> >> libjpeg62-dev
> >> libmpich1.0-dev
> >> libmpich1.0gf
> >> libhdf5-doc
> >> mpi-doc
> >>
> >> uname -a:
> >>
> >> Linux misu197 2.6.32-51-generic #113-Ubuntu SMP Wed Aug 21 19:46:35 UTC
> >> 2013 x86_64 GNU/Linux
> >>
> >> Unpacked netcdf-4.3.0.tar.gz
> >> ./configure --prefix=/local/opt/netcdf --disable-shared
> >> --enable-parallel-tests
> >>
> > Russ Rew                                         UCAR Unidata Program
> > address@hidden                      http://www.unidata.ucar.edu
> >
> >
> >
> > Ticket Details
> > ===================
> > Ticket ID: IKM-527849
> > Department: Support netCDF
> > Priority: Normal
> > Status: Closed
> >
> 
> 
> 
Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu