Re: HDF5 hangs...

Hi Ed,

> I am having a really weird problem. At a certain point in my code, the
> following code hangs in the H5Dcreate call.
> 
> This code works a bunch of times, then, at some point (I'm doing lots
> of other HDF stuff, but not with this file), it hangs.
> 
> I don't know what the deal is.
> 
>    {
>       hid_t dsid = 0;
>       hid_t typeid1 = H5Tcopy(H5T_NATIVE_CHAR);
>       hid_t plistid1 = H5P_DEFAULT;
>       hid_t spaceid1 = 0;
>       hid_t hdfid = 0;
>       /* Create (or truncate) the test file. */
>       if ((hdfid = H5Fcreate("ccc_test.h5", H5F_ACC_TRUNC, H5P_DEFAULT,
>                              H5P_DEFAULT)) < 0)
>        return (-1);
>       /* A scalar dataspace holds exactly one element. */
>       if ((spaceid1 = H5Screate(H5S_SCALAR)) < 0)
>        return (-1);
>       /* This is the call that hangs. */
>       if ((dsid = H5Dcreate(hdfid, "scaley", 
>                           H5T_NATIVE_CHAR, spaceid1, plistid1)) < 0)
>        return (-1);
>       /* Release every handle, closing the file last. */
>       if (dsid > 0) H5Dclose(dsid);
>       if (spaceid1 > 0) H5Sclose(spaceid1);
>       if (typeid1 > 0) H5Tclose(typeid1);
>       if (hdfid > 0) H5Fclose(hdfid);
>       dsid = 0;
>    }
> 
> When I use Ctrl-C to interrupt the program, I get this message:
> 
> Program received signal SIGINT, Interrupt.
> 0x400c997a in malloc_consolidate () from /lib/i686/libc.so.6
> 
> Somehow there is some malloc issue in H5Dcreate. Are you checking all
> your malloc returns to see that the memory you are using is really
> being allocated?
    Yes, we are extremely diligent in checking the return values from all the
functions we call (although bugs do slip through occasionally).

> I can't reproduce this in a short program (yet), but I'll keep trying...
    It definitely sounds like a memory problem.  If you can give me a
reasonably short program, I'll run it through purify here and we should have a
fix quickly.

    Quincey
>From owner-netcdf-hdf@xxxxxxxxxxxxxxxx 06 2003 Nov -0700 11:35:49 
Message-ID: <wrxvfpx2vlm.fsf@xxxxxxxxxxxxxxxxxxxxxxx>
Date: 06 Nov 2003 11:35:49 -0700
From: Ed Hartnett <ed@xxxxxxxxxxxxxxxx>
In-Reply-To: <200310251310.h9PDAuuu038514@xxxxxxxxxxxxxxxxxxxxxx>
To: Quincey Koziol <koziol@xxxxxxxxxxxxx>
Subject: Re: how about adding cygwin to the list of supported HDF platforms?
Organization: UCAR/Unidata
Cc: netcdf-hdf@xxxxxxxxxxxxxxxx, Elena Pourmal <epourmal@xxxxxxxxxxxxx>
References: <200310251310.h9PDAuuu038514@xxxxxxxxxxxxxxxxxxxxxx>

Quincey Koziol <koziol@xxxxxxxxxxxxx> writes:

> Hi Ed,
> 
> > How about adding Cygwin to the supported platforms? In terms of
> > porting it is almost identical to linux, and lots of people use it,
> > and, most importantly, I use it...
>     We've had requests for supporting Cygwin, but have always felt that we
> didn't have the resources to adequately manage a port to it.
> 
> > I'm about to grab the 1.6.1 source and try it, so I'll let you know
> > how it comes out.
>     If it's not too difficult and you can help me get things set up, I
> wouldn't mind getting it working on my Windows machine.
> 
>     Quincey

Wow, it was super easy!

It built almost 100% right out of the box. I had to change one file,
perform/iopipe.c.

Where it formerly had:

#ifdef H5_HAVE_WINSOCK_H
#include <Winsock.h>
#endif 


I changed it to:

#if defined(_WIN32)
#ifdef H5_HAVE_WINSOCK_H
#include <Winsock.h>
#endif 
#endif 

I got this fix from the cygwin patch for 1.6.0.
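
For anyone wondering why the extra guard matters: as far as I know, Cygwin's gcc does not define _WIN32 by default, so the outer test keeps Winsock.h out of a Cygwin build even if configure has set H5_HAVE_WINSOCK_H. Below is a minimal sketch of the same nested guard. The GUARD_TAKEN macro and the winsock_guard_taken() helper are made up for illustration and are not part of HDF5; the sketch also deliberately leaves H5_HAVE_WINSOCK_H undefined, since that macro normally comes from HDF5's configure step.

```c
#include <stdio.h>

/* H5_HAVE_WINSOCK_H is normally set by HDF5's configure step.
   This sketch does not define it, so the include below is skipped
   unless the compiler command line sets it. */
#if defined(_WIN32)
#ifdef H5_HAVE_WINSOCK_H
#include <Winsock.h>
#define GUARD_TAKEN 1
#endif
#endif

/* Hypothetical marker: 0 means the Winsock include was skipped. */
#ifndef GUARD_TAKEN
#define GUARD_TAKEN 0
#endif

/* Hypothetical helper: returns 1 if Winsock.h was compiled in, else 0. */
int winsock_guard_taken(void)
{
   return GUARD_TAKEN;
}

/* Prints which branch the preprocessor picked. */
void report_winsock_guard(void)
{
   printf("Winsock.h included: %s\n", winsock_guard_taken() ? "yes" : "no");
}
```

Calling report_winsock_guard() from a test program shows which branch was chosen on a given platform; building with -DH5_HAVE_WINSOCK_H on a compiler that defines _WIN32 would flip the result.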

After this change it built and installed. I ran a couple of the
example programs and they looked fine. I also did a make check and
that worked.

So it seems like it would be pretty easy to add Cygwin to your list of
supported platforms...

Thanks,

Ed

>From owner-netcdf-hdf@xxxxxxxxxxxxxxxx 07 2003 Nov -0700 10:39:10 
Message-ID: <wrxy8usxem9.fsf@xxxxxxxxxxxxxxxxxxxxxxx>
Date: 07 Nov 2003 10:39:10 -0700
From: Ed Hartnett <ed@xxxxxxxxxxxxxxxx>
In-Reply-To: <200311041705.hA4H53xE039618@xxxxxxxxxxxxxxxxxxxxxx>
To: Quincey Koziol <koziol@xxxxxxxxxxxxx>
Subject: Re: HDF5 hangs...
Organization: UCAR/Unidata
Cc: netcdf-hdf@xxxxxxxxxxxxxxxx
References: <200311041705.hA4H53xE039618@xxxxxxxxxxxxxxxxxxxxxx>

Quincey Koziol <koziol@xxxxxxxxxxxxx> writes:

> > I can't reproduce this in a short program (yet), but I'll keep trying...
>     It definitely sounds like a memory problem.  If you can give me a
> reasonably short program, I'll run it through purify here and we should have a
> fix quickly.

OK, it was all my fault. Yes, I'll say that again, it was all my
fault. For my sins I have been wandering the purgatory of memory leak
hunting, and finally I found a pointer I was freeing twice. Whoops!

It caused the HDF5 library to hang only because you guys are also
using malloc, and that's just how long it took for the damaged heap to fail.

Anyway, I did discover a very useful glibc feature! If you set the
environment variable MALLOC_CHECK_ to 1, you will get warnings on
stderr when you free invalid pointers. (But not, apparently, if you are
running under the debugger; you have to run without it for this.)
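
A minimal sketch of one common way to avoid this kind of double free: clear the pointer as soon as it is freed, so that a second free becomes a no-op. The helper names free_and_null() and demo_double_free_guard() are hypothetical, made up for this illustration, and are not part of HDF5 or netCDF.

```c
#include <stdlib.h>

/* Hypothetical helper: free the block that *pp points to, then clear
   the caller's pointer so a repeated call does nothing. */
void free_and_null(void **pp)
{
   if (pp && *pp)
   {
      free(*pp);
      *pp = NULL;
   }
}

/* Frees the same buffer twice through the helper. Returns 0 if the
   pointer ended up cleared, -1 if the allocation failed. */
int demo_double_free_guard(void)
{
   void *buf = malloc(64);
   if (!buf)
      return -1;
   free_and_null(&buf);   /* releases the memory and sets buf to NULL */
   free_and_null(&buf);   /* harmless: buf is already NULL */
   return buf == NULL ? 0 : 1;
}
```

This only protects the one pointer variable that is passed in; a second copy of the same address held elsewhere can still be freed twice, which is where a run with MALLOC_CHECK_ set still helps.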

For punishment I will write 100 times on my whiteboard:
"I will not be too free with my free calls!"

Ed

>From owner-netcdf-hdf@xxxxxxxxxxxxxxxx 09 2003 Nov -0700 08:50:08 
Message-ID: <wrxy8upmthr.fsf@xxxxxxxxxxxxxxxxxxxxxxx>
Date: 09 Nov 2003 08:50:08 -0700
From: Ed Hartnett <ed@xxxxxxxxxxxxxxxx>
To: netcdf-hdf@xxxxxxxxxxxxxxxx
Subject: Space, the final frontier...
Organization: UCAR/Unidata


Here's a snippet of code:

      /* Create a space for the memory, just big enough to hold the slab
         we want. Then select it all. */
      if ((mem_spaceid = H5Screate_simple(var_info.ndims, count, NULL)) < 0) 
         BAIL(NC_EHDFERR);
      if (H5Sselect_all(mem_spaceid) < 0)
         BAIL(NC_EHDFERR);

Do I need to select the space, or will it automatically be selected
when created?

Thanks!

Ed