
[netCDF #IKT-405664]: px_pgin, Write to unallocated



Hi Kevin,

> I came across an issue with the netcdf libraries when debugging a
> code linked to netcdf-4.2, though it also seems to occur with 4.1.3 as
> well. This is on a 64bit linux machine with the C-code compiled with
> Sun/Oracle studio-cc. I have been debugging the code with dbx with
> memory access checking switched on and I get the following error when a
> file is opened:
> 
> Write to unallocated (wua) on thread 18446744072984940384:
> Attempting to write 4 bytes at address 0x2adcd4cf4e60
> t@47127951527776 (l@17936) stopped in px_pgin at 0x00002adcd2897129
> 0x00002adcd2897129: px_pgin+0x00bc:     movl     $0x0000000000000000,(%
> rax)
> 
> I think Write to unallocated is a rather serious problem, though the
> code continues fine past this and runs fine outside of dbx. The actual
> line this occurs at is:
> 
> ierr = nc_open(filnam, NC_NOWRITE, &(ffield->ncid));
> 
> and I have checked that both filnam and ffield->ncid are properly
> allocated.
> 
> Is this a problem? Can it be ignored, or is something going on in the
> netcdf library that I should be worried about?

It does sound like a serious problem, but I can't reproduce it.

I just tried rebuilding and running all the tests that "make check"
runs, configured with --enable-valgrind-tests.  I used netCDF version
4.2.1.1, linked against HDF5 version 1.8.11, both built with full
debugging.

This runs every test (including many nc_open calls) prefixed with

  valgrind -q --error-exitcode=2 --leak-check=full

which checks for "write to unallocated" errors as well as other kinds
of memory access errors and memory leaks.

I realize that the Sun/Oracle studio-cc dbx may implement memory
checking differently than valgrind.  Do you know whether it detects
errors in memory access that valgrind misses?  If you have an example
we can reproduce, especially if it's in one of the tests run by "make
check", I'll try to determine the cause.  In my experience, code that
is aggressively optimized by compiling with -O4 or higher occasionally
exposes compiler optimization errors that disappear with -O0, and that
might also be an explanation for what you're seeing ...

By the way, I configured and compiled netcdf-4.2.1.1 with:

   CFLAGS='-g3 -O0 -fno-inline' CPPFLAGS=-I${H5DIR}/include \
   LDFLAGS=-L${H5DIR}/lib LIBS=-ldl ./configure --disable-shared \
   --enable-valgrind-tests
   make check

using

   gcc (GCC) 4.5.1 20100924 (Red Hat 4.5.1-4)

on

   $ uname -a
   Linux spike.unidata.ucar.edu 2.6.35.14-106.fc14.x86_64 #1 SMP Wed Nov 23 13:07:52 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

--Russ

Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu



Ticket Details
===================
Ticket ID: IKT-405664
Department: Support netCDF
Priority: Normal
Status: Closed