Re: 20041130: NetCDF 3.6.0 Bugs



>To: address@hidden
>From: "Kevin W. Thomas" <address@hidden>
>Subject: NetCDF 3.6.0 Bugs
>Organization: OU
>Keywords: 200412010015.iB10F6lI010328 netCDF 3.6.0

Kevin,

> Unless you have WRFSI built, it will be a little difficult to replicate the
> problem.

I don't have WRFSI, but in an effort to try to duplicate the problem,
I just built the 64-bit version of netCDF-3.6.0-beta6 on an IRIX64 6.5
platform using the following environment variable settings before
invoking the "configure" script:

  CFLAGS='-g -64'
  FFLAGS='-g -64'
  CXXFLAGS='-g -64'

The build completed fine and "make test" ran successfully.

Then I generated a little Fortran program to create a netCDF file in
the new 64-bit offset format using 

      iret = nf_create('bug.nc', NF_64BIT_OFFSET, ncid)

> Let me describe my observations yesterday, and what I did to verify the
> problem.
> 
> o  "nf_create" eventually calls "nc__create_mp".  A printf shows that
>     variable "ioflags", which is used to make the decision, is 0x202.

This should be 0x200, not 0x202.  Offhand I don't see how this can
be 0x202, and I've tried to duplicate it to no avail.  There is no
meaning for the "2" bit in ioflags, so the fact that it is set is
puzzling.  It looks like something else is clobbering the ioflags
value.  But since the "2" bit has no meaning, I don't think it should
cause a problem if it gets set, since nothing ever tests for it.

>    With a printf I've verified that "sizeof_off_t" is set to 8.

Yes, same here.

> o  WRFSI calls "nf_enddef".  Following that logic:
> 
>       nf_enddef calls nc_enddef
> 
>       nc_enddef calls NC_endef
> 
>               "ncp" isn't passed, so there is a:
>                       NC *ncp in the code.
>               Before the "return (NC_endef(ncp, 0, 1, 0, 1 );" line,
>                       ncp->flags is 0x8.

Nope, in my test, ncp->flags is 0x302 at this point.  Looking at the
source, I don't see how it can get set to 0x8.

>       NC_endef calls NC_begins
> 
>               Before the NC_begins call ncp->flags is 0x8.
> 
>       NC_begins is where the failure occurs.
> 
>               Before the
> 
>                       if (fIsSet(ncp->flags, NC_64BIT_OFFSET)) {
> 
>               line, ncp->flags is 0x8.  That results in "sizeof_off_t"
>               being 4 instead of 8.

Since ncp->flags is 0x302, sizeof_off_t gets correctly set to 8
rather than 4.

If you have time, please check whether you still get the buggy behavior
with the Fortran program I used when trying to duplicate the bug.  Here
it is:

      program fgennc
      include 'netcdf.inc'
* error status return
      integer  iret
* netCDF id
      integer  ncid
* to save old fill mode before changing it temporarily
      integer  oldmode
* dimension ids
      integer  n_dim
* dimension lengths
      integer  n_len
      parameter (n_len = 3)
* variable ids
      integer  v_id
* rank (number of dimensions) for each variable
      integer  v_rank
      parameter (v_rank = 1)
* variable shapes
      integer  v_dims(v_rank)
* data variables
      real  v(n_len)
* enter define mode
*      iret = nf_create('bug.nc', OR(NF_CLOBBER,NF_64BIT_OFFSET), ncid)
      iret = nf_create('bug.nc', NF_64BIT_OFFSET, ncid)
      call check_err(iret)
* define dimensions
      iret = nf_def_dim(ncid, 'n', 3, n_dim)
      call check_err(iret)
* define variables
      v_dims(1) = n_dim
      iret = nf_def_var(ncid, 'v', NF_REAL, v_rank, v_dims, v_id)
      call check_err(iret)
* don't initialize variables with fill values
      iret = nf_set_fill(ncid, NF_NOFILL, oldmode)
      call check_err(iret)
* leave define mode
      iret = nf_enddef(ncid)
      call check_err(iret)
* store v
      data v /1, 2, 3/
      iret = nf_put_var_real(ncid, v_id, v)
      call check_err(iret)
       
      iret = nf_close(ncid)
      call check_err(iret)
      end
       
      subroutine check_err(iret)
      integer iret
      include 'netcdf.inc'
      if (iret .ne. NF_NOERR) then
         print *, nf_strerror(iret)
         stop
      endif
      end

I compiled the above with 

  f77 -o bug -g -64 -I../../include bug.f -L../../lib -lnetcdf

in the src/fortran/ directory of the build, after running "make
install" to get the library and include files installed ...

--Russ

>Hi Kevin,
>
>> I have a couple of netCDF 3.6.0_beta6 bugs to report.
>> 
>> (1)  Scenario...
>> 
>>      I'm trying to run WRFSI 2.0.1 on an SGI with 64-bit binaries.  My goal
>>      is to process a huge grid.  The problem is that I start getting a lot
>>      of netCDF errors due to functions returning NC_EVARSIZE, even though
>>      I modified the code to call "NF_OPEN" correctly.
>
>To create a netCDF file with 64-bit offsets, you must explicitly call
>NF_CREATE with the 64-bit offset flag.  There's no way to get that
>flag set correctly by just calling NF_OPEN (and there shouldn't be,
>since then the file might have both 32-bit offsets and 64-bit
>offsets).  
>
>Are you calling NF_CREATE rather than NF_OPEN with something like:
>
>  iret = nf_create('foo.nc',
>                   OR(NF_NOCLOBBER,NF_64BIT_OFFSET),
>                   ncid)
>
>>      I've isolated the problem to a "NF_ENDDEF" call.  The call eventually
>>      ends up in "NC_begins".  ncp->flags is zero, so "sizeof_off_t" is always
>>      set to 4.  The problem is that earlier on when the "ncp" pointer is
>>      first set, the value of "flags" is never set.
>> 
>>      My workaround is to hardwire "sizeof_off_t" to 8 in "NC_begins", as I
>>      require this option.
>
>If this is still a problem even after the correct NF_CREATE call, we'd
>like to be able to duplicate the problem here and fix it.  In that
>case, could you at least send us a CDL of the file you are trying to
>create and any other details we would need to duplicate the problem?  
>
>Thanks!
>
>--Russ