[netCDF #VRU-236841]: Re: Known problem URL for large block size, silent file corruption



Hi Charlie,

As you've no doubt seen, I finally posted the netcdfgroup announcement about the
NOFILL bug.  Please feel free to announce your 4.0.8 NCO release any time.

I think your decision to make fill mode the default, so it's immune to the
nofill bug in any version of netCDF, is the right one.

--Russ

> Thanks for the update. I'm still surprised that this wasn't
> an NCO bug because for like the last decade every NCO bug report
> I receive has been either a user error or an NCO bug or "feature".
> 
> NCO 4.0.8 is ready to go. It works around the problem simply
> by avoiding NC_NOFILL (assuming NOFILL mode is still prerequisite
> to triggering the problem). I figure NCO 4.0.8 can be installed
> and trusted on _any_ version of netCDF, although it will write
> files more slowly because it does not invoke NOFILL mode.
> 
> My thinking is that it is important for there to be an NCO
> version that _cannot_ trigger the bug because some NCO users
> install NCO themselves, yet rely on sysadmins to
> upgrade libnetcdf itself. And sysadmins aren't always responsive
> to "urgent please upgrade this now" requests. So those users
> (and debian/redhat types) can just install 4.0.8 without having
> to wait for netCDF 4.1.3 to appear in their favorite package format
> or distribution.
> 
> I am at an impasse on testing whether netCDF 4.1.2 alleviated
> any DAP non-transparencies. But that is not urgent.
> 
> So...should I release NCO 4.0.8 now (i.e., tomorrow) or wait?
> 
> The NCO 4.0.8 code has been posted and the website
> updated and the tentative (comments welcome) release notes are here:
> 
> http://nco.cvs.sf.net/viewvc/nco/nco/doc/ANNOUNCE
> 
> though I will defer making this announcement until you
> agree that the time is right.
> 
> c
> 
> On 28/04/2011 09:19, Unidata netCDF Support wrote:
> > Hi Charlie,
> >
> >> Sounds like you are close to a fix for this nasty bug.
> >> I'll test your fix on cisl bluefire and mirage, if you want.
> >> And I'll wait awhile until releasing 4.0.8.
> >
> > Just to keep you apprised of progress, I've checked in a fix to our svn 
> > trunk, consisting of a 20-line addition to the libsrc/posixio.c code.  The 
> > conditions for the bug appear to be pretty rare, but are more likely with 
> > larger disk block sizes.  Examples of the bug with small disk block sizes 
> > require relatively small files and involve:
> >
> >   - writing data to a file in nofill mode
> >   - writing more than one disk-block beyond the end of the file, as might
> >     happen in writing the last slice of a multidimensional variable before
> >     writing other slices
> >   - crossing disk-block boundaries with the region to be written
> >   - having the in-memory buffer in a state in which the region to be written
> >     corresponds to the upper half of the buffer and recently written data in
> >     the lower half of the buffer hasn't been flushed to disk yet.
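
The two purely geometric conditions in the list above (writing more than one disk block past the end of the file, and crossing a block boundary) can be checked with simple offset arithmetic. The following is only a rough sketch under stated assumptions: the function names are hypothetical, it is not taken from netCDF or NCO, and "more than one disk-block beyond the end of the file" is read here as "the write starts more than one block size past the current file size". It says nothing about nofill mode or the in-memory buffer state, which are the other two conditions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of netCDF or NCO. */

/* Index of the disk block that contains byte `offset`. */
uint64_t block_of(uint64_t offset, uint64_t block_size)
{
    return offset / block_size;
}

/* Assumed reading of the second condition: the write starts more than
 * one block size past the current end of the file. */
bool starts_far_beyond_eof(uint64_t file_size, uint64_t offset,
                           uint64_t block_size)
{
    return offset > file_size && (offset - file_size) > block_size;
}

/* Third condition: the region [offset, offset + length) spans at least
 * one disk-block boundary. */
bool crosses_block_boundary(uint64_t offset, uint64_t length,
                            uint64_t block_size)
{
    if (length == 0) {
        return false;
    }
    return block_of(offset, block_size) !=
           block_of(offset + length - 1, block_size);
}

/* True only when both layout conditions hold for a planned write. */
bool matches_layout_conditions(uint64_t file_size, uint64_t offset,
                               uint64_t length, uint64_t block_size)
{
    return starts_far_beyond_eof(file_size, offset, block_size) &&
           crosses_block_boundary(offset, length, block_size);
}
```

For example, with 4096-byte blocks and an 8192-byte file, a 1000-byte write at offset 20000 (as when the last slice of a variable is written first) satisfies both checks, while a plain append at offset 8192 does not.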
> >
> > The last condition makes it difficult to give users an easy way to determine
> > whether they have been a victim of this problem.  I'm still struggling with
> > a better description of the conditions under which it might occur, and I still
> > need to understand why we can duplicate the problem for 4K disk blocks if we
> > use the double-underbar function nc__create(), but not if we use the more
> > common nc_create().
> >
> > When I have that mystery solved, I should be able to send out a netcdfgroup
> > posting, and maybe create an FAQ or blog entry about the bug with more
> > information than people are likely to want to read in an email posting.
> >
> > --Russ
> >
> >
> >
> > Russ Rew                                         UCAR Unidata Program
> > address@hidden                      http://www.unidata.ucar.edu
> >
> >
> >
> > Ticket Details
> > ===================
> > Ticket ID: VRU-236841
> > Department: Support netCDF
> > Priority: Normal
> > Status: Closed
> >
> >
> 
> 
> --
> Charlie Zender, Department of Earth System Science
> University of California, Irvine (949) 891-2429 :)
> 
> 

Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu



Ticket Details
===================
Ticket ID: VRU-236841
Department: Support netCDF
Priority: Normal
Status: Closed