Re: 19990312: mangled doubles, 3.4, NC_UNLIMITED first dim?



Phil,

> cc: address@hidden
> From: Phil S <address@hidden>
> Subject: mangled doubles, 3.4, NC_UNLIMITED first dim?
> Organization: SNL
> Keywords: 199903122213.PAA00700 netCDF

In the above message, you wrote:

> I apologize for not narrowing down this problem further, but I am having
> problems with my netcdf file getting corrupted doubles in it. These are
> for a data record that is NC_UNLIMITED on its first dimension.
> 
> My interaction with netCDF has been through our application library
> interface EXODUS II written by Larry Schoof.
> 
> Interestingly, test arrays can be written and immediately re-read into a
> different array without sign of corruption. It would appear that any
> corruption occurs whenever the temporary buffer areas are written to
> disk.
> 
> It happens on both Solaris 2.6 and AIX 4.1.3, so it does not appear
> to be an OS "feature".
> 
> If I use 4 byte floats, the written data in the file (shown by ncdump)
> appears faithful and uncorrupted.
> 
> Through the eyes of ncdump the corrupted data looks like
> 
> .
> .
> .
> .
>     404.004, 405.004, 406.004, 407.004, 408.004, 409.004, 410.004, 411.004,
>     412.004, 413.004, 414.004, 415.004, 416.004, 417.004, 418.004, 419.004,
>     420.004, 421.004, 422.004, 423.004, 424.004, 425.004, 426.004, 427.004,
>     428.004, 429.004, 430.004, 431.004, 432.004, 433.004, 434.004, 435.004,
>     436.004, 437.004, 438.004, 439.00390625, 5.29980882362664e-315, 
>     5.30498947741318e-315, 5.30757980430645e-315, 5.31017013119972e-315, 
>     5.31146529464635e-315, 5.31276045809299e-315, 5.31405562153962e-315, 
>     5.31535078498625e-315, 5.31599836670957e-315, 5.31664594843289e-315, 
>     5.3172935301562e-315, 5.31794111187952e-315, 5.31858869360284e-315, 
>     5.31923627532616e-315, 5.31988385704947e-315, 5.32053143877279e-315, 
>     5.32085522963445e-315, 5.32117902049611e-315, 5.32150281135777e-315, 
>     5.32182660221942e-315, 5.32215039308108e-315, 5.32247418394274e-315, 
> .
> .
> .
> .
> 
> generally occurring many thousands of records down into the data. I can
> send the netcdf files if they would be useful in debugging.
> 
> I don't understand the details of netCDF too well. I tried to rebuild
> with the line in posixio.c "#define INSTRUMENT" activated, in addition
> to compiling with -g and without the usual -DNDEBUG preprocessor
> definition, but got stymied by a lack of "instr.h".
> 
> Have you ever had any other reports of this kind of corruption, or
> suggestions as to what I might try to find the problem?
> 
> Thanks in advance,
> -Phil S.

We've had a couple of reports of data corruption in netCDF files that
were reopened and then put into "redef" mode.  Even though it doesn't
sound like you're doing that, the workaround we came up with might prove
helpful.  See the following URLs for a description of the problem and
the workaround:

    http://www.unidata.ucar.edu/glimpse/netcdf/3265
    http://www.unidata.ucar.edu/glimpse/netcdf/3293
    http://www.unidata.ucar.edu/glimpse/netcdf/3295

Please let us know if the workaround solves your problem.
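
As a possible aid to narrowing this down, the tiny values in the ncdump
output quoted above can be looked at as raw bit patterns.  The following is
a minimal sketch (Python, standard library only; the helper name
`double_bits` is purely illustrative, and the three sample values are copied
from the quoted dump).  It shows only the bits of each value and makes no
claim about where in the library or application the corruption arises.

```python
import struct

def double_bits(value):
    """Return the IEEE 754 bit pattern of a double as a 16-digit hex string."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", value))
    return f"{raw:016x}"

# First three suspicious values, copied from the quoted ncdump output.
samples = [5.29980882362664e-315, 5.30498947741318e-315, 5.30757980430645e-315]

for v in samples:
    bits = double_bits(v)
    # Split the pattern into its upper and lower 32-bit words.
    print(v, bits[:8], bits[8:])
```

For these three samples the upper word is zero and the lower words are
3ff00000, 40000000 and 40080000, which are the upper words of the doubles
1.0, 2.0 and 3.0.  That may hint at 4-byte words ending up in the wrong half
of an 8-byte slot, but it is only a guess.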

--------
Steve Emmerson   <http://www.unidata.ucar.edu>