
Re: 980507: bad netCDF 3.4 problems on winterpark address@hidden, address@hidden



Charlie:

In regard to your netcdf problem report:

> >To: netCDF Support <address@hidden>,
> >To: Ed Arnold <address@hidden>,
> >To: Andrei Rodionov <address@hidden>,
> >To: Brian Eaton <address@hidden>,
> >To: Charlie Zender <address@hidden>,
> >To: Dennis Shea <address@hidden>,
> >To: Matthew Hecht <address@hidden>
> >From: Charlie Zender <address@hidden>
> >Subject: bad netCDF 3.4 problems on winterpark
> >Organization: .
> >Keywords: 199805072123.PAA08776
>
> Hi all,
>
> My netCDF operators are showing symptoms of a netCDF/IRIX bug with
> versions 3.4a and 3.4 of the netCDF library. The problems first
> appeared on Winterpark yesterday. Unfortunately, the symptoms do not
> include a core dump, so users are getting incorrect behavior without
> any warning. This, of course, is the worst kind of bug, and people
> have probably already begun to disseminate corrupt data because of
> it.
>
> The following output shows the result of using ncks two consecutive
> times. The last line of the output differs ("18.4746" vs. "0").
> This particular behavior indicates the last record of hyperslabs is
> occasionally returned full of zeros instead of the correct data.
> The same command has never shown an incorrect result when tried with
> CGD Solaris and SCD Unicos ncks executables.
>
> We saw this type of behavior with NCO and netCDF 3.3.1 on Winterpark
> three months ago, and we thought installing the 3.4a library fixed
> the problem. So this is either a new, but similar problem (NFS
> related?), or the old problem was never completely fixed.
>
> Thanks,
> Charlie
>
> Following commands executed on Winterpark. Note that you may have to
> execute these commands multiple times before you see the bug, i.e.,
> the last line of ncks output changes. The output is preceded by the
> date, location of the executable, and executable and netCDF version.
> The netCDF 3.4 library I am linking to is in /fs/local/[lib64/include64]
>
> /fs/cgd/home0/zender/tmp: date;which ncks;ncks -r;ncks -d time,236,239 -O /fs/cgd/data0/hecht/diag/g015.00/ocndiag_mean_y0000-y0059.nc foo.nc;ncks -F -v T_horz -d basins,1 -d z_t,1 -H -C foo.nc
> Thu May  7 14:52:11 MDT 1998
> /fs/cgd/home0/zender/bin/SGI64/ncks
> ncks 3.26 (1997/09/17) Copyright 1995--1998 University Corporation for Atmospheric Research
> Linked to netCDF library version 3.4 of May  7 1998 13:40:54 $
> z_t(1)=625 basins(1) time(1)=21625 T_horz(1)=18.6071
> z_t(1)=625 basins(1) time(2)=21716 T_horz(2)=18.1028
> z_t(1)=625 basins(1) time(3)=21808 T_horz(3)=18.0295
> z_t(1)=625 basins(1) time(4)=21900 T_horz(4)=0
>
> /fs/cgd/home0/zender/tmp: date ; which ncks ; ncks -r ; ncks -d time,236,239 -O /fs/cgd/data0/hecht/diag/g015.00/ocndiag_mean_y0000-y0059.nc foo.nc ; ncks -F -v T_horz -d basins,1 -d z_t,1 -H -C foo.nc
> Thu May  7 14:52:35 MDT 1998
> /fs/cgd/home0/zender/bin/SGI64/ncks
> ncks 3.26 (1997/09/17) Copyright 1995--1998 University Corporation for Atmospheric Research
> Linked to netCDF library version 3.4 of May  7 1998 13:40:54 $
> z_t(1)=625 basins(1) time(1)=21625 T_horz(1)=18.6071
> z_t(1)=625 basins(1) time(2)=21716 T_horz(2)=18.1028
> z_t(1)=625 basins(1) time(3)=21808 T_horz(3)=18.0295
> z_t(1)=625 basins(1) time(4)=21900 T_horz(4)=18.4746
>
>
> --
> Charlie Zender      Voice: (303) 497-1612, FAX: 497-1324
> NCAR ASP & CGD     E-mail: address@hidden
> P.O. Box 3000         URL: http://www.cgd.ucar.edu/cms/zender
> Boulder CO 80307-3000 PGP: finger -l address@hidden
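
Since the quoted note says the commands may need several runs before the last line of output changes, a small loop might save some typing. What follows is only a sketch under stated assumptions: the names tally_outputs and report, and the default of 20 runs, are made up here and are not part of NCO or netCDF. The two argument lists are the extract and dump commands quoted above. The "date" step is left out on purpose, because its output differs on every run.

```python
import subprocess
import sys
from collections import Counter

# The two ncks invocations from the quoted report, as argument lists.
EXTRACT = ["ncks", "-d", "time,236,239", "-O",
           "/fs/cgd/data0/hecht/diag/g015.00/ocndiag_mean_y0000-y0059.nc",
           "foo.nc"]
DUMP = ["ncks", "-F", "-v", "T_horz", "-d", "basins,1", "-d", "z_t,1",
        "-H", "-C", "foo.nc"]


def tally_outputs(commands, runs=20):
    """Run the command sequence `runs` times and count each distinct
    standard output produced by the last command in the sequence."""
    seen = Counter()
    for _ in range(runs):
        out = ""
        for argv in commands:
            # check=True raises if any step exits with a non-zero status.
            done = subprocess.run(argv, capture_output=True, text=True,
                                  check=True)
            out = done.stdout
        seen[out] += 1
    return seen


def report(seen, stream=sys.stdout):
    """Print each distinct output with the number of runs that produced it."""
    for output, count in seen.most_common():
        print(f"{count} run(s) produced:", file=stream)
        print(output, file=stream)

# Hypothetical use on the affected host:
#   report(tally_outputs([EXTRACT, DUMP]))
```

More than one entry in the tally would mean the same commands gave different answers on different runs, which is the symptom described.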


It would be helpful for us to know the following.
(Just because we work for the same company, you can't assume
we have access to these machines.)
Most of this information can be obtained with the Unix command 'uname'.

- Was the processing run on winterpark (SGI Power Challenge XL)?
        OS release number? NFS patches applied? IRIX64 or IRIX?

- Are the netCDF files read via NFS? Mounted from middlepark (SGI Challenge)?
        OS release number? NFS patches applied? IRIX64 or IRIX?

- Is the mount NFS version 3 or NFS version 2?
        Mount flags?
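
If running 'uname' by hand on each host is a nuisance, the same fields can be collected with a few lines of Python. This is a minimal sketch, assuming Python is available on the machines in question. The function names host_summary and format_summary are invented for illustration. It covers only what 'uname' reports: the installed NFS patches and the mount flags still have to be looked up separately.

```python
import platform

# The uname fields relevant to the questions above: system name
# (for example IRIX or IRIX64), host name, OS release, OS version, hardware.
FIELDS = ("system", "node", "release", "version", "machine")


def host_summary():
    """Return the uname fields of the current host as a dictionary."""
    info = platform.uname()
    return {name: getattr(info, name) for name in FIELDS}


def format_summary(info):
    """Format a summary dictionary as one 'field: value' line per entry."""
    return "\n".join(f"{name}: {value}" for name, value in info.items())

# Typical use, run once on each host:
#   print(format_summary(host_summary()))
```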

Just for reference, the symptoms described do _not_ match the problem
with IRIX 6.2 and NFS that was fixed in netCDF 3.3.2. That problem was
initially reported by address@hidden (Ethan Alpert) as
199709221820.MAA25471; you also reported it in February as
199802122028.NAA22485, and Russ explained it. As you recall, that problem
was limited to opening (creating) the file.

This does not rule out a problem in netCDF, but we do have reports of
NFS interoperability problems (independent of netCDF) with some versions
of IRIX 6. I believe those problems involved NFS version 3.
For reference, here are the patches we have installed on an IRIX64 6.2
machine. Your mileage may vary.

(binnie) 1106 % showprods | grep patch | grep nfs
I  patchSG0001615.nfs_sw  09/26/97  NFS Software
I  patchSG0001615.nfs_sw.nfs  09/26/97  NFS Support
I  patchSG0002611.nfs_man  03/25/98  NFS Documentation
I  patchSG0002611.nfs_man.nfs  03/25/98  NFS Support Manual Pages
I  patchSG0002611.nfs_sw  03/25/98  NFS Software
I  patchSG0002611.nfs_sw.nis  03/25/98  NIS (formerly Yellow Pages) Support
I  patchSG0002654.nfs3_sw  03/25/98  NFS Version 3 Software
I  patchSG0002654.nfs3_sw.nfs3  03/25/98  NFS Version 3 Support
I  patchSG0002654.nfs_sw  03/25/98  NFS Software
I  patchSG0002654.nfs_sw.nfs  03/25/98  NFS Support
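
To compare a listing like this one against the patches on another host, the patch identifiers can be pulled out mechanically. The sketch below assumes the columns are separated by two or more spaces, as they appear in this message; real showprods output may be aligned differently. The function names parse_showprods and patch_ids are invented for illustration, and SAMPLE holds three lines copied from the listing above.

```python
import re

# Three lines copied from the listing above, used as sample input.
SAMPLE = """\
I  patchSG0001615.nfs_sw  09/26/97  NFS Software
I  patchSG0002611.nfs_sw.nis  03/25/98  NIS (formerly Yellow Pages) Support
I  patchSG0002654.nfs3_sw  03/25/98  NFS Version 3 Software
"""


def parse_showprods(text):
    """Return (name, date, description) for each patch line in `text`.

    Assumes columns are separated by runs of two or more spaces.
    """
    rows = []
    for line in text.splitlines():
        parts = re.split(r"\s{2,}", line.strip(), maxsplit=3)
        if len(parts) == 4 and parts[1].startswith("patch"):
            rows.append((parts[1], parts[2], parts[3]))
    return rows


def patch_ids(text):
    """Return the sorted, de-duplicated patch identifiers found in `text`."""
    return sorted({name.split(".")[0] for name, _, _ in parse_showprods(text)})

# Comparing two hosts then reduces to a set difference, for example:
#   set(patch_ids(ours)) - set(patch_ids(theirs))
```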

-glenn