
Re: problems using netCDF functions on SOLARIS



Hi Willa,

> We are experiencing this netCDF problem on Solaris when trying to write
> out a netCDF file on a cross-mounted disk:
> 
>       case 1:   after a couple of loops of attribute and
>                 variable writings, get this message:
> 
>                       ncendef: NC_dcpy
> 
>       case 2:   zero-filled variable arrays.
> 
> Neither case happens every time (case 1: 9/10 times, case 2: 3/5
> times).  I cannot reproduce the problem with just a few netCDF function
> calls (we are using the EPIC library, which calls netCDF); the failure
> seems to depend on the size of the data sets and the combination of
> netCDF calls.
> 
> However, I solved the case 1 problem and reduced the frequency of the
> case 2 problem by adding an ncsync() call after each ncendef().  The
> netCDF manual says "data is automatically synchronized to disk
> when a netCDF file is closed, or whenever you leave define mode".  So
> there is a discrepancy if I have to add an ncsync() call after each
> ncendef() call in order to fix the problem we have.
> 
> Still, the case 2 problem happens about 1/4 of the time.  Do you think
> I should add ncsync() after each ncvarput() call too?
> 
> Unfortunately, I was not able to make a simple program to reproduce the
> problem for you to test.  Again, this problem only happens on Solaris
> when I try to write to a cross-mounted disk.  (There is no problem if I
> run it on SunOS to write to the same cross-mounted disk, nor if I run it
> on either Solaris or SunOS to write to the local disk.)
> 
> Any suggestions, comments, and insight about the problem would be
> helpful to us.  Thanks.

You are the only installation to have reported this problem under Solaris.
We use netCDF on Solaris 2.3 here on cross-mounted disks and have never seen
this problem.  I just tested it again on a Solaris 2.3 platform by running
the "nctest" program in the netcdf/nctest/ directory of the source
distribution on a remotely mounted disk (to another Solaris server) and all
the tests succeeded.  Here's the output of "uname -a", "df .", and the test
output:

    buddy% uname -a
    SunOS buddy.unidata.ucar.edu 5.3 Generic_101318-70 sun4m sparc
    buddy% df .
    Filesystem            kbytes    used   avail capacity  Mounted on
    zero:/export/home    2042101 1788454   49437    97%    /a/zero/home
    buddy% make test
    ./nctest
    *** Testing nccreate ...    ok ***
    *** Testing ncopen ...      ok ***
    *** Testing ncredef ...     ok ***
    *** Testing ncendef ...     ok ***
    *** Testing ncclose ...     ok ***
    *** Testing ncinquire ...   ok ***
    *** Testing ncsync ...      ok ***
    *** Testing ncabort ...     ok ***
    *** Testing ncdimdef ...    ok ***
    *** Testing ncdimid ...     ok ***
    *** Testing ncdiminq ...    ok ***
    *** Testing ncdimrename ... ok ***
    *** Testing ncvardef ...    ok ***
    *** Testing ncvarid ...     ok ***
    *** Testing ncvarinq ...    ok ***
    *** Testing ncvarput1 ...   ok ***
    *** Testing ncvarget1 ...   ok ***
    *** Testing ncvarput ...    ok ***
    *** Testing ncvarget ...    ok ***
    *** Testing ncvarputg ...   ok ***
    *** Testing ncvargetg ...   ok ***
    *** Testing ncrecinq ...    ok ***
    *** Testing ncrecput ...    ok ***
    *** Testing ncrecget ...    ok ***
    *** Testing ncvarrename ... ok ***
    *** Testing ncattput ...    ok ***
    *** Testing ncattinq ...    ok ***
    *** Testing ncattget ...    ok ***
    *** Testing ncattcopy ...   ok ***
    *** Testing ncattname ...   ok ***
    *** Testing ncattrename ... ok ***
    *** Testing ncattdel ...    ok ***
    *** Testing nctypelen ...   ok ***

The above test also worked fine when software built on either a Solaris 2.3
system or a Solaris 2.4 system was run on a Solaris 2.4 system on a remotely
mounted disk.  This test executes and tests the results of many ncvarput()
calls without ncsync() calls, so it should test the situation you have
described.

Can you run this same test on a remotely mounted disk and see if it works?

This sounds like it might be a symptom of a bug in your Solaris OS software,
and all I can suggest is that you check which patches for your version of
Solaris are available that apply to NFS, apply them, and try again.  I know
our system administrator applied lots of Sun-supplied patches to our Solaris
OS when it was installed.  The fact that the bug is intermittent provides
further evidence that it is an NFS problem rather than a netCDF problem.
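
One way to separate NFS behavior from netCDF behavior is to take netCDF out
of the picture entirely.  The following is only a minimal sketch, using
nothing but POSIX calls: it writes a known non-zero pattern to a file,
closes it without any explicit sync, reopens it, and counts the bytes that
came back different (for example, as zeros).  The function name
check_roundtrip() and the scratch file name roundtrip.tmp are hypothetical,
chosen for illustration; they are not part of netCDF or EPIC.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Write exactly n bytes, retrying on short writes. Returns 0 or -1. */
static int write_all(int fd, const unsigned char *buf, size_t n)
{
    while (n > 0) {
        ssize_t w = write(fd, buf, n);
        if (w <= 0)
            return -1;
        buf += w;
        n -= (size_t)w;
    }
    return 0;
}

/* Read exactly n bytes, retrying on short reads. Returns 0 or -1. */
static int read_all(int fd, unsigned char *buf, size_t n)
{
    while (n > 0) {
        ssize_t r = read(fd, buf, n);
        if (r <= 0)
            return -1;
        buf += r;
        n -= (size_t)r;
    }
    return 0;
}

/*
 * Hypothetical helper (not a netCDF call): write nbytes of a non-zero
 * pattern to dir/roundtrip.tmp, close, reopen, and compare.
 * Returns the number of mismatched bytes (0 means the data survived),
 * or -1 on any I/O or allocation error.
 */
long check_roundtrip(const char *dir, size_t nbytes)
{
    char path[1024];
    unsigned char *out = NULL, *in = NULL;
    long bad = -1;
    size_t i;
    int fd, len;

    len = snprintf(path, sizeof path, "%s/roundtrip.tmp", dir);
    if (len < 0 || (size_t)len >= sizeof path)
        return -1;

    out = malloc(nbytes);
    in = malloc(nbytes);
    if (out == NULL || in == NULL)
        goto done;

    for (i = 0; i < nbytes; i++)
        out[i] = (unsigned char)(i % 251 + 1);   /* values 1..251, never 0 */

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        goto done;
    if (write_all(fd, out, nbytes) != 0) {
        close(fd);
        goto cleanup;
    }
    if (close(fd) != 0)                          /* no fsync: rely on close */
        goto cleanup;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        goto cleanup;
    if (read_all(fd, in, nbytes) == 0) {
        bad = 0;
        for (i = 0; i < nbytes; i++)
            if (in[i] != out[i])
                bad++;
    }
    close(fd);

cleanup:
    unlink(path);                                /* remove the scratch file */
done:
    free(out);
    free(in);
    return bad;
}
```

Calling check_roundtrip() in a loop with a directory on the cross-mounted
disk, and again with a directory on the local disk, would show whether
plain writes are ever read back incorrectly.  A non-zero result on the
remote disk only would point toward NFS; a clean result does not rule NFS
out, since this access pattern is much simpler than what the library does.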

--Russ

______________________________________________________________________________

Russ Rew                                           UCAR Unidata Program
address@hidden                              http://www.unidata.ucar.edu