
960930: nctest segmentation violation (signal 11) in libsrc/xdrffio.c



Len,

> >To: address@hidden
> >cc: address@hidden,
> >cc: address@hidden
> >From: "Len Makin, CSIRO DIT Melbourne." <address@hidden>
> >Subject: netcdf test error - Cray xdr ffio(?)
> >Organization: CSIRO
> >Keywords: 199610010325.AA14947 netCDF2.4.3 Cray YMP4E/464

In the above message you wrote:

>       I have just installed netcdf-2.4.3 on our
> Cray Y-MP4E/464 running UNICOS 8.0.4, Programming Environment 2.0. 
> No errors occurred during make all or make install, however during make test,
> I got the message:
>         ./nctest
> *** Testing nccreate ...        Unable to open file: testfile.nc
> Error Explanation:%H?@**v+!@
> ok ***
> *** Testing ncopen ...          Unable to open file: tooth-fairy.nc
> Error Explanation:Make: "./nctest" terminated due to signal 11 (core dumped)
> 
> Stop.
> 
> Although the first test is "ok ***", the message bothered me.
> Recompiling in src/nctest and src/libsrc with -g instead of -O3 as CFLAGS 
> gave the same messages, but allowed me to track what was happening in a bit
> more detail.  The source of the unreadable "Error Explanation:%H?@*^B*v+^A!@" 
> (I presume it's supposed to be readable?) is src/libsrc/xdrffio.c 
> in the section
>          /* SWANSON
>          * for error message processing, we need some errors
>          * already described for us
>          */
>         extern char     *_fdc_errlist[];
> .....<code omitted>...............................................
>         if( fd == -1 ) {
>         /* SWANSON
>          * we have an error on the open. There are many possible
>          * reasons, but the primary cause we've seen is being out of
>          * memory. If we try to use the nc_serror routine
>          * issue this error message, it too will run out
>          * of memory, and the message will be garbled, if
>          * issued at all.
>          *
>          * we will try to issue the message directly to
>          * file handle 2 (stderr)
>          *
>          */
>                 write(2,mess1,strlen(mess1));
>                 write(2,path,strlen(path));
>                 write(2,mess2,1); /* end of line */
>                 write(2,mess3,strlen(mess3)); /* start of line */
>                 write(2,_fdc_errlist[stat.sw_error - 5000],
>                         strlen(_fdc_errlist[stat.sw_error - 5000]));
>                 write(2,mess2,1); /* end of line */
>                 /* nc_serror("filename \"%s\"", path) ; */
>                 return (-1);
>         }
> Function "NCxdrfile_create": calling parameters
>   xdrs:             0741321 -> (Structure)
>   path:             0362326 -> testfile.nc
>   ncmode:           15
> Local variables:
> ......<some omitted>
> stat:             (Structure)
> Field                              Type                             Value
> sw_flag                            unsigned:1                       1
> sw_error                           unsigned:31                      17
> ...<some omitted>....
> So the value of stat.sw_error just before the message (line 357) is 17
> Subtracting 5000 will do negative indexing into the message catalog. ???
> The operand range error (signal 11) after the second test is due to much the 
> same problem. The error occurs in the call to strlen. Total view shows
> Function "strlen": calling parameters
>   No parameters.
> called from NCxdrfile_create in xdrffio.c
> whichcat -l ldr shows message catalogs in
> /opt/ctl/cf90/cf90/nls/En
> /opt/ctl/nls/En
> /opt/ctl/craytools/craytools/nls/En
> /opt/ctl/craylibs/craylibs/nls/En
> /usr/lib/nls/En
> /lib/nls/En
> /lib/nls/En/ldr.cat
> 
> Cheers,
>       Len
> address@hidden:+61 3 9282 2622:CSIRO Supercomputing Facility
>       723 Swanston St., Carlton 3053, Victoria, AUSTRALIA 

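You're absolutely right: the index into the array `_fdc_errlist' is
bogus.  With stat.sw_error equal to 17, the expression
`stat.sw_error - 5000' evaluates to -4983, so both the write() and the
strlen() calls read memory well before the start of the array.
Apparently, the size of that string array changed between releases of
UNICOS.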
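
For illustration only, here is a minimal sketch of a range-checked lookup
that would avoid this kind of out-of-bounds index.  The table
`demo_errlist', the base value DEMO_ERR_BASE, and the function
demo_errmsg() are hypothetical stand-ins invented for this example; they
are not the real `_fdc_errlist' or any UNICOS interface, whose actual
size and base I would not assume.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical message table standing in for a vendor error list. */
static const char *const demo_errlist[] = {
    "demo: out of memory",
    "demo: bad control string",
    "demo: file not found",
};

/* Assumed base of the error-code range covered by the table. */
#define DEMO_ERR_BASE  5000L
#define DEMO_ERR_COUNT (sizeof demo_errlist / sizeof demo_errlist[0])

/*
 * Map an error code to a message.  Codes outside the table's range
 * (such as 17) get a fixed fallback string instead of an
 * out-of-bounds array read.
 */
static const char *demo_errmsg(long code)
{
    long idx = code - DEMO_ERR_BASE;

    if (idx < 0 || (unsigned long)idx >= DEMO_ERR_COUNT)
        return "unknown error code";

    assert(demo_errlist[idx] != NULL);  /* table entries are never NULL */
    return demo_errlist[idx];
}
```

With this check, demo_errmsg(17) returns the fallback string rather than
dereferencing demo_errlist[-4983].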

I've since reverted to an earlier version of the error reporting (with
some modifications): the patched code calls nc_serror() again instead of
indexing `_fdc_errlist'.  I suggest that you apply the enclosed patch to
the file `libsrc/xdrffio.c'.
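
As a rough sketch of the alternative that the patch leaves disabled
(writing the message straight to a file descriptor with write() and
strerror()), something like the following could be used.  The helper
names put_str() and report_open_failure() are hypothetical and do not
appear in netCDF.  The sketch takes the errno value as an argument
because write() itself may change errno between calls.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Write a whole NUL-terminated string to fd, retrying on short writes. */
static int put_str(int fd, const char *s)
{
    size_t left = strlen(s);

    while (left > 0) {
        ssize_t n = write(fd, s, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted: try again */
            return -1;
        }
        s += n;
        left -= (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical helper: emit
 *     <routine>: filename "<path>": <strerror(err)>\n
 * on fd.  The caller passes the errno value it saved right after the
 * failed open, so later calls cannot clobber it.
 */
static int report_open_failure(int fd, const char *routine,
                               const char *path, int err)
{
    const char *why = strerror(err);

    assert(routine != NULL && path != NULL);

    if (put_str(fd, routine) || put_str(fd, ": filename \"") ||
        put_str(fd, path)    || put_str(fd, "\": ") ||
        put_str(fd, why)     || put_str(fd, "\n"))
        return -1;
    return 0;
}
```

A caller would save errno immediately after the failed open and then
call, for example, report_open_failure(2, "ncopen", path, saved_errno).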

Please let me know if this helps.

--------
Steve Emmerson   <address@hidden>


--------Begin patch
Index: xdrffio.c
===================================================================
RCS file: /upc/share/CVS/netcdf/libsrc/xdrffio.c,v
retrieving revision 2.7
diff -c -r2.7 xdrffio.c
*** 2.7 1996/06/27 19:38:48
--- xdrffio.c   1996/09/26 18:13:30
***************
*** 32,38 ****
  
  
  #include <stdio.h>
! 
  #include <stdlib.h>
  #include <unistd.h>
  #include <fcntl.h>
--- 32,38 ----
  
  
  #include <stdio.h>
! #include <errno.h>
  #include <stdlib.h>
  #include <unistd.h>
  #include <fcntl.h>
***************
*** 271,280 ****
   * strings
   */
  
- static char   *mess1 = "Unable to open file: ";
- static char   *mess2 = "\n";
- static char   *mess3 = "Error Explanation:";
- 
  int
  NCxdrfile_create(xdrs, path, ncmode)
        XDR *xdrs ;
--- 271,276 ----
***************
*** 335,341 ****
        /* wait for all PEs to complete open, as the FFIO layer will */
        barrier();
  #else
!       fd = ffopens(path, fmode, 0666, 0, &stat, ControlString) ;
  #endif
        if( fd == -1 ) {
        /* SWANSON
--- 331,337 ----
        /* wait for all PEs to complete open, as the FFIO layer will */
        barrier();
  #else
!       fd = ffopens(path, fmode, 0666, 0, (struct ffsw*)NULL, ControlString) ;
  #endif
        if( fd == -1 ) {
        /* SWANSON
***************
*** 348,364 ****
         *
         * we will try to issue the message directly to
         * file handle 2 (stderr)
-        *
         */
!               write(2,mess1,strlen(mess1));
!               write(2,path,strlen(path));
!               write(2,mess2,1); /* end of line */
!               write(2,mess3,strlen(mess3)); /* start of line */
!               write(2,_fdc_errlist[stat.sw_error - 5000],
!                       strlen(_fdc_errlist[stat.sw_error - 5000]));
!               write(2,mess2,1); /* end of line */
!               /* nc_serror("filename \"%s\"", path) ; */
                return (-1);
        }
  
        if( ncmode & NC_CREAT ) {
--- 344,382 ----
         *
         * we will try to issue the message directly to
         * file handle 2 (stderr)
         */
! #if 1
!          /*
!           * I don't see it (nc_serror() doesn't allocate memory).  Also,
!           * Swanson's method seems to cause a segmentation violation in
!           * some circumstances.  --Steve Emmerson 1996-08-26
!           */
!           nc_serror("filename \"%s\"", path) ;
! #else
!          /*
!           *
!           * I rewrote the method to eliminate the segmentation violation
!           * but have not yet decided to return to Swanson's method.
!           * --Steve Emmerson 1996-09-26
!           */
!           if( ncopts & NC_VERBOSE )
!           {
!                 char    *colon_space = ": ";
!                 char    *filename_space_quote = "filename \"";
!                 char    *quote_colon_space = "\": ";
!                 char    *newline = "\n";
! 
!                 write(2,cdf_routine_name,strlen(cdf_routine_name));
!                 write(2,colon_space,strlen(colon_space));
!                 write(2,filename_space_quote,strlen(filename_space_quote));
!                 write(2,path,strlen(path));
!                 write(2,quote_colon_space,strlen(quote_colon_space));
!                 write(2,strerror(errno),strlen(strerror(errno)));
!                 write(2,newline,strlen(newline));
! 
                return (-1);
+           }
+ #endif
        }
  
        if( ncmode & NC_CREAT ) {