
[netCDF #JHA-604767]: Have an environment variable to change the block size returned by blksize()



Hi Gier,

> The netcdf function blksize() should be modified to allow the user to
> override the default block size. Our systems have filesystems where a
> non-standard (larger) block size yields better I/O bandwidth.

Do you happen to know why fstat() on your systems doesn't return, in
sb.st_blksize, a block size that yields better I/O bandwidth?
According to the man page for fstat():

   The st_blksize field gives the "preferred" blocksize for efficient
   file system I/O.  (Writing to a file in smaller chunks may cause
   an inefficient read-modify-rewrite.)

NetCDF assumes that if the struct returned by fstat() has an
st_blksize member, its value can be used.  To override that value,
netCDF provides the bufrsizehintp argument to the nc__create() and
nc__open() calls (note the double underbar in the function names),
which lets the user replace the standard buffer size.

Currently you are probably using the libsrc/posixio.c code for the
nc__create() and nc__open() functions.  Earlier Cray-specific code in
the netCDF library used libsrc/ffio.c, which has this code in the
blksize() function:

#ifdef __crayx1
                if(sb.st_blksize > 0)
                        return (size_t) sb.st_blksize;
#else
                if(sb.st_oblksize > 0)
                        return (size_t) sb.st_oblksize;
#endif

from an fffcntl(fd, FC_STAT, &sb, &sw) call.  I'm not sure what the
sb.st_oblksize member was for or whether it provided a more optimal
block size for Cray buffered I/O, but if you can point me to
documentation for this, I'd be interested.

> Below is an example of a locally modified blksize() to allow the user
> to set MY_BLKSIZE to override the default.
> 
> static size_t
> blksize(int fd)
> {
> size_t my_blksize;
> char *enval;
> my_blksize = 8192;
> enval = getenv ("MY_BLKSIZE");
> if (enval != (char *) NULL) {
> my_blksize = atol (enval);
> }
> #if defined(HAVE_ST_BLKSIZE)
> struct stat sb;
> if (fstat(fd, &sb) > -1)
> {
> if(sb.st_blksize >= my_blksize) // was 8192 -- medavis
> return (size_t) sb.st_blksize;
> return my_blksize; // was 8192
> }
> /* else, silent in the face of error */
> #endif
> return (size_t) my_blksize; // was 2 * pagesize() 
> }

Rather than use environment variables, we would prefer for the library
to use a good buffer size provided by the OS, preferably through the
standard fstat() call, but through another system call if available,
even if it's Cray-specific.  Is it possible to get an optimum buffer
size for Cray systems through a system call?

--Russ

Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu



Ticket Details
===================
Ticket ID: JHA-604767
Department: Support netCDF
Priority: Normal
Status: Closed