
Re: 951213: More netCDF-2.4-beta5 test results



>From: address@hidden (John Sheldon)
>Organization: GFDL
>Keywords: 199512121934.AA29804 netCDF CRAY

John,

Jeff Kuehn (address@hidden), who developed the Cray optimizations for netCDF
2.4, replied with some helpful answers I'm forwarding to you in this note.
He also generously offered to answer your questions directly (thanks Jeff!):

    I'm willing to help.  Have him contact me and CC you on the email with
    future questions.

Anyway, here's Jeff's explanation of how to configure the Cray netCDF I/O.
I'll be incorporating this explanation into the new netCDF User's Guide.

    > 3.   Are the benchmarks done by nctime too artificial or variable to be
    >      useful?  It takes a four-dimensional slab of specified size and
    >      times writing it all out with ncvarput as well as reading back in
    >      all 16 kinds of cross sections.  It does this for all six types
    >      of netCDF data.  Previously I've noted that unless a local file
    >      system is used, NFS caching may get in the way of consistent
    >      results.  Results also may be very dependent on sizes used and
    >      may vary from run to run for other reasons that are difficult to
    >      control.

    They may be.  The default buffering scheme used in the Cray
    optimizations depends on async I/O and a small number of buffers.  While
    this performs pretty well in most cases, to get really good performance,
    one should tune the buffering scheme and the cache configuration to suit
    the application.

    So long as the pre-fill option still does all its I/O using either the
    putbytes() or putlong() routines for the XDR in question, all the I/O
    should pass through the buffers.  Given the buffering the library is
    currently performing, it would probably be a bad idea for the pre-fill
    not to use the putbytes() or putlong() routines.

    The default buffering scheme favors sequential writes, but the cache is
    very configurable.  He might try some of the following:

            setenv NETCDF_FFIOSPEC bufa:336:2

            (2 336-block asynchronous buffers, i.e., double buffering)
            (this is the default configuration and works well for
            sequential accesses)


            setenv NETCDF_FFIOSPEC cache:256:8:2

            (8 256-block pages with a read-ahead factor of 2 blocks)
            (larger random accesses)


            setenv NETCDF_FFIOSPEC cachea:256:8:2

            (8 256-block pages with a read-ahead factor of 2 blocks, asynch)
            (larger random accesses)

            setenv NETCDF_FFIOSPEC cachea:8:256:0

            (256 8-block pages without read-ahead, asynch)
            (many smaller pages w/o read-ahead for more random accesses
            as typified by netCDF slicing arrays)


            setenv NETCDF_FFIOSPEC cache:8:256:0,cachea.sds:1024:4:1

            (hold onto your hat:  this is a two-layer cache.  The first
            (synchronous) layer is composed of 256 8-block pages in memory;
            the second (asynchronous) layer is composed of 4 1024-block pages
            on the SSD.  This scheme works well when accesses precess
            through the file in random waves roughly 2x1024 blocks wide.)


    All of the options/configurations supported in CRI's FFIO library are
    usable through netCDF.  I'd also recommend that he look at CRI's I/O
    optimization guide for info on using FFIO to its fullest.  This is
    also compatible with CRI's EIE I/O library.  When you're ready to do
    documentation for the next netCDF, we should put together some pointers
    to the Cray optimization guides to insert in your docs.

I'll be trying some of this when I test the 2.4beta5 version on NCAR's Cray.
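
For example, one way to compare the suggested configurations would be to loop
over the specs and re-run the nctime benchmark under each one.  Here's a rough
csh sketch of that idea; the way nctime is invoked (and where it writes its
test file) is just a placeholder, not part of Jeff's suggestions:

    #!/bin/csh
    # Time the same benchmark under several FFIO cache configurations.
    # Adjust the nctime invocation for the benchmark's real arguments.
    foreach spec ( bufa:336:2 cache:256:8:2 cachea:256:8:2 cachea:8:256:0 )
        setenv NETCDF_FFIOSPEC $spec
        echo "=== NETCDF_FFIOSPEC = $spec ==="
        ./nctime        # writes and reads its test file in the current dir
    end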

I would be interested in being CC:ed on your exchanges with Jeff, unless you
guys want to talk about me behind my back :-).

--Russ