
Re: mmaped netcdf files



On Apr 10,  8:16am, Russ Rew wrote:
> Subject: Re: mmaped netcdf files
> > From: Brad Asztalos <address@hidden>

> > I wonder if to your knowledge anyone has
> > made the netcdf package support mmaping the
> > files for faster access?  If not I would
> > like to try this and of course supply the
> > code to you when it is done and tested.
> > Can you see any problems that might be encountered
> > with this approach?

In netcdf-3, all i/o goes through the "ncio" interface.
The interface is defined by libsrc/ncio.h, and the
usual implementation is posixio.c.

The interface is derived from something internal to
another package we have, the ldm product queue.
In that package, we use mmap underneath essentially the
same interface as ncio, so the interface is a natural
fit for mmap'ed access.

There are a few reasons we don't just ship and
support this.

1) We didn't want to introduce too much new technology in one release,
   so that we could keep track of where problems are creeping in. Since
   the netcdf-3 release is about seven months late at this point,
   limiting new features remains a good idea.
2) Some implementations of mmap() fail over some implementations of NFS.
   Many netcdf users have no idea which files they are accessing via NFS
   and which are local, so an implementation we could support would have
   to fall back to posix i/o automatically on an mmap EIO error. (You may
   not need this.)
3) Mapping a file which may grow is a pain unless you are on an SGI and
   have MAP_AUTOGROW.
4) Another, more subtle, issue has to do with concurrent access to netcdf
   files. There are hooks in the netcdf-3 interface for declaring that
   access is shared and that locking should take place. This isn't
   implemented yet because the current file format layout makes it tricky
   for certain operations. The same difficulty applies, even more
   strongly, to mapped files.
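
To make point 2 concrete, here is a minimal sketch of the fallback idea,
not code from netcdf or the ldm. The names file_view, file_view_open and
file_view_close are hypothetical. It only covers the case where the mmap()
call itself fails; an i/o error that shows up later, when a mapped page is
first touched, may arrive as a signal (typically SIGBUS) instead and is not
handled here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type, not part of netcdf or ncio. */
typedef struct {
    void  *data;    /* file contents */
    size_t size;    /* length in bytes */
    int    mapped;  /* 1 if data came from mmap(), 0 if from malloc()+read() */
} file_view;

/* Open path read-only and expose its contents. Try mmap() first and
 * fall back to plain read() if the mapping cannot be created.
 * Returns 0 on success, -1 on failure. */
int file_view_open(const char *path, file_view *fv)
{
    assert(path != NULL && fv != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    fv->size = (size_t)st.st_size;

    fv->data = mmap(NULL, fv->size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (fv->data != MAP_FAILED) {
        fv->mapped = 1;
        close(fd);              /* the mapping stays valid after close() */
        return 0;
    }

    /* mmap() failed: fall back to reading the whole file into memory. */
    fv->mapped = 0;
    fv->data = malloc(fv->size);
    if (fv->data == NULL) {
        close(fd);
        return -1;
    }
    size_t done = 0;
    while (done < fv->size) {
        ssize_t n = read(fd, (char *)fv->data + done, fv->size - done);
        if (n <= 0) {           /* read error or unexpected end of file */
            free(fv->data);
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }
    close(fd);
    return 0;
}

/* Release whichever resource file_view_open() acquired. */
void file_view_close(file_view *fv)
{
    if (fv->mapped)
        munmap(fv->data, fv->size);
    else
        free(fv->data);
    fv->data = NULL;
    fv->size = 0;
}
```

A caller only ever sees fv.data and fv.size, so the rest of the code does
not need to know which path was taken. That is roughly the property an
i/o layer would need for the fallback to be invisible to users.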

If you need higher performance than netcdf-3 provides and wish to
implement mmap'ed i/o, by all means go ahead. You should limit yourself to
the non-SHARE'd (MAP_PRIVATE) case for now, perhaps even just for read
access. It should be fairly straightforward.
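
As an illustration of why MAP_PRIVATE sidesteps the concurrency question,
here is a small sketch (again, not netcdf code; the function name
private_store_stays_local is hypothetical). It maps a file privately,
stores through the mapping, and then rereads the file with pread() to
check that the store went only to the process's own copy-on-write page.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if a store through a MAP_PRIVATE mapping left the file's
 * first byte unchanged, 0 if the file changed, -1 on error. */
int private_store_stays_local(const char *path)
{
    assert(path != NULL);

    /* O_RDONLY is enough: a private mapping never writes to the file. */
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    unsigned char before = p[0];
    p[0] = (unsigned char)~before;  /* changes only our private copy */

    /* Reread the first byte from the file itself, bypassing the mapping. */
    unsigned char on_disk;
    ssize_t n = pread(fd, &on_disk, 1, 0);

    munmap(p, len);
    close(fd);
    if (n != 1)
        return -1;
    return on_disk == before;
}
```

Because nothing written through a private mapping reaches the file, other
readers of the same file are unaffected, and no locking is needed for
this case. Restricting the mapping to PROT_READ, as suggested above for
read-only access, is simpler still.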

Thank you for using netcdf.

-glenn