
[netCDF #NVI-857322]: Re: [netcdfgroup] Wondering about when NetCDF data hits the disk...



Hi Thomas,

I've forwarded this issue to our individual support system for now, in
hopes that we can resolve it with you and then summarize the
resolution for the many subscribers to the netcdfgroup mailing list.
I'm also Cc:ing Rob Ross because of his interest in, and contributions
to, the discussion.

For now, I've looked at the netCDF-3 library code (which is also used
in netCDF-4 when dealing with classic or 64-bit offset netCDF files)
and determined where to insert an fsync() call for testing with your
NFS scenario.

To use it, you will need to get a snapshot release (or real release)
after today and rebuild it, using the new --enable-fsync option to
configure, along with whatever other configure options you use.

When rebuilt, this will cause nc_sync() to call fsync() on the
appropriate file descriptor unconditionally, after it has called
NC_sync() and another internal synchronization function,
ncio_px_sync().  The fsync() is unconditional because, as far as I can
determine, there is no flag that records dirty data, only dirty header
metadata or a dirty record counter.  But nc_sync() is not called often
internally, so I think this will be OK.
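
To make the distinction between the library's own flush and a real
disk sync concrete, here is a rough standalone C sketch of the extra
step that --enable-fsync adds.  This is not the actual library code
(the real change is buried in the internal ncio layer), and the file
name and helper function below are invented purely for illustration:

    /* Illustration only, not netCDF internals: after the library's own
     * buffered flush (the NC_sync()/ncio_px_sync() step), an fsync()
     * asks the kernel to push its cached pages for this descriptor out
     * to the disk -- or, over NFS, to the server. */
    #include <stdio.h>
    #include <unistd.h>
    #include <fcntl.h>

    /* Stand-in for the library's buffered-flush step: move user-space
     * buffers into the kernel with write(). */
    static int flush_buffers(int fd, const char *buf, size_t len)
    {
        return write(fd, buf, len) == (ssize_t)len ? 0 : -1;
    }

    int main(void)
    {
        int fd = open("demo.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        const char data[] = "some record data";
        if (flush_buffers(fd, data, sizeof data) != 0) {
            perror("write");
            return 1;
        }

        /* The new, unconditional step at the end of nc_sync(): commit
         * the kernel's cached pages for this file descriptor. */
        if (fsync(fd) != 0) { perror("fsync"); return 1; }

        close(fd);
        return 0;
    }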

Whether the cost of that unconditional fsync() is acceptable is what I
need you to help me determine.  If, at your convenience, you could
build and link with this new library, I would be interested in whether
you can

 1. call nc_sync(ncid) in the writing process to do what you wanted
    nc_fsync() for, and see whether that helps keep the process's data
    flushed to disk on an NFS server (a minimal writer sketch follows
    below).

 2. determine whether there is a resulting performance penalty, and if
    so, whether you consider it acceptable.

In other words, I just want to know whether this improves the
symptoms of the concurrency problem and if so whether the cost in
performance is reasonable.
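
For item 1, something along these lines is what I have in mind.  The
file and variable names are only placeholders and the loop is just a
sketch of a writer that appends one record at a time; the essential
part is the nc_sync(ncid) call after each record:

    #include <stdio.h>
    #include <unistd.h>
    #include <netcdf.h>

    /* Report any netCDF error and bail out. */
    #define CHECK(e) do { int s_ = (e); if (s_ != NC_NOERR) { \
        fprintf(stderr, "netCDF error: %s\n", nc_strerror(s_)); \
        return 1; } } while (0)

    int main(void)
    {
        int ncid, timedim, varid;

        CHECK(nc_create("writer_test.nc", NC_CLOBBER, &ncid));
        CHECK(nc_def_dim(ncid, "time", NC_UNLIMITED, &timedim));
        CHECK(nc_def_var(ncid, "value", NC_DOUBLE, 1, &timedim, &varid));
        CHECK(nc_enddef(ncid));

        for (size_t rec = 0; rec < 100; rec++) {
            size_t start[1] = { rec }, count[1] = { 1 };
            double val = (double)rec;

            CHECK(nc_put_vara_double(ncid, varid, start, count, &val));

            /* The call under test: with --enable-fsync this should also
             * fsync() the file, so the new record should be visible to
             * a reader on another NFS client.  Timing the loop with and
             * without the option gives the comparison asked for in
             * item 2. */
            CHECK(nc_sync(ncid));

            sleep(1);   /* stand-in for the real work between records */
        }

        CHECK(nc_close(ncid));
        return 0;
    }

On the reading side you could simply reopen the file (or open it with
NC_SHARE) and poll the length of the unlimited dimension with
nc_inq_dimlen() to see how quickly new records become visible.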

Thanks again for bringing this issue to our attention and for any time
you can devote to helping resolve this.

P.S.:  Rob, let me know if you don't want to see these ...

--Russ

> On Wed, 28 Oct 2009 10:43:42 -0600,
> Russ Rew <address@hidden> wrote:
> 
> > Just out of curiosity, did you try using the NF90_SHARE flag on the
> > create/open calls in the writing and reading processes to see if that
> > makes any difference?
> 
> I did not, and it's already quite late for me here, but I am assured
> that it would not make a difference. The NF90_SHARE semantics are
> fine for processes on one machine. Without considering multiple
> machines accessing one file system (be it NFS or whatnot), it does
> not matter if the file is "written to disk"... since everyone shares
> the same filesystem buffers.
> 
> > I will have to do some
> > research to see whether avoiding fsync() was entirely a portability
> > issue for use on non-Unix systems, or whether there was some other
> > intention.
> 
> > But for now, I agree that it looks like calling nc_sync()
> > ought to also call fsync() on the underlying file descriptor.  If
> > nc_sync() did that, there would be no need for a new nc_fsync()
> > function.
> 
> With the fiasco of Firefox's madness with its huge databases for
> trivial stuff like bookmarks in mind, we should think before making
> fsync() the default on every write, or rather on every explicit call
> to nc_sync(). The latter should be sane, IMHO, but one might want to
> be aware of funny effects like
> https://bugzilla.mozilla.org/show_bug.cgi?id=421482 .  But, well,
> causing disk I/O on a call to a data I/O library feels rather sane,
> and is not as unexpected as causing disk I/O on every character
> typed into a URL field, or on a click on a hyperlink...
> 
> > Thanks for bringing this issue to our attention.
> 
> Oh, glad to have helped to get you another issue to worry about;-)
>
>
>
> Alrighty then,
>
> Thomas.
Russ Rew                                         UCAR Unidata Program
address@hidden                     http://www.unidata.ucar.edu



Ticket Details
===================
Ticket ID: NVI-857322
Department: Support netCDF
Priority: Normal
Status: Closed