
Use of netCDF in a parallel environment



> From address@hidden Fri Feb  4 10:41:36 1994
> Keywords: 199402041741.AA05224
> Date: Fri, 4 Feb 94 09:44:41 -0800
> From: address@hidden (John Bolstad)
> To: address@hidden
> Subject: Re: Art Mirin's questions
> 
> Dear Messrs. Davis and Rew,
> 
>    This is a follow-up to some of the questions my colleague Art Mirin
> asked you.  Thanks for your helpful advice.  Following it, we rewrote
> the way our netCDF applications handle parallel writes.  In our
> applications thread = processor, i.e., one (user) process per
> processor, and none of the processors of interest use shared memory,
> only distributed memory.  For parallel writes we now have all the
> slaves send messages to the master, and the master processor writes
> everything.  This works well and is portable.  We can still do
> parallel reads directly with netCDF, although on one machine we run
> into a race condition when opening a file; that is fairly easy to
> fix.  (Before this, we tried to have the processors write to the file
> in parallel.  We got into trouble on one machine with the 'filling'
> mechanism, which is done at different times depending on whether a
> file contains record variables or only non-record ones.  Even if it
> had worked on some machines, it would not have been a portable
> solution.)
> 
>    So even though we are using netCDF for a purpose for which (I think)
> it was never intended, we are very satisfied with the results.  We have
> run it on five or six distributed memory parallel computers, and have not
> encountered any bugs.  (This may seem like faint praise, but we are
> unable to say the same thing about compilers or even simpler Unix
> software like cpp!  Furthermore, there is a local home-brew competitor
> to netCDF which only runs on two machines and has some things which are
> either bugs or hard-to-understand features.)  The documentation is good
> too.  We can exchange data with many others in the meteorological
> community.  Finally, it is
> truly portable; we can make runs on one computer, and use the restart
> dumps (in netCDF format) to continue the run on another computer with 
> a different number of processors, or even a Cray using a single
> processor.  The relatively recent -b option in ncdump is especially
> useful for reading the dumps.  Another nice thing is not having to
> worry about conflicts with Fortran unit numbers.  (Maybe that's an
> argument for abandoning Fortrash entirely, but we are not in a position
> to do that.)
> 
>    John Bolstad
>    address@hidden
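
The write pattern described in the message (slaves send their data to the
master, and only the master writes) can be sketched roughly as follows.
This is an illustration under stated assumptions, not the authors' code:
threads and a queue stand in for distributed-memory processes and message
passing, a flat binary file stands in for the netCDF file, and the names
worker, master_write, run and read_back are hypothetical.

```python
import queue
import struct
import threading


def worker(rank, slab, outbox):
    # Each worker "sends a message" carrying its rank and its local slab.
    # (Assumption: a queue stands in for real message passing.)
    outbox.put((rank, slab))


def master_write(path, nworkers, inbox):
    # The master receives one message per worker, in arrival order.
    slabs = {}
    for _ in range(nworkers):
        rank, slab = inbox.get()
        slabs[rank] = slab
    # Only the master touches the file, writing slabs in rank order.
    # (Assumption: a flat file of doubles stands in for a netCDF file.)
    with open(path, "wb") as f:
        for rank in range(nworkers):
            values = slabs[rank]
            f.write(struct.pack(f"<{len(values)}d", *values))


def run(path, nworkers=4, slab_len=3):
    inbox = queue.Queue()
    threads = []
    for rank in range(nworkers):
        # Each worker owns a contiguous piece of the global array.
        slab = [float(rank * slab_len + i) for i in range(slab_len)]
        t = threading.Thread(target=worker, args=(rank, slab, inbox))
        t.start()
        threads.append(t)
    master_write(path, nworkers, inbox)
    for t in threads:
        t.join()


def read_back(path):
    # Read the whole file back as a list of little-endian doubles.
    with open(path, "rb") as f:
        raw = f.read()
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))
```

Because a single writer lays the data out in rank order, the file does not
depend on the order in which messages arrive, and it can be read back on a
run with a different number of workers. In a real application the write
inside master_write would be replaced by the corresponding netCDF write
call.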