Re: netcdf4 parallel IO
David,
We probably confused you with our answers :-) Let me reiterate
what Quincey said in his email.
The bottom line is that HDF5 uses the blocking MPI_File_read_* and
MPI_File_write_* functions and does not use the non-blocking
MPI_File_iread_* and MPI_File_iwrite_* functions.
I/O for HDF5 metadata is always a collective call; raw data can
be read/written using noncollective calls (also called
"independent" in our docs) or collective calls.
HDF5 parallel performance depends on many factors and we
will be happy to work with you, Ed and LLNL's VisIt team to get the
best from parallel NetCDF-4.
Elena
At 1:48 PM -0400 4/27/07, David Stuebe wrote:
Wow!
Thank you all for the help with Netcdf4. I am really excited about the
great response. I have to do some work on my end to look at the state
of the F90 interface - I will look at the NETCDF 4 C and F90 interface
and the HDF5v1.8beta to see what needs to be done and where these
tools will be useful in the FVCOM model.
My questions about blocking during IO operations have to do with my
very vague understanding of what exactly happens during a parallel
read/write call. The problem I want to solve is that at the moment,
using netcdf 3 to write output for a given timestep, we have to
collect all the data on the master node, and then all the nodes have
to wait for the master to catch up after it finishes writing the data.
It seems the best solution might be to have a dedicated node for
writing data in compressed netcdf4 format, while using parallel reads
for forcing data?
One area that I have focused on in my work for FVCOM is parallel
visualization of our unstructured grid data using LLNL's VisIt. I will
have to talk to those folks and see what their plans are for
supporting netcdf4/HDF5 parallel IO. It sounds like a great
application for parallel reads of compressed data!
David
On 4/27/07, Quincey Koziol <koziol@xxxxxxxxxxxx> wrote:
On Apr 26, 2007, at 12:03 PM, Elena Pourmal wrote:
> David,
>
> NetCDF-4 is built on top of HDF5, which uses blocking MPI IO calls.
> We are thinking of implementing non-blocking calls for the HDF5
> metadata writes.
It depends on whether David is concerned about blocking during metadata
operations (which HDF5 does right now), or during dataset I/O (for
which the application can choose collective or independent parallel
I/O, neither of which is "blocking" in the same sense)...
Quincey
>
> Elena
>
> At 11:39 AM -0400 4/26/07, David Stuebe wrote:
>> Hi NETCDF folks
>>
>> I work on an unstructured finite volume coastal ocean model,
>> FVCOM, which is parallel (using MPICH2). Reading and writing is a
>> major slowdown for our large cases. On our cluster, we have one
>> large storage device, an EMC RAID array. The network is InfiniBand -
>> the network is much faster than the RAID array.
>>
>> For our model we need to read large initial condition data sets,
>> and single frames of forcing data while running. We also need to
>> write single frames of data for output (frequently), and large
>> restart files (less frequently).
>>
>> I am considering two options for recoding the IO from the model.
>> One is based around the future F90 netcdf 4 parallel interface
>> which would allow a symmetric code - every processor does the same
>> thing. The other option is to use netcdf 3, let the master
>> processor read/write the data and distribute it to each node - an
>> asymmetric coding.
>>
>> What I need to know - are netcdf 4 parallel IO operations blocking?
>>
>> The problem - the order of cells and nodes in our data set does
>> not allow for a simple start, count read format. A data array
>> might have dimensions (time,layers,cells). As an example, in a 2
>> processor case with 8 cells, proc1 has cells (1 2 5 7) while proc2
>> has cells (3 4 6 8) - write operations would have to be in a do
>> loop to write each cell individually from the processor that owns it.
>>
>> For a model with 300,000 cells on 30 processors, this would be
>> 10,000 calls to NF90_PUT_VAR on each processor. Even if the calls
>> are non-blocking this seems dangerous.
>>
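
The cell-ownership example above can be made concrete with a small sketch.
It assumes that each run of consecutive cell indices owned by a processor
could be written with one start/count call along the cells dimension; the
helper name contiguous_runs is hypothetical and is not part of netCDF or
FVCOM. It only counts how many such calls would be needed and performs no
I/O.

```python
def contiguous_runs(cells):
    """Group 1-based cell indices into (start, count) runs of consecutive cells."""
    runs = []
    for cell in sorted(cells):
        if runs and cell == runs[-1][0] + runs[-1][1]:
            # Cell directly follows the current run: extend it by one.
            start, count = runs[-1]
            runs[-1] = (start, count + 1)
        else:
            # Gap in ownership: a new start/count pair is needed.
            runs.append((cell, 1))
    return runs


# Ownership from the 2-processor, 8-cell example in the message.
proc1 = [1, 2, 5, 7]
proc2 = [3, 4, 6, 8]

print(contiguous_runs(proc1))  # [(1, 2), (5, 1), (7, 1)]
print(contiguous_runs(proc2))  # [(3, 2), (6, 1), (8, 1)]

# Worst case: no two owned cells are adjacent, so every cell is its own run.
cells_per_proc = 300_000 // 30
print(cells_per_proc)  # 10000
```

In the 8-cell example, grouping saves only one call per processor (three
runs instead of four cells). If ownership is scattered so that no two owned
cells are adjacent, the number of runs equals the number of cells, which
gives the 10,000 calls per processor estimated in the message.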
>> Any thoughts?
>>
>> David
>
>
> --
>
> ------------------------------------------------------------
> Elena Pourmal
> The HDF Group
> 1901 So First ST.
> Suite C-2
> Champaign, IL 61820
>
> epourmal@xxxxxxxxxxxx
> (217)333-0238 (office)
> (217)333-9049 (fax)
> ------------------------------------------------------------
>