Elena
At 11:39 AM -0400 4/26/07, David Stuebe wrote:
Hi netCDF folks,
I work on an unstructured finite volume coastal ocean model,
FVCOM, which is parallel (using MPICH2). Reading and writing data
is a major slowdown for our large cases. On our cluster, we have one
large storage device, an EMC RAID array. The network is InfiniBand,
and it is much faster than the RAID array.
For our model we need to read large initial condition data sets,
and single frames of forcing data while running. We also need to
write single frames of data for output (frequently), and large
restart files (less frequently).
I am considering two options for recoding the model's I/O.
One is based around the future F90 netCDF-4 parallel interface,
which would allow symmetric code: every processor does the same
thing. The other option is to use netCDF-3 and let the master
processor read/write the data and distribute it to each node, which
is an asymmetric design.
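A minimal sketch of the reordering step in the second (asymmetric) option, assuming the master has already collected each processor's values together with the global cell numbers that processor owns. It uses plain Python lists only; no MPI or netCDF calls are involved, and the function name, variable names, and values are hypothetical. The cell ownership is the 2-processor, 8-cell layout used as an example in this message.

```python
# Hypothetical illustration only: no MPI or netCDF calls, made-up values.

def assemble_global(local_values, owned_cells, n_cells):
    """Place each processor's values into one list in global cell order.

    local_values[p][i] is the value for global cell owned_cells[p][i],
    where cell numbers are 1-based.
    """
    global_values = [None] * n_cells
    for values, cells in zip(local_values, owned_cells):
        for value, cell in zip(values, cells):
            global_values[cell - 1] = value
    return global_values


# proc1 owns cells (1 2 5 7), proc2 owns cells (3 4 6 8).
owned_cells = [[1, 2, 5, 7], [3, 4, 6, 8]]
local_values = [[10.0, 20.0, 50.0, 70.0], [30.0, 40.0, 60.0, 80.0]]

global_values = assemble_global(local_values, owned_cells, n_cells=8)
print(global_values)
```

Once the values are in global order, the master could write the whole frame with a single contiguous put rather than one call per cell. The cost is that all of the data passes through one processor.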
What I need to know: are netCDF-4 parallel I/O operations blocking?
The problem: the order of cells and nodes in our data set does
not allow a simple start/count read. A data array
might have dimensions (time,layers,cells). As an example, in a
2-processor case with 8 cells, proc1 has cells (1 2 5 7) while proc2
has cells (3 4 6 8). Write operations would have to be in a do
loop that writes each cell individually from the processor that owns it.
For a model with 300,000 cells on 30 processors, this would be
10,000 calls to NF90_PUT_VAR on each processor. Even if the calls
are non-blocking, this seems dangerous.
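To make the start/count problem concrete, here is a small sketch that groups the cells a processor owns into contiguous (start, count) runs, using the 2-processor example above. Each run would correspond to one put call along the cells dimension. The helper name is hypothetical and this is plain Python, not netCDF code. It assumes nothing about how FVCOM actually partitions its mesh.

```python
# Hypothetical helper: groups owned cells into contiguous (start, count) runs.

def contiguous_runs(cells):
    """Return a list of (start, count) runs for 1-based cell indices."""
    runs = []
    for cell in sorted(cells):
        if runs and cell == runs[-1][0] + runs[-1][1]:
            # Cell directly follows the last run, so extend that run.
            start, count = runs[-1]
            runs[-1] = (start, count + 1)
        else:
            # Gap in the numbering, so a new run (a new call) is needed.
            runs.append((cell, 1))
    return runs


proc1_runs = contiguous_runs([1, 2, 5, 7])
proc2_runs = contiguous_runs([3, 4, 6, 8])
print(proc1_runs)  # [(1, 2), (5, 1), (7, 1)]
print(proc2_runs)  # [(3, 2), (6, 1), (8, 1)]

# Worst case from the message: no two owned cells are adjacent.
calls_per_processor = 300_000 // 30
print(calls_per_processor)  # 10000
```

In this example each processor needs three calls rather than four, because two of its cells happen to be adjacent. With a partition where few owned cells are adjacent in the global numbering, the count approaches one call per cell, which is the 10,000 figure above.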
Any thoughts?
David
--
------------------------------------------------------------
Elena Pourmal
The HDF Group
1901 So First ST.
Suite C-2
Champaign, IL 61820
epourmal@xxxxxxxxxxxx
(217)333-0238 (office)
(217)333-9049 (fax)
------------------------------------------------------------