Re: netcdf4 parallel IO

NOTE: The netcdf-hdf mailing list is no longer active. The list archives are made available for historical reasons.

To: netcdf-hdf@xxxxxxxxxxxxxxxx
Subject: Re: netcdf4 parallel IO
From: Quincey Koziol <koziol@xxxxxxxxxxxx>
Date: Fri, 27 Apr 2007 09:03:21 -0500


On Apr 26, 2007, at 12:03 PM, Elena Pourmal wrote:

David,
NetCDF-4 is built on top of HDF5, which uses blocking MPI-IO calls. We are thinking of implementing non-blocking calls for the HDF5 metadata writes.

It depends on whether David is concerned about blocking during metadata operations (where HDF5 does block right now) or during dataset I/O (where the application can choose between collective and independent parallel I/O, neither of which is "blocking" in the same sense)...


        Quincey

Elena

At 11:39 AM -0400 4/26/07, David Stuebe wrote:
Hi NETCDF folks
I work on an unstructured finite volume coastal ocean model, FVCOM, which is parallel (using MPICH2). Reading and writing is a major slowdown for our large cases. On our cluster, we have one large storage device, an EMC RAID array. The network is InfiniBand, and it is much faster than the RAID array.
For our model we need to read large initial condition data sets, and single frames of forcing data while running. We also need to write single frames of data for output (frequently), and large restart files (less frequently).
I am considering two options for recoding the I/O from the model. One is based around the future F90 netCDF-4 parallel interface, which would allow a symmetric code: every processor does the same thing. The other option is to use netCDF-3 and let the master processor read/write the data and distribute it to each node, an asymmetric coding.
What I need to know: are netCDF-4 parallel I/O operations blocking?
The problem: the order of cells and nodes in our data set does not allow for a simple start, count read format. A data array might have dimensions (time,layers,cells). As an example, in a 2 processor case with 8 cells, proc1 has cells (1 2 5 7) while proc2 has cells (3 4 6 8). Write operations would have to be in a do loop to write each cell individually from the processor that owns it.
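To make the start, count limitation concrete, here is a minimal Python sketch (not FVCOM or netCDF code) that groups a processor's owned cell indices into contiguous (start, count) runs, using the 8-cell example above. The helper name `contiguous_runs` is hypothetical. Each run would correspond to one start/count style write along the cells dimension, so the number of runs is the number of writes that processor would need.

```python
def contiguous_runs(cells):
    """Group 1-based cell indices into (start, count) runs.

    Hypothetical helper for illustration; assumes no duplicate indices.
    """
    runs = []
    for cell in sorted(cells):
        # Extend the last run if this cell directly follows it.
        if runs and cell == runs[-1][0] + runs[-1][1]:
            start, count = runs[-1]
            runs[-1] = (start, count + 1)
        else:
            runs.append((cell, 1))
    return runs


# Ownership from the 2 processor, 8 cell example in the message.
proc1_runs = contiguous_runs([1, 2, 5, 7])
proc2_runs = contiguous_runs([3, 4, 6, 8])

print(proc1_runs)  # [(1, 2), (5, 1), (7, 1)]
print(proc2_runs)  # [(3, 2), (6, 1), (8, 1)]
```

In this small case each processor needs three writes instead of four. If the partitioning leaves few adjacent cells on the same processor, the number of runs approaches the number of owned cells.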
For a model with 300,000 cells on 30 processors, this would be 10,000 calls to NF90_PUT_VAR on each processor. Even if the calls are non-blocking this seems dangerous.
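The call count, and how the asymmetric option avoids it, can be sketched in plain Python without MPI or netCDF. The function `assemble_global` and the sample values are hypothetical. The sketch assumes the worst case in which no two cells owned by a processor are adjacent in the file, and it stands in for the gather step with ordinary dictionaries; in the real model the data would arrive on the master through MPI.

```python
# Worst-case write count for the symmetric option:
# one NF90_PUT_VAR call per owned cell.
n_cells = 300_000
n_procs = 30
calls_per_proc = n_cells // n_procs
print(calls_per_proc)  # 10000


def assemble_global(owned_cells, local_values, n_cells):
    """Place each processor's values at their global 1-based cell positions.

    Hypothetical stand-in for what the master would do after gathering.
    """
    global_values = [None] * n_cells
    for rank, cells in owned_cells.items():
        for cell, value in zip(cells, local_values[rank]):
            global_values[cell - 1] = value
    return global_values


# Ownership from the 8-cell example; the values are made up for illustration.
owned_cells = {1: [1, 2, 5, 7], 2: [3, 4, 6, 8]}
local_values = {
    rank: [10.0 * cell for cell in cells] for rank, cells in owned_cells.items()
}

full = assemble_global(owned_cells, local_values, n_cells=8)
print(full)  # [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
```

Once the array is in global order, the master can write it with a single contiguous call, at the cost of moving all the data through one processor.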
Any thoughts?

David
--

------------------------------------------------------------
Elena Pourmal
The HDF Group
1901 So First ST.
Suite C-2
Champaign, IL 61820

epourmal@xxxxxxxxxxxx
(217)333-0238 (office)
(217)333-9049 (fax)
------------------------------------------------------------


