Re: netcdf4 parallel IO

NOTE: The netcdf-hdf mailing list is no longer active. The list archives are made available for historical reasons.

To: netcdf-hdf@xxxxxxxxxxxxxxxx
Subject: Re: netcdf4 parallel IO
From: John Caron <caron@xxxxxxxxxxxxxxxx>
Date: Mon, 30 Apr 2007 18:02:05 -0600

Hi David:

On an unstructured model, what determines the order that the elements have to 
be written to disk?

If you have 300,000 cells on 30 processors, could the first 10,000 cells be 
from processor 1, the next 10,000 be from processor 2, etc, or do these have to 
be interleaved on the output file?
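
The contiguous layout asked about here can be sketched as follows. This is only an illustration, not FVCOM or netCDF code: the `owner` array and the function name `contiguous_layout` are hypothetical, and the example uses the 2-processor, 8-cell partition from the quoted message below (0-based in the code). It reorders cells so that each processor's cells form one block on disk, which a single (start, count) pair can describe.

```python
from collections import defaultdict

def contiguous_layout(owner):
    """Group cells by owning processor.

    owner[i] is the processor that owns cell i (0-based cell index).
    Returns (perm, blocks):
      perm[k]      -> original cell index stored at file position k
      blocks[proc] -> (start, count) of that processor's block in the file
    """
    by_proc = defaultdict(list)
    for cell, proc in enumerate(owner):
        by_proc[proc].append(cell)

    perm = []
    blocks = {}
    for proc in sorted(by_proc):
        # This processor's block starts where the previous one ended.
        blocks[proc] = (len(perm), len(by_proc[proc]))
        perm.extend(by_proc[proc])
    return perm, blocks

# 1-based cells (1 2 5 7) belong to proc 1, (3 4 6 8) to proc 2.
owner = [1, 1, 2, 2, 1, 2, 1, 2]
perm, blocks = contiguous_layout(owner)
```

With this layout each processor would need one write per variable, at the cost of storing `perm` alongside the data so readers can recover the original cell numbering.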

John

David Stuebe wrote:

Hi NETCDF folks
I work on an unstructured finite volume coastal ocean model, FVCOM, which is parallel (using MPICH2). Reading and writing are a major slowdown for our large cases. On our cluster, we have one large storage device, an EMC RAID array. The network is InfiniBand; the network is much faster than the RAID array.
For our model we need to read large initial condition data sets, and single frames of forcing data while running. We also need to write single frames of data for output (frequently), and large restart files (less frequently).
I am considering two options for recoding the IO from the model. One is based around the future F90 netcdf 4 parallel interface, which would allow a symmetric code: every processor does the same thing. The other option is to use netcdf 3 and let the master processor read/write the data and distribute it to each node, an asymmetric coding.
What I need to know: are netcdf 4 parallel IO operations blocking?
The problem: the order of cells and nodes in our data set does not allow for a simple start, count read format. A data array might have dimensions (time, layers, cells). As an example, in a 2 processor case with 8 cells, proc1 has cells (1 2 5 7) while proc2 has cells (3 4 6 8). Write operations would have to be in a do loop to write each cell individually from the processor that owns it.
For a model with 300,000 cells on 30 processors, this would be 10,000 calls to NF90_PUT_VAR on each processor. Even if the calls are non-blocking this seems dangerous.
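
One way to see how many writes the interleaved ordering actually requires is to collapse each processor's cell list into contiguous runs, since each run can be described by one (start, count) pair. The sketch below is illustrative only: the helper name `contiguous_runs` is hypothetical, and it makes no netCDF calls. It uses the 8-cell example above with 1-based cell numbers.

```python
def contiguous_runs(cells):
    """Collapse cell indices into (start, count) runs of consecutive cells."""
    runs = []
    for c in sorted(cells):
        if runs and c == runs[-1][0] + runs[-1][1]:
            # c directly follows the last run, so extend that run by one.
            start, count = runs[-1]
            runs[-1] = (start, count + 1)
        else:
            # Gap before c: begin a new run of length one.
            runs.append((c, 1))
    return runs

proc1_runs = contiguous_runs([1, 2, 5, 7])
proc2_runs = contiguous_runs([3, 4, 6, 8])
```

The number of writes per processor then equals the number of runs, which lies between one (fully contiguous ownership) and the number of owned cells (no two owned cells adjacent). How close a real partition comes to either extreme depends on the cell numbering and the domain decomposition.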
Any thoughts?

David



References:
- netcdf4 parallel IO
  - From: David Stuebe
