
Re: 980327: netCDF, T3E and VPP
To: Steve Luzmoor <address@hidden>, address@hidden (John Sheldon), address@hidden (Mark Reed), address@hidden, address@hidden (Venkatramani Balaji)



> To: address@hidden
> From: " (IPSL)" <address@hidden>
> Subject: netCDF , T3E and VPP
> Organization: .
> Keywords: 199803271739.KAA29100
>
> Hello,
>
> I'm working in France for climate modelling research.
>
> We plan to use netCDF to standardize outputs from different kinds of models.
>
> Models are running on :
> - Cray C90: no problem with netCDF, of course.
> - Fujitsu VPP300: no problem with netCDF if we use just one processor.
> - Cray T3E: no problem with netCDF if we use just one processor.
>
> My question is about parallelism.
>
> Is it possible to use netCDF to write one file
> from different processors?
> The documentation says no; is that really the case,
> even with the NF_SHARE parameter in NF_CREATE and
> a barrier and NF_SYNC before and after each write?
>
> The models work on specific geographical regions.
> For example, with 2 processors, one handles the Northern hemisphere
> and the other the Southern. Each processor knows that it alone
> writes this kind of data.
>
> If it's not possible, what are your plans?
> At NERSC, it is possible to use a parallel netCDF, but only version 2.4...
>
> Thank you for your help.
> Best regards.
>
> Marie-Alice Foujols
>
> Tél: 01 44 27 61 70
> FAX: 01 44 27 61 71
> mailto:address@hidden
> http://www.ipsl.jussieu.fr/~mafoipsl
>
>                   Pôle de Modélisation de l'IPSL
>  Institut Pierre Simon Laplace des Sciences de l'Environnement Global
>         CNRS - UPMC - UVSQ - CEA - CNES - ORSTOM - ENS - X
>             CETP - LSCE - LMD - LODYC - LPCM - SA
>
> Postal address:
> Pôle de Modélisation de l'IPSL - Institut Pierre Simon Laplace
> Case 102, U.P.M.C., 4 place Jussieu, 75252 Paris Cedex 5, France
>
> Essential for finding us at Jussieu:
> Tower 26 - 4th floor - corridor 26-16 - doors 28 or 30

Marie-Alice:

(I am sending this reply to other T3E netcdf users as well.)

In netcdf-3.4, we eliminated some problems that would interfere with parallel
execution, but the code is not completely "re-entrant", "thread-safe", or
safe for parallel execution as it stands. A few regions of code would need
to be protected by synchronization primitives. We have not added the
necessary protection, because we have not seen how to do it in a portable
way. On any given (single) system, it should not be too difficult.

Netcdf-3 in general and netcdf-3.4 in particular should be safer for
parallel processing than any netcdf-2 version.

The critical sections are as follows. All the file references are in
src/libsrc.

1) In nc.c, there is a global linked list of top-level netcdf data structures
called 'NC_list'. Every call references this list to translate between the
external netcdf "id" and the data structure used internally. Write (exclusive)
lock protection is needed in the functions add_to_NCList() and
del_from_NCList() to ensure that the list remains consistent.
This is only a problem in programs where different threads of
execution open different files.
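
The write-lock protection described in item 1 could be sketched as below.
The struct layout, list head, and function bodies are illustrative
stand-ins for the real declarations in nc.c, not the library's actual code;
only the locking pattern is the point.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical mirror of the NC_list bookkeeping in nc.c. */
struct NC_node {
    int ncid;                 /* the external netcdf "id" */
    struct NC_node *next;
};

static struct NC_node *NC_list = NULL;
static pthread_mutex_t NC_list_mtx = PTHREAD_MUTEX_INITIALIZER;

/* add_to_NCList() under an exclusive lock: push onto the list head. */
void add_to_NCList(struct NC_node *ncp)
{
    pthread_mutex_lock(&NC_list_mtx);
    ncp->next = NC_list;
    NC_list = ncp;
    pthread_mutex_unlock(&NC_list_mtx);
}

/* del_from_NCList() under the same lock: unlink the node, keeping
 * the list consistent for concurrent openers of other files. */
void del_from_NCList(struct NC_node *ncp)
{
    pthread_mutex_lock(&NC_list_mtx);
    for (struct NC_node **pp = &NC_list; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == ncp) {
            *pp = ncp->next;
            break;
        }
    }
    pthread_mutex_unlock(&NC_list_mtx);
}
```

A single mutex suffices here because opens and closes are rare compared
to reads and writes; a reader/writer lock would be the next refinement.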

2) In the netcdf data structure, 'struct NC' in nc.h, there is a member
'size_t numrecs'. It holds the current value of the most slowly
varying index of the unlimited variables. In the current file format,
all the unlimited variables share this number, and it is "grow only".
References to this member need to be synchronized.
Although they appear all over the place, I believe the accesses that need
synchronization are in putget.m4, in the functions NCvnrecs() and
NCcoordck(). This could be a problem in programs which
        have unlimited variables
        AND are growing the data set (i.e., the most slowly varying index
                of the unlimited variables is increasing)
        AND have different threads of execution causing the growth and/or
            referencing the newly added data.
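
The grow-only synchronization this item calls for could look like the
following sketch. 'grow_numrecs' and 'coord_ok' are hypothetical stand-ins
for what NCvnrecs() and NCcoordck() would need to do under a lock; they are
not the putget.m4 functions themselves.

```c
#include <pthread.h>
#include <stddef.h>

/* Illustrative stand-in for the 'numrecs' member of struct NC:
 * a grow-only record count that several threads may extend. */
static size_t numrecs = 0;
static pthread_mutex_t numrecs_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Grow numrecs to at least 'want', never shrinking it ("grow only"),
 * and return the value that is now current. */
size_t grow_numrecs(size_t want)
{
    pthread_mutex_lock(&numrecs_mtx);
    if (want > numrecs)
        numrecs = want;
    size_t cur = numrecs;
    pthread_mutex_unlock(&numrecs_mtx);
    return cur;
}

/* Coordinate check against the current record count, taken under the
 * same lock so a concurrent grower cannot race the comparison. */
int coord_ok(size_t coord)
{
    pthread_mutex_lock(&numrecs_mtx);
    int ok = coord < numrecs;
    pthread_mutex_unlock(&numrecs_mtx);
    return ok;
}
```

Because the counter only grows, a stale read can at worst reject a
coordinate that just became valid; it can never admit an invalid one.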

3) Finally, there is the issue of I/O. Most system file-access libraries,
like UNIX read() and write(), have a notion of "current position" in the
file. This piece of state is usually hidden inside the file descriptor and
is a major barrier to writing parallel code. It must be an anachronism going
back to the days when everything was on tape. On a CRAY T3E, the ffio spec
"global,privpos" may deal with this problem.
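
On systems that provide them, positional I/O calls sidestep the shared
current position entirely: pread() and pwrite() take an explicit offset and
never read or update the descriptor's hidden state. A minimal sketch, with
the slab size and rank-to-offset mapping invented for the two-hemisphere
example in the question:

```c
#include <fcntl.h>
#include <unistd.h>

/* Two writers, "north" (rank 0) and "south" (rank 1), each own a
 * disjoint byte range of one shared file. pwrite() targets an explicit
 * offset, so the writers never race on the descriptor's current
 * position the way lseek()+write() would. */
enum { SLAB = 8 };   /* made-up slab size for the example */

int write_hemisphere(int fd, int rank, const char *data)
{
    /* rank 0 -> bytes [0, SLAB), rank 1 -> bytes [SLAB, 2*SLAB) */
    off_t off = (off_t)rank * SLAB;
    return pwrite(fd, data, SLAB, off) == SLAB ? 0 : -1;
}
```

The same idea applies to reads via pread(); since neither call touches the
shared offset, the two ranks can issue their slabs in any order.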

The netcdf library has an I/O layer, ncio.h, which was designed to support
parallel access. Its primitive operations, get() and rel(), can contain
primitives to properly synchronize access to regions of a file.
The get() operation maps an offset and extent in the file into memory,
and would "lock" that region of the file until the corresponding call to
rel(); non-overlapping regions could then be accessed in parallel.
All of the higher layers of netcdf use this layer and obey this protocol.
Unfortunately, neither of the ncio implementations we provide, posixio.c
or ffio.c, enforces these semantics.
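
The get()/rel() protocol could be enforced with POSIX advisory byte-range
locks, as in the sketch below. This is an illustration of the idea, not the
actual posixio.c code; note that fcntl() locks only guard against other
processes, not threads within one process.

```c
#include <fcntl.h>
#include <unistd.h>

/* get(): acquire an exclusive byte-range lock on [off, off+extent)
 * before touching the region; blocks until the region is free, so
 * non-overlapping regions proceed in parallel across processes. */
int region_get(int fd, off_t off, off_t extent)
{
    struct flock fl = {0};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = off;
    fl.l_len = extent;
    return fcntl(fd, F_SETLKW, &fl);
}

/* rel(): release the lock on the same byte range. */
int region_rel(int fd, off_t off, off_t extent)
{
    struct flock fl = {0};
    fl.l_type = F_UNLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = off;
    fl.l_len = extent;
    return fcntl(fd, F_SETLKW, &fl);
}
```

Since the higher layers already pass an offset and extent to get(), this
would slot in underneath without changing their protocol.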

In summary, if you

        do all open and close operations on a single PE, before and after
        the parallel sections,

        set the file size in the single-processor initialization section, and

        use "global,privpos",

it might work.

-glenn