Re: netCDF parallel on T3E



> I did take a look at seeing if "global I/O" would work with 3.3.1
> or 3.4 on the NERSC T3E today.
>
> I used the "ncdftest.F" parallel test that I extracted from an LLNL
> physics code a couple years ago, and I think I may have previously forwarded
> to Unidata in one of my updates.  On multiple PEs, it either deadlocks in:
>
>  Beginning of Traceback (PE 0):
>   Interrupt at address 0x8001282fc in routine '_shmem_swap'.
>   Called from line 772 (address 0x8000c13b4) in routine '_glio_set_lock'.
>   Called from line 1261 (address 0x8000beee8) in routine '_par_get_pe_dirty_page'.
>   Called from line 870 (address 0x8000bd264) in routine '_glob_flush'.
>   Called from line 43 (address 0x8000b4d88) in routine 'ffflush'.
>   Called from line 298 (address 0x80000d194) in routine 'ncio_ffio_sync'.
>   Called from line 363 (address 0x80000a8f0) in routine 'write_NC'.
>   Called from line 790 (address 0x80000b310) in routine 'NC_endef'.
>   Called from line 959 (address 0x80000bb24) in routine 'nc_enddef'.
>   Called from line 214 (address 0x800039d54) in routine 'ncendef'.
>   Called from line 388 (address 0x8000462d8) in routine 'c_ncendf'.
>   Called from line 394 (address 0x80004639c) in routine 'NCENDF'.
>   Called from line 88 (address 0x80000183c) in routine 'NCDFTEST'.
>   Called from line 475 (address 0x800000c98) in routine '$START$'.
>  End of Traceback.
>
> (with setenv NETCDF_FFIOSPEC global.privpos)
>
> or core dumps like:
>
> SIGNAL: Operand range error ( [0] memory management fault)
>
>  Beginning of Traceback (PE 1):
>   Interrupt at address 0x80008c270 in routine 'memcpy'.
>   Called from line 2002 (address 0x80001ba84) in routine 'ncx_putn_schar_schar'.
>   Called from line 1201 (address 0x80005dd84) in routine 'ncx_put_NC'.
>   Called from line 877 (address 0x800011ac0) in routine 'nc__create'.
>   Called from line 900 (address 0x800011c88) in routine 'nc_create'.
>   Called from line 153 (address 0x80005e8fc) in routine 'nccreate'.
>   Called from line 265 (address 0x800076844) in routine 'c_nccre'.
>   Called from line 281 (address 0x800076bc0) in routine 'NCCRE'.
>   Called from line 66 (address 0x800001588) in routine 'NCDFTEST'.
>   Called from line 475 (address 0x800000c98) in routine '$START$'.
>  End of Traceback.
> Operand range error(coredump)
>
> with other NETCDF_FFIOSPEC "global" settings.
>
> The above sorts of glitches can take quite a bit of time to sort
> out, particularly if there is a strange race condition that netCDF
> has helped expose (possibly the first example) or something
> is getting overwritten (possibly the second).
>
> Note, both "3.3.1" and "3.4" show the same symptoms with "Global I/O".

I can't tell from the information presented whether this
NCENDF is the 'initial' end definition after a create, or
an end definition after redefinition. The second case is more complex.

This raises an issue which I failed to mention in my previous discussion.
While in 'define mode' (between nc_create() or nc_redef() and nc_enddef()),
the entire in-memory netcdf structure (struct NC) becomes read-write.
At other times (most of the time) it is read-only, except for the 'numrecs'
field which I discussed before.
So, there should be exclusive (read-write) locks on the structure for
the whole definition sequence (nc_create() or nc_redef() through nc_enddef()).
Finer grained locking would probably be possible, but more trouble than
it is worth.

I don't know exactly why this would have worked in netcdf-2 but fails now.
I would say that it was probably 'luck' that it worked in netcdf-2, since
the general situation described in the previous paragraph was true there
as well.

The I/O which occurs in netcdf redefinition (nc_enddef() after nc_redef())
is quite different in netcdf-3 than in netcdf-2.
*Note* Redefinition should be avoided wherever possible.
It almost always forces a copy of the entire file.
In netcdf-2, a redef call opened a new file to copy into, and
'unlinked' the old. In netcdf-3, the copy is done in place, like a
file-based 'memmove()'.
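
For readers who have not seen the idea, a file-based 'memmove()' can be
sketched as below. This is an illustration only, not the netcdf-3
implementation: the function name 'file_memmove' and the tiny buffer size
are my own assumptions. The key point is the copy direction. When data
moves toward higher offsets (as when the header grows), the last chunk is
copied first so that unread source bytes are never overwritten.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: move len bytes from offset 'from' to offset 'to'
 * inside one file opened for update. Safe when the ranges overlap.
 * Returns 0 on success, -1 on an I/O error. */
int file_memmove(FILE *fp, long to, long from, size_t len)
{
    char buf[4];             /* deliberately tiny so several chunks are used */
    size_t done = 0;

    while (done < len) {
        size_t n = len - done;
        long off;

        if (n > sizeof buf)
            n = sizeof buf;

        /* Moving up: take chunks from the end of the range backwards.
         * Moving down: take chunks from the start forwards. */
        off = (to > from) ? (long)(len - done - n) : (long)done;

        if (fseek(fp, from + off, SEEK_SET) != 0)
            return -1;
        if (fread(buf, 1, n, fp) != n)
            return -1;
        if (fseek(fp, to + off, SEEK_SET) != 0)
            return -1;
        if (fwrite(buf, 1, n, fp) != n)
            return -1;

        done += n;
    }
    return fflush(fp) == 0 ? 0 : -1;
}
```

For example, on a file holding a 3-byte header followed by 10 bytes of data,
calling file_memmove(fp, 5, 3, 10) shifts the data up by two bytes and leaves
room for a 5-byte header. Every byte of data is still read and rewritten,
which is why redefinition is costly on a large file.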

Hope this helps.

-glenn