Re: [netcdfgroup] status of thread safety

>  Unfortunately HDF5's thread safe config is officially mutually
> exclusive with the HDF5 HL API used by NetCDF.

I was unaware of this. My understanding was that the thread-safe
HDF5 build operates by providing a global lock that serializes all
accesses.  Assuming I am correct, you seem to be saying that this
lock is not used at the HL API level.

To date WRT netcdf, there have been two notions of thread-safe:
1. Allow multiple threads to operate as long as they are operating
   on different files.
2. Allow multiple threads to operate on the same file.

#1 is doable -- just time consuming to implement.
In fact I have a netcdf-c branch that should allow this for
netcdf 3 (classic) files. The approach is to isolate all mutable
global state used by the library and surround operations on that
state (both read and write) with a mutex lock. Since none of the
state accesses are all that long, this should not affect
performance very much. Note that an implicit assumption is that
all C library calls (esp. malloc) are or can be made thread-safe.
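
As a rough illustration of the approach described above (not the code
in the actual netcdf-c branch), a minimal sketch using POSIX threads
might look like the following. The file table and the names
file_table_add, file_table_find, file_table_remove, and MAX_OPEN_FILES
are hypothetical stand-ins for whatever mutable global state the
library really keeps.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_OPEN_FILES 64  /* hypothetical limit, for illustration only */

/* Hypothetical stand-in for the library's mutable global state:
   a table mapping small integer ids to per-file objects. */
static void *open_files[MAX_OPEN_FILES];

/* One mutex guards every access to the table, reads and writes alike. */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

/* Store a per-file object in the first free slot.
   Returns the slot id, or -1 if the table is full. */
int file_table_add(void *file)
{
    int id = -1;
    int rc = pthread_mutex_lock(&global_lock);
    assert(rc == 0);
    (void)rc;
    for (int i = 0; i < MAX_OPEN_FILES; i++) {
        if (open_files[i] == NULL) {
            open_files[i] = file;
            id = i;
            break;
        }
    }
    pthread_mutex_unlock(&global_lock);
    return id;
}

/* Look up a per-file object. The lock is held only for the lookup,
   not for any subsequent I/O on the file itself. */
void *file_table_find(int id)
{
    void *file;
    if (id < 0 || id >= MAX_OPEN_FILES)
        return NULL;
    pthread_mutex_lock(&global_lock);
    file = open_files[id];
    pthread_mutex_unlock(&global_lock);
    return file;
}

/* Release a slot so it can be reused by a later add. */
void file_table_remove(int id)
{
    if (id < 0 || id >= MAX_OPEN_FILES)
        return;
    pthread_mutex_lock(&global_lock);
    open_files[id] = NULL;
    pthread_mutex_unlock(&global_lock);
}
```

Because the lock covers only the short table operations, threads
working on different files contend only briefly. The per-file objects
themselves are not protected here, which is why this sketch addresses
only case #1.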

This approach might also work for netcdf-4 files
except that we are limited by what the HDF5 library does.
If their API is globally serialized, then our locking regime
will not help.

There is no obvious reason AFAIK why the HDF5 library could not
be modified to do a similar isolation of global state. Note this
issue crops up for the pnetcdf library also.

#2 is much harder and would require significant refactoring of
any library that attempted it. The reason is that access to EVERY
piece of state (global or not) must be made thread-safe.

Finally, note that this issue is largely independent of parallel IO
using e.g. MPIO.

I look forward to further discussion of this issue, especially
any complications I might be overlooking.

=Dennis Heimbigner
 Unidata


On 7/18/2016 12:24 PM, Burlen Loring wrote:
Hi All,

Just wanted to voice concern about the status of thread safety in NetCDF
4 HDF5. The locking strategy we've successfully used with NetCDF classic
is not sufficient for NetCDF 4 with HDF5. In addition to our locking
strategy HDF5 needs to be compiled with a thread safe option.
Unfortunately HDF5's thread safe config is officially mutually exclusive
with the HDF5 HL API used by NetCDF. When HDF5 is forced to compile with
thread safety and HDF5 HL API, our threaded code runs without issue. It
also performs well, which is important. My concern is the fact that we
now rely upon a build configuration that is officially unsupported by HDF5.

Given the continual evolution toward many-core architectures, the
horrendous latency of modern parallel file systems on supercomputing
platforms, and the fact that we have to deal with datasets structured
such that latency is a major issue, threading is ever more critical.
It's really important that we have a viable path to thread safety that
is officially supported by HDF5 and performant. We don't want to be
facing problems down the road due to use of the unsupported HDF5
config. Using the unsupported config creates a deployment issue, as
we'd like to rely on HDF5 installed at HPC centers or in official Linux
distros, neither of which will likely be compiling HDF5 in an
unsupported configuration. I also believe that for the best performance
locking is better done at the lowest level, where it can be fine
grained; hence locking all NetCDF I/O in our application is
undesirable.

I'm hoping that this conversation can be a data point that people are
using threads to speed processing of large datasets on parallel file
systems. It's important for us to have an officially supported thread
safe option for NetCDF 4 HDF5 format.

Burlen



_______________________________________________
NOTE: All exchanges posted to Unidata maintained email lists are
recorded in the Unidata inquiry tracking system and made publicly
available through the web.  Users who post to any of the lists we
maintain are reminded to remove any personal information that they
do not want to be made public.


netcdfgroup mailing list
netcdfgroup@xxxxxxxxxxxxxxxx
For list information or to unsubscribe,  visit:
http://www.unidata.ucar.edu/mailing_lists/
