Hi,

I thought I'd give netcdf 4.7.4 a try for compression with parallel IO (using hdf5 1.10.7, pnetcdf 1.9.0, netcdf-fortran-4.5.3) on a NOAA cluster. I've been using Intel 19 with mvapich2.3, which worked fine with earlier versions (4.3.something).

The problem I have is that it works fine on a single node, but I get various failures when trying to run a job that uses 2 or more nodes. It also fails if the IO is not parallel (standard netcdf-4 where each process writes its data in turn).

I have also compiled everything (including the cloud model code) using Intel MPI, which fails promptly with a seg fault when it tries to run on 2 nodes. (Here, I am comparing 4 or 9 threads on a single node with 16 threads split across 2 nodes. If I force the 16-thread version to run on a single node, it runs fine.)

The problem seems to be reproducible with a simple write/read test adapted from ftst_parallel.F, so it does not seem to be specific to my model code. It fails with both pnetcdf and mpiio.

Any ideas what could be the issue here? I am stumped.

-- Ted