Hi Heiko,

Parallel I/O to the classic netCDF format is supported by netCDF through
PnetCDF underneath. It allows you to write concurrently to a single shared
file from multiple MPI processes. Of course, you will have to build PnetCDF
first and then build netCDF with the --enable-pnetcdf configure option.

Your netCDF program does not need many changes to use this feature. All you
have to do is the following (a minimal sketch is given after the list):

1. Call nc_create_par() instead of nc_create().
2. Add NC_PNETCDF to the create-mode argument of nc_create_par().
3. Call nc_var_par_access(ncid, varid, NC_COLLECTIVE) after nc_enddef() to
   enable collective I/O mode.

A couple of example codes are available at this URL:
http://cucis.ece.northwestern.edu/projects/PnetCDF/#InteroperabilityWithNetCDF4
There are instructions in each example file for building netCDF with PnetCDF.

For downloading PnetCDF, please see
http://cucis.ece.northwestern.edu/projects/PnetCDF/download.html
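For illustration, a minimal sketch of those three steps (untested; it assumes
a netCDF library built with --enable-pnetcdf, a hypothetical 2-D variable, and
one non-overlapping row written per MPI rank; error checking omitted):

    #include <mpi.h>
    #include <netcdf.h>
    #include <netcdf_par.h>   /* declares nc_create_par() and NC_COLLECTIVE */

    #define NX 10

    int main(int argc, char **argv)
    {
        int ncid, varid, dimids[2], rank, nprocs, i;
        size_t start[2], count[2];
        float buf[NX];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* steps 1 and 2: every process creates/opens one shared file */
        nc_create_par("testfile.nc", NC_CLOBBER | NC_PNETCDF,
                      MPI_COMM_WORLD, MPI_INFO_NULL, &ncid);

        nc_def_dim(ncid, "y", nprocs, &dimids[0]);
        nc_def_dim(ncid, "x", NX, &dimids[1]);
        nc_def_var(ncid, "data", NC_FLOAT, 2, dimids, &varid);
        nc_enddef(ncid);

        /* step 3: switch this variable to collective I/O mode */
        nc_var_par_access(ncid, varid, NC_COLLECTIVE);

        /* each rank writes its own row; the regions do not overlap */
        start[0] = rank; start[1] = 0;
        count[0] = 1;    count[1] = NX;
        for (i = 0; i < NX; i++) buf[i] = (float)rank;
        nc_put_vara_float(ncid, varid, start, count, buf);

        nc_close(ncid);
        MPI_Finalize();
        return 0;
    }

Compile with your MPI compiler wrapper (e.g. mpicc) and link against netCDF;
the exact link line depends on how netCDF and PnetCDF were installed.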
Wei-keng

On Sep 21, 2015, at 9:14 AM, Heiko Klein wrote:

> Hi Nick,
>
> Yes, they are all writing to the same file - we want to have one file at
> the end.
>
> I've been scanning through the source code of netCDF-3. I guess the
> problem of the partly written sections is caused by the translation of
> the nc_put_vara calls to internal pages, and then from the internal
> pages to disk. And possibly the internal pages are not aligned with my
> nc_put_vara calls, so even when the regions of the nc_put_vara calls
> don't overlap between concurrent calls, the internal pages do? Is there
> a way to enforce proper alignment? I see nc__enddef has several align
> parameters (see the sketch at the end of this thread).
>
> I'm aware that concurrent writes are not officially supported by the
> netCDF library. But IT infrastructure has changed a lot since the start
> of the netCDF library, and systems are nowadays highly parallelized,
> both in CPUs and in I/O and filesystems. I'm trying to find a way to
> allow for simple parallelization. Having many output files from a model
> is risky for data consistency, so I would like to avoid that without
> sacrificing too much speed.
>
> Best regards,
>
> Heiko
>
> On 2015-09-21 15:18, Nick Papior wrote:
>> So, are they writing to the same files?
>>
>> I.e. job1 writes a(:,1) to test.nc and job2 writes a(:,2) to test.nc?
>> Because that is not allowed.
>>
>> 2015-09-21 15:13 GMT+02:00 Heiko Klein <Heiko.Klein@xxxxxx>:
>>
>> Hi,
>>
>> I'm trying to convert about 90 GB of NWP data four times daily from
>> GRIB to netCDF. The GRIB files arrive as fast as the data can be
>> downloaded from the HPC machines. They come as 10 files per forecast
>> timestep.
>>
>> Currently, I manage to convert 1 file per forecast timestep, and I
>> would like to parallelize the conversion into independent jobs (i.e.
>> neither MPI nor OpenMP), with a theoretical performance increase of 10.
>> The underlying I/O system is fast enough to handle 10 jobs, and I have
>> enough CPUs, but the concurrently written netCDF files show data which
>> is only half written to disk, or mixed with other slices.
>>
>> What I do is create a _FillValue 'template' file containing all
>> definitions before the NWP job runs. When a new set of files arrives,
>> the data is written to the respective data slices, which don't have any
>> overlap; there is never a redefine, only functions like nc_put_vara_*
>> with different slices.
>>
>> Since the nc_put_vara_* calls are non-overlapping, I hoped that this
>> type of concurrent write would work - but it doesn't. Is my idea of
>> writing data in parallel really so bad (e.g. are there internal buffers
>> which get rewritten)? Any ideas how to improve the conversion process?
>>
>> Best regards,
>>
>> Heiko
>>
>> _______________________________________________
>> netcdfgroup mailing list
>> netcdfgroup@xxxxxxxxxxxxxxxx
>> For list information or to unsubscribe, visit:
>> http://www.unidata.ucar.edu/mailing_lists/
>>
>> --
>> Kind regards Nick
>
> --
> Dr. Heiko Klein         Norwegian Meteorological Institute
> Tel. +47 22 96 32 58    P.O. Box 43 Blindern
> http://www.met.no       0313 Oslo NORWAY
>
> _______________________________________________
> netcdfgroup mailing list
> netcdfgroup@xxxxxxxxxxxxxxxx
> For list information or to unsubscribe, visit:
> http://www.unidata.ucar.edu/mailing_lists/
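For reference, the nc__enddef() variant mentioned in the thread takes four
extra layout parameters in place of nc_enddef(). A minimal sketch, with
arbitrary illustrative values rather than recommendations:

    /* drop-in replacement for nc_enddef(ncid) */
    nc__enddef(ncid,
               8192,    /* h_minfree: free bytes to leave after the file header   */
               4096,    /* v_align:   byte alignment of the fixed-size data start */
               0,       /* v_minfree: free bytes to leave before the record data  */
               4096);   /* r_align:   byte alignment of the record data section   */

These parameters only influence the on-disk layout of a classic netCDF file;
they are not a supported way to make concurrent writes from independent
processes safe, which is the gap the parallel API above addresses.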