
[netCDF #TCU-710461]: netcdf on lustre

Hi Jesse,

> It's netcdf 3.6.3.  The source for the program writing is attached.  All
> it's doing is taking a few netcdf files (48 in this case) and combining
> them.
> If we change the operation so that the written file is outside of the
> lustre file system, such as symbolically linking the output file to
> /dev/shm or even a local file system, the write occurs in less than 20
> seconds.  If the output file is on the lustre file system, it takes
> about 5 minutes.
> We do have parallel netcdf installed, but the scientists would have to
> link their models against it of course.  If the performance of parallel
> netcdf was sufficiently high it would be an easy sell.

It looks like this is exactly the problem first identified in the NCO software
and later verified with nccopy, which is fixed in the latest releases of nccopy
and NCO.  I'm assuming all the input files have an unlimited ("record")
dimension, and that the merged output file does also.

The problem is that the program writes output variables a variable-at-a-time,
when it should instead be writing output records a record-at-a-time, with all
the variable data written for each record, before advancing to the next record.

Using the first strategy is slow when you have a lot of record variables and a
large disk block size (such as on Lustre file systems).  That's because a disk
block is typically larger than a record's worth of data for one variable, so
writing a variable-at-a-time ends up rewriting the same disk block multiple
times, once for each variable whose data is included in that disk block.
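To make the rewriting effect concrete, here is a toy model in Python. It is
only a sketch under stated assumptions: it ignores the file header, it treats
the file system as a write-back cache that holds a single block, and the
sizes (64 records, 8 record variables, 1 KiB per variable per record, 64 KiB
blocks) are made up for illustration. The one thing taken from the classic
netCDF format is that record data is interleaved, so each record holds one
slab for every record variable.

```python
def block_flushes(write_order, slab_bytes, n_vars, block_bytes):
    """Count block flushes in a toy one-block write-back cache.

    write_order is a sequence of (record, variable) index pairs.
    Record data is assumed interleaved: record r holds one slab per variable.
    """
    record_bytes = slab_bytes * n_vars
    flushes = 0
    current = None  # block currently held dirty in the toy cache
    for rec, var in write_order:
        start = rec * record_bytes + var * slab_bytes
        first = start // block_bytes
        last = (start + slab_bytes - 1) // block_bytes
        for block in range(first, last + 1):
            if block != current:
                if current is not None:
                    flushes += 1  # moving on, so the old block is written out
                current = block
    if current is not None:
        flushes += 1  # final flush of the last dirty block
    return flushes


def variable_at_a_time(n_recs, n_vars):
    """Outer loop over variables, inner loop over records (the slow order)."""
    return [(r, v) for v in range(n_vars) for r in range(n_recs)]


def record_at_a_time(n_recs, n_vars):
    """Outer loop over records, inner loop over variables (the fast order)."""
    return [(r, v) for r in range(n_recs) for v in range(n_vars)]


# Hypothetical sizes, chosen only to make the arithmetic easy to follow.
N_RECS, N_VARS = 64, 8
SLAB_BYTES, BLOCK_BYTES = 1024, 65536

slow = block_flushes(variable_at_a_time(N_RECS, N_VARS),
                     SLAB_BYTES, N_VARS, BLOCK_BYTES)
fast = block_flushes(record_at_a_time(N_RECS, N_VARS),
                     SLAB_BYTES, N_VARS, BLOCK_BYTES)
print("variable-at-a-time flushes:", slow)  # 64
print("record-at-a-time flushes:  ", fast)  # 8
```

In this model the record data occupies 8 blocks. Writing a record at a time
flushes each block once, while writing a variable at a time flushes each block
once per variable, so the ratio equals the number of record variables. Real
file systems cache more than one block, so treat the numbers as an
illustration of the access pattern rather than a performance prediction.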

The problem and its solution are explained in more detail here, starting with
the seventh posting in the forum:


You have several options, depending on whether you need an unlimited dimension
in the output (merged) file:

  1.  If you don't need an unlimited dimension in the output, define the
      dimension corresponding to the unlimited dimensions in the input to
      be of fixed size (probably just the sum of the unlimited dimension
      sizes in the input).  Then writing a variable at a time will be fast.

  2.  If you still need an unlimited dimension in the output, change the 
      order of the nested loops and the start and count vectors so that the
      record dimension is the outside loop.  For each record, read a record's
      worth of data from all the input files that include data for that record
      and all associated record variables.  This will also greatly speed up
      the program on Lustre file systems, or any file system with a large
      disk block size.

  3.  Consider using a package such as NCO (or NCL or CDO) to do the data
      concatenation for you.  This problem is well-solved in the NCO 
      operators, may be solved in NCL, and I don't know about CDO.
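For option 2, the reordered loops can be sketched in Python as a generator
that produces the start and count vectors with the record dimension as the
outer loop. This is a sketch, not your program: the function name
`record_major_writes` and the variable names and shapes are hypothetical, and
the actual read and write calls (for example the netCDF library's
start/count-based put routines) are left out so the example stays
self-contained.

```python
def record_major_writes(n_records, record_vars):
    """Yield (name, start, count) with the record dimension outermost.

    record_vars maps a variable name to the shape of its non-record
    dimensions; the record dimension is assumed to be the first dimension.
    """
    for rec in range(n_records):                 # outer loop: records
        for name, inner_shape in record_vars.items():  # inner loop: variables
            start = (rec,) + (0,) * len(inner_shape)   # begin at this record
            count = (1,) + tuple(inner_shape)          # exactly one record
            yield name, start, count


# Hypothetical record variables, each with a 3 x 4 grid per record.
record_vars = {"T2": (3, 4), "PSFC": (3, 4)}

plan = list(record_major_writes(2, record_vars))
for name, start, count in plan:
    print(name, start, count)
```

Each tuple in `plan` corresponds to one write of a single record's worth of
data for one variable, and all variables for record 0 come before any
variable for record 1, which is the order that keeps the writes sequential in
the output file.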

I think rewriting your current WRF processing program to reorder the loops and
data writing wouldn't take too long, but you might be able to adapt an NCO
operator such as ncrcat in even less time:



> On 05/11/2012 10:38 AM, Unidata netCDF Support wrote:
> > Hi Jesse,
> >
> >> I have an HPC cluster using lustre as our backend file systems. The
> >> cluster serves primarily weather models, such as the WRF and GFS.
> >>
> >> One thing we observed is that netcdf writes can often be very slow on
> >> lustre.  Do you have any recommended tuning procedures for netcdf on 
> >> lustre?
> >
> > No, sorry, we don't currently test on lustre.  However, if you have 
> > configured
> > lustre with a large disk block size and are writing netCDF files with lots 
> > of
> > records and lots of record variables (variables that use an unlimited 
> > dimension)
> > then you could be seeing a problem with writing such data a variable at a 
> > time
> > instead of a record at a time:
> >
> >    https://www.unidata.ucar.edu/jira/browse/NCF-142
> >
> > You haven't said what version of the library you're using, but the fix 
> > above is
> > in the nccopy utility in version 4.2, and in some of the utilities in the 
> > most
> > recent release of NCO (the NetCDF Operators software from UC Irvine).
> >
> > Also, are you using parallel I/O?  Use of parallel-netcdf may be a solution 
> > worth
> > looking at if you're writing classic-format files, or the HDF5-based 
> > parallel I/O
> > in netCDF-4 otherwise.
> >
> > If you have a small example that demonstrates the bad performance, we could 
> > try to
> > reproduce it and diagnose the problem.
> >
> > --Russ
> >
> > Russ Rew                                         UCAR Unidata Program
> > address@hidden                      http://www.unidata.ucar.edu
> >
> >
> >
> > Ticket Details
> > ===================
> > Ticket ID: TCU-710461
> > Department: Support netCDF
> > Priority: Normal
> > Status: Closed
> >
Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu

Ticket Details
Ticket ID: TCU-710461
Department: Support netCDF
Priority: Normal
Status: Closed
