
[Support #CBX-420369]: Ticket ID: FFI-171830: Slow performance problem



Hi Si,

> I sent a request to unidata 3 weeks ago
> with a slow performance problem on CISL's glade file system.
> 
> I want to know if anyone is working on that problem or has any suggestions 
> or hints.
> 
> The ticket info is attached at the bottom.
> 
> Best wishes,
> Sincerely Si Liu
> CISL Consulting

Yes, sorry to have taken so long to respond.  I was out of town last
week and working on some other priorities before that.  We're
shorthanded until we can hire a new developer, and currently have over
70 open support tickets.

I haven't made much progress, except to eliminate some approaches I
thought might lead somewhere.

I got a copy of a 3.3 GB 64-bit offset netCDF test file with which
Gary Strand had demonstrated the performance problem.

Trying nccopy on it with the latest 4.2 snapshot didn't reveal
anything unusual.  It could copy and convert to any of the 4 format
variants on a Fedora system in less than 3 minutes.

  $ /usr/bin/time nccopy b.nc b2.nc
  11.62user 6.44system 2:37.95elapsed 11%CPU (0avgtext+0avgdata 6852maxresident)k
  6464912inputs+6482112outputs (0major+1776minor)pagefaults 0swaps

Then I built a special version of nccopy for simulating a file
system with large file block sizes: a new command-line option
supplies the bufrsizehint argument to the nc__open() and
nc__create() calls made when copying through the netCDF API.

These are the so-called "double underbar" versions of the nc_open()
and nc_create() calls, which take some extra tuning parameters,
including an argument that specifies a file block size.

Use of this simulated large block size technique was adequate to
demonstrate and fix another problem with large file block systems last
year.

Here are the results of timing simulated block sizes from 8 KB to 4 MB
using the undocumented "-b" option I implemented for this test:

  $ clear_cache && /usr/bin/time nccopy -b 8K b.nc b2.nc
  4.64user 7.31system 2:54.42elapsed 6%CPU (0avgtext+0avgdata 7400maxresident)k
  6475960inputs+6621448outputs (37major+1876minor)pagefaults 0swaps
  $ clear_cache && /usr/bin/time nccopy -b 32K b.nc b2.nc
  4.77user 7.12system 2:35.54elapsed 7%CPU (0avgtext+0avgdata 7500maxresident)k
  6475768inputs+6754760outputs (37major+1901minor)pagefaults 0swaps
  $ clear_cache && /usr/bin/time nccopy -b 125K b.nc b2.nc
  4.64user 7.33system 2:28.22elapsed 8%CPU (0avgtext+0avgdata 7756maxresident)k
  6475904inputs+6893280outputs (37major+1992minor)pagefaults 0swaps
  $ clear_cache && /usr/bin/time nccopy -b 500K b.nc b2.nc
  4.16user 10.23system 2:16.89elapsed 10%CPU (0avgtext+0avgdata 9076maxresident)k
  6475944inputs+7741808outputs (37major+2359minor)pagefaults 0swaps
  $ clear_cache && /usr/bin/time nccopy -b 2M b.nc b2.nc
  4.40user 25.77system 2:25.36elapsed 20%CPU (0avgtext+0avgdata 14884maxresident)k
  6475648inputs+11853232outputs (37major+3822minor)pagefaults 0swaps
  $ clear_cache && /usr/bin/time nccopy -b 4M b.nc b2.nc
  4.63user 47.06system 4:23.00elapsed 19%CPU (0avgtext+0avgdata 22804maxresident)k
  6475912inputs+25200224outputs (37major+5776minor)pagefaults 0swaps
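For reference, a sweep like the one above can be scripted.  This is
just a sketch: it assumes the experimental nccopy with the "-b" option
and the clear_cache script are on the PATH, and it falls back to
printing the commands when they are not installed:

```shell
#!/bin/sh
# Sweep the simulated file system block size from 8 KB to 4 MB,
# clearing the disk caches before each run so the timings are
# comparable.  Prints the commands instead if nccopy is missing.
for bs in 8K 32K 125K 500K 2M 4M; do
    if command -v clear_cache >/dev/null 2>&1; then
        clear_cache
    fi
    if command -v nccopy >/dev/null 2>&1; then
        /usr/bin/time nccopy -b "$bs" b.nc b2.nc
    else
        echo "would run: nccopy -b $bs b.nc b2.nc"
    fi
done
```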

The "clear_cache" command is a shell script that clears disk caches to
make timing more accurate:

  #!/bin/bash -x
  # Clear the disk caches.
  sync
  sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"

But as you can see, this didn't reproduce the huge performance
differences you were seeing on the Glade file system.
There's definitely an increase in time with larger file system blocks,
but that is expected when the block size gets larger than some of the
reads or writes that the netCDF library uses to access the input file
or write the output.
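To make that effect concrete with a toy example (dd here, not
netCDF): writing the same megabyte with a 4 KB transfer size takes
256 separate write() calls, while a 1 MB transfer size takes one.
When the file system block is larger than the transfer size, each
small write can force a read-modify-write of a whole block, which is
where the extra time goes:

```shell
# Same 1 MB of data, different write granularities.
dd if=/dev/zero of=small_writes.dat bs=4096 count=256 2>/dev/null
dd if=/dev/zero of=one_write.dat bs=1048576 count=1 2>/dev/null
# Both files are identical in size; only the number of write()
# calls (and hence the per-block overhead) differs.
ls -l small_writes.dat one_write.dat
```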

I can make the source for this experimental version of nccopy with the
"-b" option available, but I'm not sure it would be of any use in
diagnosing this problem.  

To make more progress, it would be useful to:

  - know if the nccopy program shows the same problems on the Glade
    file system that you were seeing with the NCO programs
  - if so, get a trace of the actual low-level read(), write(), and
    lseek() calls made during such a copy, for comparison with what we
    see on a desktop Linux system.
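For the second item, something like the following would capture the
calls of interest (the strace flags are standard Linux options; the
log file name is just a placeholder, and the sketch only prints the
command when strace or nccopy isn't installed):

```shell
#!/bin/sh
# Trace the low-level I/O calls nccopy makes, with timestamps,
# writing the trace to a log file for comparison across systems.
trace_cmd='strace -f -tt -e trace=read,write,lseek -o nccopy_glade.log nccopy b.nc b2.nc'
if command -v strace >/dev/null 2>&1 && command -v nccopy >/dev/null 2>&1; then
    eval "$trace_cmd"
else
    echo "would run: $trace_cmd"
fi
```

Alternatively, "strace -c" prints a per-call count and time summary,
which may be enough by itself to spot pathological I/O sizes.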

I'll have only limited time to work on this during the next two weeks,
as I have to fly to a workshop next week to present 3 talks that I
haven't finished yet, and I'm also involved in a journal paper review
with a deadline early in February.  I wish I could be of more help,
but I know very little about Lustre and similar large file block
systems ...

I'm CC:ing a few other people involved with this problem, in case
anyone else has progress or ideas to report.

--Russ

Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu



Ticket Details
===================
Ticket ID: CBX-420369
Department: Support netCDF
Priority: High
Status: Closed