Re: some performance benchmarks for netcdf-4

Hi Ed,

> Quincey Koziol <koziol@xxxxxxxxxxxxx> writes:
> 
> > Hi Ed,
> >     Interesting...  I'm curious about why HDF5 is slower for both tests
> > below.  Can you send me your benchmarks?  I'll take a look at them and see
> > if there are any ways to speed them up...
> 
> I will be posting my first netcdf-4 release package (hopefully!)
> sometime today. Then you can run these on your machine.
    Cool. :-)

> HDF5 is slower in tests 1 and 5 because I am writing big-endian (BE)
> data on a little-endian (LE) machine for those. (This is to try to get a
> head-to-head comparison with netcdf-3, which always uses BE.)
    Well, I was thinking that HDF5 should be at least as fast as netCDF-3,
for the apples-to-apples comparison you made.

> For the netcdf-4/HDF5 file I'm using NATIVE types, which, predictably,
> yield faster performance. I'm sure if I switched tests 1 and 5 to
> NATIVE formats, we would see a similar gain in performance.
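
To make the BE-versus-native point concrete, here is a minimal sketch of the byte-order conversion involved. It uses only Python's standard struct module, not HDF5 or netCDF, and the sample values and the swap32 helper are made up for illustration. On a LE host, producing BE output means reversing the bytes of every element, which is the extra per-element work a native-format write avoids.

```python
import struct
import sys

# Hypothetical sample data: three 32-bit integers.
values = [1, 256, 65536]

# Explicit big-endian (BE) and little-endian (LE) encodings.
be_bytes = struct.pack(">3i", *values)
le_bytes = struct.pack("<3i", *values)

# "=" means the host's native byte order with standard sizes.
native_bytes = struct.pack("=3i", *values)


def swap32(buf):
    """Reverse the bytes of each 4-byte element in buf."""
    return b"".join(buf[i:i + 4][::-1] for i in range(0, len(buf), 4))


print("host byte order:", sys.byteorder)
print("native equals LE:", native_bytes == le_bytes)
print("BE equals byte-swapped LE:", swap32(le_bytes) == be_bytes)
```

On a little-endian host the native encoding is already the LE one, so no swap is needed when the file type matches the memory type.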
> 
> I have to iron out a bunch of release issues (which will take
> several weeks, as we've just decided on a major restructuring of the
> codebase to combine the netcdf-3 and netcdf-4 cvs repositories). Once
> that is complete, I'll take another iteration through this timing
> program to add some more tests and learn some more things.
> 
> (Any additional timing tests you might like to propose would be
> welcome. I'm at least going to add some floating point stuff. The
> times below also don't represent the different ways we're going to
> allow the user to specify chunking algorithms.)
> 
> The goal at this time was just to ensure that netcdf-4 was not such a
> dog that it would never work for anyone. At least we know that it is
> in the same ballpark as netcdf-3.
    ;-)

    Quincey

> 
> Ed
> 
> > 
> >         Quincey
> > 
> > > Howdy all!
> > > 
> > > Russ suggested to me yesterday that I post these timing results. In
> > > the tests below I write, and then read, 4 files. The first is in HDF5,
> > > with no netCDF stuff at all. (Writing BE on a LE system, because
> > > that's what netcdf does). The second file is pure netcdf-3, with no
> > > netcdf-4 code involved. The third uses the netcdf-4 library to write a
> > > file in netcdf classic format. Finally, the last file is created with
> > > netcdf-4, with HDF5 as a storage layer. This is faster because it is
> > > writing LE on a LE system (i.e. using HDF "native" format).
> > > 
> > > The CPU time is the total time spent by the CPU on user/library
> > > code, and the "User Time" is wall clock (elapsed) time.
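
For readers unsure of the difference between the two figures, the following is a small sketch of measuring both around one piece of work. It is not the timing program discussed in this thread; the timed and mostly_waiting names are hypothetical, and the sleep merely stands in for time spent waiting on something other than the CPU, such as disk I/O. Note that Python's time.process_time() counts both user and system CPU time for the process.

```python
import time


def timed(fn):
    """Run fn once and return (cpu_seconds, wall_seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    fn()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu, wall


def mostly_waiting():
    # Stand-in for I/O wait: the process is idle, so little CPU time accrues.
    time.sleep(0.2)


cpu, wall = timed(mostly_waiting)
print(f"CPU time = {cpu:.2f} secs, wall clock time = {wall:.2f} secs")
```

A wall clock time well above the CPU time, as in the results quoted here, is what one would expect when a run spends part of its time waiting rather than computing.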
> > > 
> > > Russ would like to include some of these results in an AMS paper, but
> > > I think we all need to give a thought to what we are measuring first.
> > > 
> > > About to write pure HDF5 file, one dataset, record by record: x 2000 y 300 z 500...
> > > avg CPU time =   15.17 secs.
> > > avg User Time = 43 secs.
> > > 
> > > About to write pure netcdf-3 file, record by record...
> > > avg CPU time =   17.15 secs.
> > > avg User Time = 45 secs.
> > > 
> > > About to write netcdf-3 file thru netcdf-4, record by record...
> > > avg CPU time =   17.26 secs.
> > > avg User Time = 45 secs.
> > > 
> > > About to write netcdf-4 (i.e. HDF5) file, record by record...
> > > avg CPU time =   15.87 secs.
> > > avg User Time = 36 secs.
> > > 
> > > About to read pure HDF5 file...
> > > avg CPU time =   13.70 secs.
> > > avg User Time = 34 secs.
> > > 
> > > About to read pure netcdf-3 file...
> > > avg CPU time =    9.18 secs.
> > > avg User Time = 29 secs.
> > > 
> > > About to read netcdf-3 file, created with netcdf-4...
> > > avg CPU time =   11.64 secs.
> > > avg User Time = 20 secs.
> > > 
> > > About to read netcdf-4 (i.e. HDF5) file...
> > > avg CPU time =   12.49 secs.
> > > avg User Time = 20 secs.
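
As an aid to reading the eight results side by side, this sketch expresses each one relative to the pure netcdf-3 run. The numbers are copied from the output above; the dictionary labels and the relative helper are illustrative names, not part of the benchmark.

```python
# (CPU secs, wall clock secs) copied from the quoted output.
write = {
    "pure HDF5 (BE)": (15.17, 43),
    "pure netcdf-3": (17.15, 45),
    "netcdf-3 via netcdf-4": (17.26, 45),
    "netcdf-4/HDF5 (native)": (15.87, 36),
}
read = {
    "pure HDF5 (BE)": (13.70, 34),
    "pure netcdf-3": (9.18, 29),
    "netcdf-3 via netcdf-4": (11.64, 20),
    "netcdf-4/HDF5 (native)": (12.49, 20),
}


def relative(results, baseline="pure netcdf-3"):
    """Return (cpu_ratio, wall_ratio) per test, relative to the baseline."""
    base_cpu, base_wall = results[baseline]
    return {
        name: (round(cpu / base_cpu, 2), round(wall / base_wall, 2))
        for name, (cpu, wall) in results.items()
    }


for label, results in (("write", write), ("read", read)):
    for name, (cpu_ratio, wall_ratio) in relative(results).items():
        print(f"{label:5s} {name:24s} CPU x{cpu_ratio:.2f}  wall x{wall_ratio:.2f}")
```

A ratio below 1.00 means the test took less time than pure netcdf-3 for the same operation.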
> > > 
>