
Re: HDF speed on netCDF API



> Organization: NOAA/PMEL
> Keywords: 199403182111.AA02131

Hi Steve,

> This is probably old news ... just FYI in case it's not.
> 
> We are porting FERRET to Solaris and decided it would be more convenient
> to use netCDF as embedded in HDF since we had to install HDF anyway.
> 
> When we ran our benchmark it turned out that HDF has modified netCDF such
> that it is MANY TIMES SLOWER doing write operations.  We didn't test
> extensively.  (We just installed Unidata netCDF and relinked - much better
> now!)

Interesting, but a performance comparison between the HDF-encoded and
XDR-encoded versions of the netCDF API turns out to be more complicated
than that.  I think Chris Houck did a fairly reasonable job of tuning the
HDF implementation, and for some operations his code is faster.  I ran the
"nctime" benchmark, available at
ftp://ftp.unidata.ucar.edu/pub/netcdf/nctime.c, on HDF3.3r2 and netCDF 2.3.2
on a SPARCstation 10 running Solaris 5.3.  I've appended the results.  The
benchmark writes and reads various hyperslabs of a 10x20x30x40 variable of
each type: byte, char, short, long, float, and double.
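
As a rough illustration of which hyperslabs are timed, the sketch below
enumerates every edge-count vector in which each dimension is either 1 or
its full length.  It is not taken from nctime.c; the function name
hyperslab_shapes is hypothetical, and the ordering is chosen only to match
the order of the rows in the appended results.

```python
from itertools import combinations
from math import prod

DIMS = (10, 20, 30, 40)  # shape of the benchmark variable in the results


def hyperslab_shapes(dims):
    """Return all shapes where each dimension is either 1 or its full length.

    Shapes are ordered by how many dimensions are taken in full, which
    matches the row order of the appended table (an assumption about
    presentation, not about how nctime.c is written).
    """
    shapes = []
    for k in range(len(dims) + 1):
        for full in combinations(range(len(dims)), k):
            shapes.append(tuple(d if i in full else 1
                                for i, d in enumerate(dims)))
    return shapes


if __name__ == "__main__":
    for shape in hyperslab_shapes(DIMS):
        print("x".join(map(str, shape)), "->", prod(shape), "values")
```

For a four-dimensional variable this yields 16 shapes, from 1x1x1x1 (one
value) up to 10x20x30x40 (240000 values).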

The very first write takes a long time (at least for our implementation)
because it writes fill values for all the variables.  I've included the HDF
times, the netCDF times, and the ratio of the two (HDF time divided by
netCDF time), so any number > 1.0 in the ratio column indicates netCDF was
faster by that factor.  The results show that our XDR-based netCDF
implementation is anywhere from 50 times faster to 20 times slower,
depending on the operation.  These benchmark times vary somewhat from run
to run, so they aren't terribly accurate, but they show places where both
implementations could probably be tuned for better performance.  These
results may be highly platform dependent and may vary widely with other
sizes of variables; I haven't tried "nctime" on anything else.  Also, I
think there may be a more recent version of HDF out by now.
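
To make the ratio column concrete, here is a minimal sketch of how it can
be read.  The helper names ratio and describe are hypothetical and are not
part of nctime; the two sample rows are the extremes for byte_var in the
results below.

```python
def ratio(hdf_msec, netcdf_msec):
    """HDF time divided by netCDF time, rounded as in the table."""
    return round(hdf_msec / netcdf_msec, 2)


def describe(hdf_msec, netcdf_msec):
    """Say which implementation was faster, and by what factor."""
    r = hdf_msec / netcdf_msec
    if r >= 1.0:
        return f"netCDF {r:.1f}x faster"
    return f"HDF {1.0 / r:.1f}x faster"


if __name__ == "__main__":
    # byte_var, ncvarget 1x1x30x1: the largest ratio in the table
    print(ratio(16.667, 0.331), describe(16.667, 0.331))
    # byte_var, ncvarput 10x20x30x40: the first write, including fill values
    print(ratio(56.667, 1230.000), describe(56.667, 1230.000))
```

A ratio of 0.05 therefore means HDF was roughly 20 times faster for that
operation, and a ratio of 50.35 means netCDF was roughly 50 times faster.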

--Russ

                                    HDF           netCDF       HDF/netCDF
                                    time           time        time ratio

----- byte_var(10,20,30,40)                                            
time for ncvarput 10x20x30x40     56.667 msec   1230.000 msec      0.05
time for ncvarget 1x1x1x1          0.661 msec      0.032 msec     20.66
time for ncvarget 10x1x1x1         5.882 msec      3.939 msec      1.49
time for ncvarget 1x20x1x1        13.333 msec      1.538 msec      8.67
time for ncvarget 1x1x30x1        16.667 msec      0.331 msec     50.35
time for ncvarget 1x1x1x40         0.775 msec      0.032 msec     24.22
time for ncvarget 10x20x1x1       83.333 msec     15.556 msec      5.36
time for ncvarget 10x1x30x1      123.333 msec      6.471 msec     19.06
time for ncvarget 10x1x1x40        5.882 msec      3.939 msec      1.49
time for ncvarget 1x20x30x1      250.000 msec      7.059 msec     35.42
time for ncvarget 1x20x1x40       12.222 msec      1.473 msec      8.30
time for ncvarget 1x1x30x40        0.853 msec      0.049 msec     17.41
time for ncvarget 10x20x30x1    2443.333 msec     56.667 msec     43.12
time for ncvarget 10x20x1x40      86.667 msec     15.556 msec      5.57
time for ncvarget 10x1x30x40       7.059 msec      4.545 msec      1.55
time for ncvarget 1x20x30x40       3.333 msec      1.846 msec      1.81
time for ncvarget 10x20x30x40     26.000 msec     20.000 msec      1.30
                                                                       
----- char_var(10,20,30,40)                                            
time for ncvarput 10x20x30x40     53.333 msec     40.000 msec      1.33
time for ncvarget 1x1x1x1          0.389 msec      0.032 msec     12.16
time for ncvarget 10x1x1x1         4.848 msec      3.939 msec      1.23
time for ncvarget 1x20x1x1        11.176 msec      1.846 msec      6.05
time for ncvarget 1x1x30x1        15.556 msec      1.085 msec     14.34
time for ncvarget 1x1x1x40         0.661 msec      0.029 msec     22.79
time for ncvarget 10x20x1x1       66.667 msec     16.667 msec      4.00
time for ncvarget 10x1x30x1       90.000 msec      7.647 msec     11.77
time for ncvarget 10x1x1x40        5.152 msec      3.636 msec      1.42
time for ncvarget 1x20x30x1      226.667 msec      8.235 msec     27.52
time for ncvarget 1x20x1x40       11.111 msec      1.846 msec      6.02
time for ncvarget 1x1x30x40        0.775 msec      0.930 msec      0.83
time for ncvarget 10x20x30x1    1793.333 msec     50.000 msec     35.87
time for ncvarget 10x20x1x40      63.333 msec     17.778 msec      3.56
time for ncvarget 10x1x30x40       5.455 msec      5.152 msec      1.06
time for ncvarget 1x20x30x40       3.333 msec      2.154 msec      1.55
time for ncvarget 10x20x30x40     24.000 msec     20.000 msec      1.20
                                                                       
----- short_var(10,20,30,40)                                           
time for ncvarput 10x20x30x40    120.000 msec    110.000 msec      1.09
time for ncvarget 1x1x1x1          0.545 msec      0.032 msec     17.03
time for ncvarget 10x1x1x1         5.455 msec      3.939 msec      1.38
time for ncvarget 1x20x1x1         8.824 msec      3.030 msec      2.91
time for ncvarget 1x1x30x1        11.111 msec      1.085 msec     10.24
time for ncvarget 1x1x1x40         0.545 msec      0.044 msec     12.39
time for ncvarget 10x20x1x1       60.000 msec     26.000 msec      2.31
time for ncvarget 10x1x30x1       93.333 msec      7.647 msec     12.21
time for ncvarget 10x1x1x40        5.152 msec      4.242 msec      1.21
time for ncvarget 1x20x30x1      176.667 msec      8.824 msec     20.02
time for ncvarget 1x20x1x40        8.824 msec      3.030 msec      2.91
time for ncvarget 1x1x30x40        0.775 msec      1.163 msec      0.67
time for ncvarget 10x20x30x1    1756.667 msec     63.333 msec     27.74
time for ncvarget 10x20x1x40      63.333 msec     28.000 msec      2.26
time for ncvarget 10x1x30x40       7.647 msec      8.235 msec      0.93
time for ncvarget 1x20x30x40       5.455 msec     10.000 msec      0.55
time for ncvarget 10x20x30x40     40.000 msec     73.333 msec      0.55
                                                                       
----- long_var(10,20,30,40)                                            
time for ncvarput 10x20x30x40    203.333 msec    530.000 msec      0.38
time for ncvarget 1x1x1x1          0.775 msec      0.029 msec     26.72
time for ncvarget 10x1x1x1         5.882 msec      3.939 msec      1.49
time for ncvarget 1x20x1x1        11.176 msec      4.848 msec      2.31
time for ncvarget 1x1x30x1        12.222 msec      1.085 msec     11.26
time for ncvarget 1x1x1x40         0.545 msec      0.107 msec      5.09
time for ncvarget 10x20x1x1       93.333 msec     40.000 msec      2.33
time for ncvarget 10x1x30x1      126.667 msec      8.824 msec     14.35
time for ncvarget 10x1x1x40        6.471 msec      4.848 msec      1.33
time for ncvarget 1x20x30x1      186.667 msec     10.000 msec     18.67
time for ncvarget 1x20x1x40       10.000 msec      5.882 msec      1.70
time for ncvarget 1x1x30x40        0.930 msec      3.333 msec      0.28
time for ncvarget 10x20x30x1    2403.333 msec     80.000 msec     30.04
time for ncvarget 10x20x1x40      90.000 msec     50.000 msec      1.80
time for ncvarget 10x1x30x40      10.000 msec     24.000 msec      0.42
time for ncvarget 1x20x30x40      10.588 msec     40.000 msec      0.26
time for ncvarget 10x20x30x40    100.000 msec    356.667 msec      0.28
                                                                       
----- float_var(10,20,30,40)                                           
time for ncvarput 10x20x30x40    226.667 msec    536.667 msec      0.42
time for ncvarget 1x1x1x1          0.661 msec      0.029 msec     22.79
time for ncvarget 10x1x1x1         7.059 msec      3.939 msec      1.79
time for ncvarget 1x20x1x1        11.111 msec      4.848 msec      2.29
time for ncvarget 1x1x30x1        17.778 msec      1.008 msec     17.64
time for ncvarget 1x1x1x40         0.700 msec      0.107 msec      6.54
time for ncvarget 10x20x1x1       86.667 msec     36.667 msec      2.36
time for ncvarget 10x1x30x1      123.333 msec      8.824 msec     13.98
time for ncvarget 10x1x1x40        5.882 msec      4.848 msec      1.21
time for ncvarget 1x20x30x1      253.333 msec     10.588 msec     23.93
time for ncvarget 1x20x1x40       11.111 msec      6.471 msec      1.72
time for ncvarget 1x1x30x40        1.163 msec      3.333 msec      0.35
time for ncvarget 10x20x30x1    2496.667 msec     76.667 msec     32.57
time for ncvarget 10x20x1x40      86.667 msec     50.000 msec      1.73
time for ncvarget 10x1x30x40      11.111 msec     24.000 msec      0.46
time for ncvarget 1x20x30x40      11.111 msec     36.667 msec      0.30
time for ncvarget 10x20x30x40     96.667 msec    356.667 msec      0.27
                                                                       
----- double_var(10,20,30,40)                                          
time for ncvarput 10x20x30x40    436.667 msec   1060.000 msec      0.41
time for ncvarget 1x1x1x1          0.545 msec      0.032 msec     17.03
time for ncvarget 10x1x1x1         5.758 msec      3.939 msec      1.46
time for ncvarget 1x20x1x1        13.333 msec      7.647 msec      1.74
time for ncvarget 1x1x30x1        16.667 msec      1.085 msec     15.36
time for ncvarget 1x1x1x40         0.775 msec      0.176 msec      4.40
time for ncvarget 10x20x1x1       83.333 msec     66.667 msec      1.25
time for ncvarget 10x1x30x1      103.333 msec     11.111 msec      9.30
time for ncvarget 10x1x1x40        6.061 msec      5.882 msec      1.03
time for ncvarget 1x20x30x1      260.000 msec     14.444 msec     18.00
time for ncvarget 1x20x1x40       13.333 msec     10.000 msec      1.33
time for ncvarget 1x1x30x40        1.692 msec      5.455 msec      0.31
time for ncvarget 10x20x30x1    1943.333 msec    130.000 msec     14.95
time for ncvarget 10x20x1x40      83.333 msec     86.667 msec      0.96
time for ncvarget 10x1x30x40      14.444 msec     36.667 msec      0.39
time for ncvarget 1x20x30x40      22.000 msec     70.000 msec      0.31
time for ncvarget 10x20x30x40    166.667 msec    686.667 msec      0.24