Unidata Developer's Blog


NetCDF-4 Horizontal Data Read Performance with Cache Clearing

03 January 2010

Here are my numbers for doing horizontal reads with different cache sizes.

Each time is the time to read one horizontal slice, measured while reading all of the slices in one run.

I realize that reading just one horizontal slice will give different (much higher) times. When I read the first horizontal level, the various caches along the way start filling up with the following levels, so by the time I read those levels I get very low times. Reading all the levels in one run therefore lets the caching work. Reading just one horizontal level and then stopping the program (to clear the cache) gives the worst-case scenario for the caching.
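To make the "read them all in one run" measurement concrete, here is a minimal sketch that times each consecutive block read from a plain binary file. It is not the netCDF test program: the functions `time_slice_reads` and `make_test_file` and the slice size are made up for illustration, and each fixed-size block only stands in for one horizontal level. Note that a file written moments ago is usually still in the OS cache, so every read here will likely be warm unless the caches are cleared first.

```python
import os
import tempfile
import time


def time_slice_reads(path, slice_bytes, n_slices):
    """Read n_slices consecutive blocks of slice_bytes and time each read.

    Returns a list of elapsed times in microseconds, one per slice.
    """
    times_us = []
    # buffering=0 turns off Python's own buffer, so only the OS caches remain.
    with open(path, "rb", buffering=0) as f:
        for _ in range(n_slices):
            start = time.perf_counter()
            data = f.read(slice_bytes)
            elapsed = time.perf_counter() - start
            if len(data) != slice_bytes:
                raise ValueError("file is shorter than expected")
            times_us.append(elapsed * 1e6)
    return times_us


def make_test_file(slice_bytes, n_slices):
    """Write a throwaway file of n_slices blocks and return its path."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        for i in range(n_slices):
            f.write(bytes([i % 256]) * slice_bytes)
    return path


if __name__ == "__main__":
    # 128 x 256 four-byte values per slice is a made-up size for illustration.
    slice_bytes = 128 * 256 * 4
    path = make_test_file(slice_bytes, n_slices=16)
    try:
        times = time_slice_reads(path, slice_bytes, 16)
        print("first slice: %.0f us" % times[0])
        print("mean of all: %.0f us" % (sum(times) / len(times)))
    finally:
        os.remove(path)
```

Comparing the first slice against the mean of all slices is one way to see how much the later reads benefit from whatever the first read pulled into cache.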

But what should I be optimizing for? Reading all horizontal levels? Or just reading one level?

cs[0]   cs[1]   cs[2]   cache(MB)       deflate shuffle read_hor(us)
0       0       0       0.0             0       0       1527
1       16      32      1.0             0       0       1577
1       16      128     1.0             0       0       1618
1       16      256     1.0             0       0       1515
1       64      32      1.0             0       0       1579
1       64      128     1.0             0       0       1586
1       64      256     1.0             0       0       1584
1       128     32      1.0             0       0       1593
1       128     128     1.0             0       0       1583
1       128     256     1.0             0       0       1571
10      16      32      1.0             0       0       2128
10      16      128     1.0             0       0       2520
10      16      256     1.0             0       0       4309
10      64      32      1.0             0       0       4083
10      64      128     1.0             0       0       1751
10      64      256     1.0             0       0       1713
10      128     32      1.0             0       0       1692
10      128     128     1.0             0       0       1862
10      128     256     1.0             0       0       1749
256     16      32      1.0             0       0       10594
256     16      128     1.0             0       0       3681
256     16      256     1.0             0       0       3074
256     64      32      1.0             0       0       3656
256     64      128     1.0             0       0       3042
256     64      256     1.0             0       0       2773
256     128     32      1.0             0       0       3828
256     128     128     1.0             0       0       2335
256     128     256     1.0             0       0       1581
1024    16      32      1.0             0       0       35622
1024    16      128     1.0             0       0       2759
1024    16      256     1.0             0       0       2912
1024    64      32      1.0             0       0       2875
1024    64      128     1.0             0       0       2868
1024    64      256     1.0             0       0       3816
1024    128     32      1.0             0       0       2780
1024    128     128     1.0             0       0       2558
1024    128     256     1.0             0       0       1628
1560    16      32      1.0             0       0       154450
1560    16      128     1.0             0       0       3063
1560    16      256     1.0             0       0       3700


Effects of Clearing the Cache on Benchmarks

02 January 2010

How to win friends and influence benchmarks...

I note that I have a shell script in my nc_test4 directory, clear_cache.sh. I have to use sudo to run it, but when I do, it has a dramatic effect on how long the time series read takes.

The following uses the new (not yet checked in) test program tst_ar4_3d.c, which seeks to set up a simpler proxy data file for the AR-4 tests. I want to show that a simpler file (but with the same-sized data variable) has similar performance to the slightly more dressed-up pr_A1 file from AR-4 that I got from Gary, because my simpler file is easier to create in a test program.

bash-3.2$ ./tst_ar4_3d -h
cs[0] cs[1] cs[2] cache(MB) deflate shuffle read_hor(us) read_time_ser(us)
64    256   128   4.0       0       0       1420         2281847
bash-3.2$ ./tst_ar4_3d -h
cs[0] cs[1] cs[2] cache(MB) deflate shuffle read_hor(us) read_time_ser(us)
64    256   128   4.0       0       0       81           3159
bash-3.2$ ./tst_ar4_3d -h
cs[0] cs[1] cs[2] cache(MB) deflate shuffle read_hor(us) read_time_ser(us)
64    256   128   4.0       0       0       76           2983
bash-3.2$ sudo bash clear_cache.sh
bash-3.2$ ./tst_ar4_3d -h
cs[0] cs[1] cs[2] cache(MB) deflate shuffle read_hor(us) read_time_ser(us)
64    256   128   4.0       0       0       1410         2504315

Wow, what a difference a cleared cache makes!
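To put a number on that difference, this short sketch divides the first run's times by the second run's. It assumes the first run was effectively a cold-cache run, which its similarity to the run after clear_cache.sh suggests. The function name `slowdown` is just for illustration.

```python
# Timings in microseconds, copied from the runs above.
cold = {"read_hor": 1420, "read_time_ser": 2281847}  # first run
warm = {"read_hor": 81, "read_time_ser": 3159}       # second run


def slowdown(cold_us, warm_us):
    """How many times slower the cold-cache run is than the warm one."""
    return cold_us / warm_us


for name in cold:
    print("%-14s cold/warm = %.1fx" % (name, slowdown(cold[name], warm[name])))
```

That works out to a bit over 17x for the horizontal read and a bit over 720x for the time series read.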

Here's the clear_cache.sh script:

#!/bin/bash -x
# Clear the disk caches.

sync
echo 3 > /proc/sys/vm/drop_caches
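If a benchmark driver is written in Python, the same two steps can be done in-process. This is a sketch only: the function name `clear_disk_caches` and its path parameter are made up here, it is Linux-specific, and writing to the real /proc file still needs root.

```python
import os


def clear_disk_caches(drop_caches_path="/proc/sys/vm/drop_caches"):
    """Flush dirty pages, then ask the kernel to drop its clean caches.

    Linux only; writing to the real path requires root.
    """
    os.sync()  # same job as the `sync` command in the script
    with open(drop_caches_path, "w") as f:
        # 3 = drop the page cache plus dentries and inodes
        f.write("3\n")


# Usage, as root, between benchmark runs:
#     clear_disk_caches()
```

The path is a parameter only so the function can be pointed at a scratch file when trying it out without root.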


More Cache Size Benchmarks

31 December 2009

Why does increasing cache size slow down time series access so much?

bash-3.2$ ./tst_ar4 -h pr_A1_256_128_128.nc
cs[0] cs[1] cs[2] cache(MB) deflate shuffle read_hor(us) read_time_ser(us)
256   128   128   0.5       0       0       217          2773
256   128   128   1.0       0       0       214          1935
256   128   128   4.0       0       0       214          1929
256   128   128   32.0      0       0       160          84440
256   128   128   128.0     0       0       129          82407
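One number that may be worth having next to these results is the size of a single chunk compared with each cache size. The sketch below assumes 4-byte values (the data type is not stated here) and takes 1 MB to mean 2**20 bytes, which is also an assumption about how the test program counts. The function name `chunk_mb` is made up. It only does the arithmetic and does not claim to explain the slowdown.

```python
def chunk_mb(chunk_shape, bytes_per_value=4):
    """Size of one chunk in MB (1 MB = 2**20 bytes)."""
    n_values = 1
    for dim in chunk_shape:
        n_values *= dim
    return n_values * bytes_per_value / 2**20


chunk = (256, 128, 128)  # cs[0], cs[1], cs[2] from the table
cache_sizes_mb = [0.5, 1.0, 4.0, 32.0, 128.0]

size = chunk_mb(chunk)  # 16.0 MB if values are 4 bytes each
for cache in cache_sizes_mb:
    fits = int(cache // size)  # whole chunks that fit in the cache
    print("cache %6.1f MB holds %d chunk(s) of %.1f MB" % (cache, fits, size))
```

Under those assumptions one chunk is 16 MB, so the 0.5, 1 and 4 MB caches cannot hold even one chunk, while the 32 and 128 MB caches can. That split lines up with where the time series times jump, though whether it is the cause is not established here.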


Does (Cache) Size Matter?

30 December 2009

Some cache size tests for netCDF-4 and AR-4 data.

Oddly, increasing the cache here seems to hurt:

./tst_ar4 -h pr_A1_256_128_128.nc
cs[0]  cs[1]  cs[2]  cache(MB) deflate  shuffle  read_hor(us)  read_time_ser(us)
256    128    128    4         0        0        218           1611
256    128    128    16        0        0        9352          34872
256    128    128    32        0        0        134           32464
256    128    128    64        0        0        133           32303
256    128    128    128       0        0        146           12202

The best read time for the time series comes with a 4 MB chunk cache. Why?
