
Re: 20010410: NetCDF: poor performance writing to an NFS volume



>To: "Russ Rew" <address@hidden>
>From: "Mark Hadfield" <address@hidden>
>Subject: NetCDF: poor performance writing to an NFS volume
>Organization: NIWA
>Keywords: 200104102239.f3AMddL27187

Mark,

Evidently I was wrong to assume that an equivalent ncgen test with fill
values would perform the same as one that writes actual data values.

> Perhaps you could help me interpret the following results...
> 
> Here are times running ncgen from the CDL file you sent me (I renamed it
> test2). The first test measures how long it takes ncgen to parse the file
> without any output.
> 
> TEST 1:
> 
> thor $ time ncgen test2.cdl
> 
> real    0m0.034s
> user    0m0.005s
> sys     0m0.003s
> 
> TEST 2:
> 
> thor $ time ncgen -o $HOME/tmp/test2.nc -b test2.cdl  # local
> 
> real    0m0.048s
> user    0m0.028s
> sys     0m0.019s
>
> TEST 3:
> 
> thor $ time ncgen -o $DMF_HOME/tmp/test2.nc -b test2.cdl  # NFS volume
> 
> real    0m2.446s
> user    0m0.026s
> sys     0m0.014s
> 
> TEST 2 and TEST 3 results are not too different from yours. (BTW $HOME/tmp
> is on a real disk, but I imagine the very fast writing involves caching.)

Your Compaq Alpha is about 10 times faster than my Solaris SPARC on
TEST 1 and TEST 2, but otherwise the times are comparable.

>
> Here are the corresponding times for the original CDL file.
> 
> TEST 4:
> 
> thor $ time ncgen test.cdl
> 
> real    0m1.627s
> user    0m1.276s
> sys     0m0.015s
> 
> TEST 5:
> 
> thor $ time ncgen -o $HOME/tmp/test.nc -b test.cdl  # local
> 
> real    0m2.213s
> user    0m1.317s
> sys     0m0.125s
> 
> TEST 6:
> 
> thor $ time ncgen -o $DMF_HOME/tmp/test.nc -b test.cdl  # NFS volume
> 
> real    0m27.896s
> user    0m1.310s
> sys     0m0.100s
> 
> The file sizes are:
> 
> thor $ ls -l *.cdl *.nc
> -rwxr--r--   1 hadfield dynmet    3938281 Apr 12 01:33 test.cdl
> -rw-r--r--   1 hadfield dynmet    1347576 Apr 12 01:44 test.nc
> -rwxr--r--   1 hadfield dynmet      12849 Apr 12 01:31 test2.cdl
> -rw-r--r--   1 hadfield dynmet    1347576 Apr 12 01:38 test2.nc
> 
> The output files, test.nc and test2.nc, are the same size, but test2.nc is
> largely filled with fill values whereas test.nc is filled with real data.
> 
> I presume that the difference between TEST 5 and TEST 2 results from the
> time needed to process the larger CDL file (cf. TEST 4 vs TEST 1).
> 
> Why do you think TEST 6 takes much longer than TEST 3? When ncgen writes
> fill values to a file does it actually write those values, in the same way
> it writes real data?

I don't know, and am surprised by your results.  I had assumed it
wouldn't matter whether fill values or real data values were being
written across NFS; but there may be some optimization I'm not aware
of when all the values in a large array are the same.  I'd like to
investigate this issue further.
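
One rough way to read the quoted timings is to subtract the CPU time
(user + sys) from the wall-clock time (real) for each test.  What
remains is time the process spent waiting, which for the NFS runs is
presumably dominated by the remote writes.  The sketch below does only
that arithmetic on the numbers you posted; the names TIMINGS and
wait_seconds are made up for illustration, and "wait" is only an
approximation of I/O time.

```python
# (real, user, sys) in seconds, copied from the `time` output quoted above.
TIMINGS = {
    "TEST 1": (0.034, 0.005, 0.003),   # test2.cdl, parse only
    "TEST 2": (0.048, 0.028, 0.019),   # test2.cdl, local disk
    "TEST 3": (2.446, 0.026, 0.014),   # test2.cdl, NFS volume
    "TEST 4": (1.627, 1.276, 0.015),   # test.cdl, parse only
    "TEST 5": (2.213, 1.317, 0.125),   # test.cdl, local disk
    "TEST 6": (27.896, 1.310, 0.100),  # test.cdl, NFS volume
}


def wait_seconds(test):
    """Wall-clock time not accounted for by CPU time (a rough I/O-wait proxy)."""
    real, user, sys_time = TIMINGS[test]
    return real - user - sys_time


for name in sorted(TIMINGS):
    print(f"{name}: {wait_seconds(name):7.3f} s not spent on CPU")

# How much longer the NFS write waits with real data than with fill values.
ratio = wait_seconds("TEST 6") / wait_seconds("TEST 3")
print(f"TEST 6 / TEST 3 wait ratio: {ratio:.1f}")
```

On these figures the unexplained wait in TEST 6 is roughly 11 times
that in TEST 3, even though both output files are 1347576 bytes, so the
extra CDL parsing seen in TEST 4 does not account for the difference.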

Could you make your netCDF file or CDL file available to me?
Putting it on an FTP or HTTP server
temporarily until I can copy it would be ideal.  Otherwise it may take
me a while to create something here that demonstrates the same
performance problem over NFS, especially since I don't know whether it
depends on having a large number of small records, some relationship
between the record sizes and buffer sizes, or something else ...
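
In the meantime, the small-writes hypothesis can be probed without
netCDF at all.  The following is a minimal sketch, not what ncgen or
the netCDF library actually does: it writes a fixed number of bytes to
a scratch file with plain write() calls of a chosen size, forces them
out with fsync, and reports the elapsed time.  The function name
time_writes, the chunk sizes, and the choice of zero bytes versus
random bytes as stand-ins for fill values and real data are all
assumptions for illustration.

```python
import os
import sys
import tempfile
import time


def time_writes(directory, total_bytes, chunk_size, constant=True):
    """Write total_bytes to a scratch file in `directory` using write()
    calls of at most chunk_size bytes.  Returns (elapsed_seconds, file_size)."""
    # Constant payload stands in for fill values, random bytes for real data.
    chunk = b"\x00" * chunk_size if constant else os.urandom(chunk_size)
    fd, path = tempfile.mkstemp(dir=directory, suffix=".bin")
    try:
        start = time.perf_counter()
        written = 0
        while written < total_bytes:
            n = min(chunk_size, total_bytes - written)
            # os.write may write fewer than n bytes; count what it reports.
            written += os.write(fd, chunk[:n])
        os.fsync(fd)  # include the time to push the data to the server
        elapsed = time.perf_counter() - start
        size = os.fstat(fd).st_size
    finally:
        os.close(fd)
        os.remove(path)
    return elapsed, size


if __name__ == "__main__":
    # Directory to test is the first argument; defaults to the system temp dir.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    for chunk_size in (512, 8192, 65536):
        for constant in (True, False):
            elapsed, _ = time_writes(target, 1347576, chunk_size, constant)
            kind = "constant" if constant else "random"
            print(f"{chunk_size:6d}-byte writes, {kind:8s}: {elapsed:.3f} s")
```

Running it once against a local directory and once against a directory
on the NFS volume would show whether write size or data content alone
reproduces a gap like the one between TEST 3 and TEST 6.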

--Russ