Re: [netcdf-java] 4.0 updates: C and java speed

Hi Bill,

One question I have: does it really matter if your Java Web application is 2 or 3x slower than your C application? You mentioned that your current application takes significantly less than one second to produce a plot; even if your Java Web app takes, say, one second to produce a plot, that would still allow for roughly 86,000 plot requests per day when served one at a time. And you could easily increase this capacity by implementing a caching scheme.

If you do need to speed up your application, I found when I profiled the Java netCDF library a couple of years ago that it can be more expensive to open a netCDF file than to read small amounts of data (like 2D slices from a 4D variable) from the file. So one strategy (at least in a Web application environment) is to keep the file open so repeated reads of the file don't incur the overhead of reopening it. There are some issues with this -- the library isn't thread safe, so you don't want to share the file object across threads, and you might run into a problem with too many open files if you have a lot of files -- but there are strategies to work around this.
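
As a rough illustration of the keep-it-open idea, here is a minimal sketch of a per-thread cache of open handles with a cap on how many stay open. Everything in it is an assumption for illustration: OpenFileCache is a hypothetical helper, not part of the netCDF library, and the opener function stands in for whatever call actually opens your file. Each thread gets its own map, so a handle is never shared across threads, and the least recently used handle is closed when the cap is exceeded.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper: a per-thread LRU cache of open file handles. */
final class OpenFileCache<T extends AutoCloseable> {
    private final int maxOpenPerThread;
    private final Function<String, T> opener;
    // One map per thread, so handles are never shared between threads.
    private final ThreadLocal<Map<String, T>> perThread;

    OpenFileCache(int maxOpenPerThread, Function<String, T> opener) {
        if (maxOpenPerThread < 1) {
            throw new IllegalArgumentException("maxOpenPerThread must be >= 1");
        }
        this.maxOpenPerThread = maxOpenPerThread;
        this.opener = opener;
        this.perThread = ThreadLocal.withInitial(this::newLruMap);
    }

    private Map<String, T> newLruMap() {
        // accessOrder = true keeps entries ordered least recently used first.
        return new LinkedHashMap<String, T>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, T> eldest) {
                if (size() <= maxOpenPerThread) {
                    return false;
                }
                closeQuietly(eldest.getValue()); // close before dropping it
                return true;
            }
        };
    }

    /** Returns this thread's open handle for path, opening it on first use. */
    T get(String path) {
        Map<String, T> open = perThread.get();
        T handle = open.get(path); // a hit also marks the entry as recently used
        if (handle == null) {
            handle = opener.apply(path);
            open.put(path, handle); // may evict and close the eldest handle
        }
        return handle;
    }

    /** Closes every handle the calling thread still holds. */
    void closeAllForThisThread() {
        Map<String, T> open = perThread.get();
        for (T handle : open.values()) {
            closeQuietly(handle);
        }
        open.clear();
    }

    private static void closeQuietly(AutoCloseable handle) {
        try {
            handle.close();
        } catch (Exception e) {
            // A real application would log this; the sketch ignores it.
        }
    }
}
```

A request handler would call get(path) instead of opening the file itself, and each worker thread would call closeAllForThisThread() before it exits. With T threads and a cap of N, at most T x N files are open at once, which is the number to keep under the operating system's open-file limit.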

- Joe

Bill Moninger wrote:
Hello John, Jon, and Bob,

Thanks for your useful questions and comments.

I was testing the timing from the command line, and I agree that java startup time might have been a big issue.

So I took the lead from the modified program that John sent back, which did a loop of opening the netcdf file, pulling a hyperslab (a hyperline really) out of the file, then closing it.

I amended both the java and C programs (attached as a tar file) to take the number of times through the loop as the sole argument and got the following results when reading the netcdf file available at
http://ruc.noaa.gov/ruc_native_40.nc (53 M in size):

%> sounding.x 10000
C: elapsed time for 10000 reads is 16.630000 seconds
(varied between 13.9 and 16.6 secs)

%> java -server Tester2 10000
java: elapsed time for 10000 reads is 44.466998 seconds
(varied between 20.1 sec and 44.5 sec)

So, it looks like something other than the startup cost is causing java to be slower than C by about 1.2 to 2.5x. But the java times appear to be a lot more variable than the C times.

Perhaps I am using the libraries non-optimally; if so, I'll be very grateful for any suggestions.

-Bill

On 5/6/2009 4:21 PM, John Caron wrote:
Hi Bill:

I made a few mods to your program (attached)

1) removed the print statements, which are notoriously slow.
2) did the whole open/read/close loop 100 times
3) added timing, and got:

that took 1248.659775 millisecs

which is about 13 msecs per call. When I get a chance I will try to compare to the C code.

None of this is all that definitive; it's very hard to get accurate timings on small programs. For one thing, the java compiler runs at runtime, and it's somewhat nondeterministic, so running a program once will very likely look very bad. If you are doing a CGI type server, where the java application starts up for each request, that will be very slow.
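
One common way to reduce the effect of runtime compilation on a small timing test is to run the workload a number of times untimed before starting the clock. The following is only a sketch under that assumption: WarmTimer and its parameters are hypothetical names, and the workload in main is a stand-in for the real open/read/close loop.

```java
/** Hypothetical timing helper: untimed warm-up runs, then a timed loop. */
final class WarmTimer {
    // Written by the demo workload so the JIT cannot discard its result.
    static volatile long sink;

    /** Returns the mean milliseconds per call over measuredRuns calls. */
    static double meanMillisPerCall(Runnable task, int warmupRuns, int measuredRuns) {
        if (measuredRuns < 1) {
            throw new IllegalArgumentException("measuredRuns must be >= 1");
        }
        // Warm-up: give the runtime compiler a chance to optimize the hot path.
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            task.run();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1e6 / measuredRuns;
    }

    public static void main(String[] args) {
        // Stand-in workload; replace with the real open/read/close code.
        Runnable work = () -> {
            long sum = 0;
            for (int i = 0; i < 100_000; i++) {
                sum += i;
            }
            sink = sum;
        };
        double ms = meanMillisPerCall(work, 1_000, 10_000);
        System.out.println("mean per call: " + ms + " millisecs");
    }
}
```

Repeating the whole measurement several times and reporting the spread, not just one number, gives a better picture of how much the timings vary from run to run.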

I can pretty much promise you that java performance is within a factor of 2 of C code, and more likely within 20% of C code, in a long-running server environment. There are certain things it can do faster, like memory allocation and multithreading.

Anyway, I could look at your actual production code to see if there are some ways to help speed it up. It is possible that for various reasons, Java will be "several times slower" than C code, so you'll have to decide if the increase in productivity is worth it.

Bill Moninger wrote:

