Re: [netcdf-java] 4.0 updates: C and java speed

Hi all,

Joe has a good point here, and nj4 implements a file-handle caching
system that I've used very successfully (see
NetcdfDataset.initNetcdfFileCache(); there might be other ways to
set it up too).
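
To illustrate the general idea of file-handle caching (this is not the
nj4 implementation, whose details are not shown in this thread), here is
a minimal sketch using only the standard library. The class name
HandleCache, its methods, and the opener function are all hypothetical;
in real use the opener would call whatever routine opens a netCDF file.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not the nj4 cache: keeps at most maxOpen handles open
// and closes the least recently used one when that limit is exceeded.
// Not thread-safe: intended to be owned by a single thread.
final class HandleCache<H extends AutoCloseable> {
    private final Function<String, H> opener;
    private final LinkedHashMap<String, H> open;

    HandleCache(final int maxOpen, Function<String, H> opener) {
        this.opener = opener;
        // accessOrder = true keeps entries ordered least recently used first.
        this.open = new LinkedHashMap<String, H>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, H> eldest) {
                if (size() <= maxOpen) {
                    return false;
                }
                closeQuietly(eldest.getValue());
                return true;
            }
        };
    }

    // Returns the cached handle for this path, opening it on first use.
    H acquire(String path) {
        H handle = open.get(path);
        if (handle == null) {
            handle = opener.apply(path);
            open.put(path, handle); // may evict and close the eldest entry
        }
        return handle;
    }

    int openCount() {
        return open.size();
    }

    // Closes every cached handle, e.g. at application shutdown.
    void closeAll() {
        for (H handle : open.values()) {
            closeQuietly(handle);
        }
        open.clear();
    }

    private static void closeQuietly(AutoCloseable handle) {
        try {
            handle.close();
        } catch (Exception e) {
            // A real application would log this instead of ignoring it.
        }
    }

    public static void main(String[] args) {
        final List<String> closed = new ArrayList<>();
        // Stand-in opener: the "handle" only records its path when closed.
        HandleCache<AutoCloseable> cache =
                new HandleCache<AutoCloseable>(2, path -> () -> closed.add(path));
        cache.acquire("a.nc");
        cache.acquire("b.nc");
        cache.acquire("a.nc"); // reuses the open handle and marks it recent
        cache.acquire("c.nc"); // over the limit: closes b.nc
        System.out.println(closed); // prints [b.nc]
        cache.closeAll();
    }
}
```

Bounding the number of open handles addresses the "too many open files"
concern quoted below, and giving each thread its own cache is one simple
way to avoid sharing a file object across threads.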

But my (strong) advice would be to really nail down where your Java
program is spending its time.  You can use a Java profiler for this,
or simply add some calls to System.nanoTime() in key places.  You
might find a few surprises.
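
As a rough illustration of the System.nanoTime() approach, the sketch
below accumulates elapsed time per labelled phase so that open, read and
close costs can be compared separately. PhaseTimer and its methods are
hypothetical names, and the work inside each phase is a placeholder; in
a real test the lambdas would wrap the actual netCDF calls.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper: accumulates wall-clock time per labelled phase.
final class PhaseTimer {
    private final Map<String, Long> totalNanos = new LinkedHashMap<>();

    // Runs the work, adds its elapsed time to the phase total, returns its result.
    <T> T time(String phase, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            totalNanos.merge(phase, System.nanoTime() - start, Long::sum);
        }
    }

    // Total nanoseconds recorded for a phase, or 0 if it was never timed.
    long nanos(String phase) {
        return totalNanos.getOrDefault(phase, 0L);
    }

    void report() {
        for (Map.Entry<String, Long> e : totalNanos.entrySet()) {
            System.out.printf("%-6s %.3f ms%n", e.getKey(), e.getValue() / 1e6);
        }
    }

    public static void main(String[] args) {
        int loops = args.length > 0 ? Integer.parseInt(args[0]) : 1000;
        PhaseTimer timer = new PhaseTimer();
        for (int i = 0; i < loops; i++) {
            // Placeholders only: substitute the real open, read and close calls.
            final int[] data = timer.time("open", () -> new int[1024]);
            timer.time("read", () -> data[data.length - 1]);
            timer.time("close", () -> Boolean.TRUE);
        }
        timer.report();
    }
}
```

Timing many iterations inside one JVM, rather than one run per process,
keeps startup cost out of the comparison; early iterations may still be
slower while the JIT compiler warms up.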

Cheers, Jon

On Thu, May 7, 2009 at 8:04 PM, Joe Sirott <Joe.Sirott@xxxxxxxx> wrote:
> Hi Bill,
>
> One question I have: does it really matter if your Java Web application is 2
> or 3x slower than your C application? You mentioned that your current
> application takes significantly less than one second to produce a plot; even
> if your Java Web app takes, say, one second to produce a plot that still
> would allow for ~100,000 plot requests per day. And you could easily
> increase this capability by implementing a caching scheme.
>
> If you do need to speed up your application, I found when I profiled the
> Java netCDF library a couple of years ago that it can be more expensive to
> open a netCDF file than to read small amounts of data (like 2D slices from a
> 4D variable) from the file. So one strategy (at least in a Web application
> environment) is to keep the file open so repeated reads of the file don't
> incur the overhead of reopening the file. There are some issues with this --
> the library isn't thread safe, so you don't want to share the file object
> across threads, and you might run into a problem with too many open files if
> you have a lot of files, but there are strategies to work around this.
>
> - Joe
>
> Bill Moninger wrote:
>
> Hello John, Jon, and Bob,
>
> Thanks for your useful questions and comments.
>
> I was testing the timing from the command line, and I agree that java
> startup time might have been a big issue.
>
> So I took the lead from the modified program that John sent back, which
> loops over opening the netcdf file, pulling a hyperslab (a hyperline
> really) out of the file, then closing it.
>
> I amended both the java and C programs (attached as a tar file) to take the
> number of times through the loop as the sole argument and got the following
> results when reading the netcdf file available at
> http://ruc.noaa.gov/ruc_native_40.nc (53 MB in size):
>
> %> sounding.x 10000
> C: elapsed time for 10000 reads is 16.630000 seconds
> (varied between 13.9 and 16.6 secs)
>
> %> java -server Tester2 10000
> java: elapsed time for 10000 reads is 44.466998 seconds
> (varied between 20.1 sec and 44.5 sec)
>
> So, it looks like something other than the startup cost is causing java to
> be slower than C by about 1.2 to 2.5x. But the java times appear to be a lot
> more variable than the C times.
>
> Perhaps I am using the libraries non-optimally; if so, I'll be very grateful
> for any suggestions.
>
> -Bill

-- 
Dr Jon Blower
Technical Director, Reading e-Science Centre
Environmental Systems Science Centre
University of Reading
Harry Pitt Building, 3 Earley Gate
Reading RG6 6AL. UK
Tel: +44 (0)118 378 5213
Fax: +44 (0)118 378 6413
j.d.blower@xxxxxxxxxxxxx
http://www.nerc-essc.ac.uk/People/Staff/Blower_J.htm


