Re: [netcdf-java] 4.0 updates: C and java speed

Hi Bill:

I made a few mods to your program (attached):

1) removed the print statements, which are notoriously slow.
2) did the whole open/read/close loop 100 times
3) added timing, and got:

that took 1248.659775 millisecs

which is about 13 msecs per call. When I get a chance I will try to compare to the C code.

None of this is all that definitive; it's very hard to get accurate timings on small programs. For one thing, java compilation happens at runtime (in the JIT), and it's somewhat nondeterministic, so running a program just once will very likely look very bad. If you are doing a CGI-type server, where the java application starts up for each request, that will be very slow.
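For example, one way I tighten up this kind of timing is to do a few untimed warm-up passes first, so the JIT has already compiled the open/read path when the clock starts. A minimal sketch (nothing definitive; the filename and iteration counts are just placeholders):

    static void timeReads(String filename) throws java.io.IOException {
        // untimed warm-up passes so the JIT compiles the open/read path first
        for (int i = 0; i < 10; i++) {
            ucar.nc2.NetcdfFile.open(filename).close();
        }
        // now time the loop we actually care about
        long start = System.nanoTime();
        for (int i = 0; i < 100; i++) {
            ucar.nc2.NetcdfFile ncfile = ucar.nc2.NetcdfFile.open(filename);
            // ... read the variable(s) here ...
            ncfile.close();
        }
        long stop = System.nanoTime();
        System.out.printf("that took %f millisecs%n", (stop - start) / 1000.0 / 1000.0);
    }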

I can pretty much promise you that java performance is within a factor of 2 of C code, and more likely within 20% of C code, in a long-running server environment. There are certain things it can do faster, like memory allocation and multithreading.

Anyway, I could look at your actual production code to see if there are some ways to help speed it up. It is possible that for various reasons, Java will be "several times slower" than C code, so you'll have to decide if the increase in productivity is worth it.

Bill Moninger wrote:
Hi John,

sorry for the delay in getting back to you.

I've attached two programs that read a netcdf file of RUC output in a hybrid coordinate system, on a 40km grid. One's in C, and one's in java.

A typical netcdf file that they read may be found at http://ruc.noaa.gov/ruc_native_40.nc (53 M in size).

Generally, I read 6 variables from this file to generate soundings (SkewT plots), and my C program reads the file and generates the sounding in far less than a second. The java version takes several times longer, and since we have many hits on our web page that generates on-the-fly soundings, it would be a big increase in load on our server to switch to java-netCDF.

I've stripped both the C and java programs down to the minimum. Each reads one vertical column of data from the 'vpt' variable and prints it out for each level. (The production program reads 6 variables from one vertical column and generates a sounding for that column.)
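Roughly, the production read follows the pattern sketched below -- one open, six column reads, one close. (Only 'vpt' is a real variable name from the stripped-down programs; the other names here are placeholders.)

    // rough sketch of the production pattern: one open, several column reads, one close
    static double[][] readColumns(String filename, int y, int x) throws Exception {
        String[] varNames = {"vpt", "var2", "var3", "var4", "var5", "var6"}; // placeholders except vpt
        double[][] columns = new double[varNames.length][];
        ucar.nc2.NetcdfFile ncfile = ucar.nc2.NetcdfFile.open(filename);
        try {
            int nz = ncfile.findDimension("z").getLength();
            int[] origin = new int[]{0, 0, y, x};  // one horizontal grid point
            int[] size = new int[]{1, nz, 1, 1};   // all vertical levels
            for (int i = 0; i < varNames.length; i++) {
                ucar.ma2.Array data = ncfile.findVariable(varNames[i]).read(origin, size);
                columns[i] = (double[]) data.reduce().get1DJavaArray(double.class);
            }
        } finally {
            ncfile.close();
        }
        return columns;
    }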

On my web server, the java program takes 0.58 seconds, and the C program takes 0.01 seconds.

It may be that, for my simple processing, netcdf-java is just overkill. But if it can be sped up to approach the speed of C, I'd like to use it, because I can use the same code (with different config files) to read netCDF, grib, and grib2, and things would be much easier to manage.

Any thoughts you have will be gratefully received.

-Bill

On 4/4/2009 12:43 PM, John Caron wrote:


Bill Moninger wrote:
Hi Robb,

thanks for the information. I'll take a look at regenerating the gbx files.

For what it's worth--the *biggest* percentage slowdown is not with grib or grib2 files, but with netCDF files, surprisingly enough. My C routine (using an earlier version of netCDF) reads the files almost instantly--the java-netcdf4 arrangement reads the file much more slowly.
That's interesting. When you say "read", do you mean reading all the data, or just opening the file? I assume these are netcdf-3 formatted files?

Can you send a sample program that has this slowdown? Are you comparing against a C program or earlier versions of java-netcdf?

BTW, java 1.6 should be 20-30% faster than java 1.5, particularly if you use the -server option.
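For example, something like: java -server -cp /path/to/netcdf-java.jar:. Tester (adjust the classpath for whichever netcdf-java jar you're running against).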


-Bill

On 3/31/2009 12:56 PM, Robb Kambic wrote:
On Fri, 27 Mar 2009, Bill Moninger wrote:

Hello netcdf-java folks,

Thanks to good help from the netcdf-java staff, I'm now able to read and generate soundings from RUC files in netCDF, grib, and grib2 format. It's really nice to be able to use the same code for all three formats.

Unfortunately, I find that, at least as I've implemented it, netcdf-java is 20% to 50% slower than my previous methods (using C).


Bill,

If you use the grib index files (the files with the .gbx suffix, usually in the same dir as the grib files), you should delete them all and then regenerate them. The new index files read in much quicker. Currently, I'm working on grib performance issues.
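For example, something like rm /path/to/your/grib/dir/*.gbx will clear them out (adjust the path to wherever your grib files live); the indexes are then regenerated in the new format the next time the grib files are opened.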

Robb...





Moreover, it appears that java 1.6 is slower than 1.5 (though I haven't recompiled the underlying UCAR code in 1.6--only my code).

If folks have any thoughts about how to speed things up, I will be much obliged to hear them.

-Bill
--
William R. Moninger         http://www-frd.fsl.noaa.gov/~moninger/
NOAA / Earth Systems Research Laboratory / Global Systems Division
325 Broadway, R/GSD1                           voice: 303-497-6435
Boulder, CO 80305                              fax:   303-497-3329


===============================================================================
Robb Kambic                       Unidata Program Center
Software Engineer III Univ. Corp for Atmospheric Research
rkambic@xxxxxxxxxxxxxxxx           WWW: http://www.unidata.ucar.edu/
===============================================================================




import java.io.IOException;

import ucar.ma2.*;
import ucar.nc2.*;

public class Tester {
    private static final String VERSION = "0.01";

    double[] Tvar;

    public static void main(String[] args) {
        Tester gs = new Tester(args);
        System.exit(0);
    }

    public Tester(String[] args) {
        String filename = "D:\\work\\moninger\\ruc_native_40.nc";

        long start = System.nanoTime();
        int tuv_levels = 0;

        for (int count = 0; count < 100; count++) {
            NetcdfFile ncfile = null;
            try {

                ncfile = NetcdfFile.open(filename);
                //System.out.println("ncfile is "+ncfile);
                Array data4D;
                Variable v = null;

                // get grid parameters for most variables
                Dimension d = ncfile.findDimension("z");
                if (d == null) {
                    System.out.println("Bad dimension for z");
                    System.exit(1);
                }
                tuv_levels = d.getLength();

                // read one vertical column: all z levels at a single horizontal grid point
                int[] origin = new int[]{0, 0, 40, 50};          // start index in each of the 4 dimensions
                int[] tuv_size = new int[]{1, tuv_levels, 1, 1}; // full z dimension, 1 point in the others
                v = ncfile.findVariable("vpt");
                data4D = v.read(origin, tuv_size);
                Tvar = (double[]) data4D.reduce().get1DJavaArray(double.class);
                //System.out.println("successfully read " + filename);

            } catch (Exception e) {
                System.out.println("Exception: " + filename + " " + e);
                e.printStackTrace();
                System.exit(1);

            } finally {
                if (null != ncfile) try {
                    ncfile.close();
                    //System.out.println("closed file");
                } catch (IOException ioe) {
                    System.out.println("trying to close " + filename + " " + 
ioe);
                }
            }

          /*  for (int i = 0; i < tuv_levels; i++) {
                System.out.println("i: " + i + " t " + Tvar[i]);
            }  */


        }

        long stop = System.nanoTime();
        System.out.printf("that took %f millisecs %n", (stop - start) / 1000.0 
/ 1000.0);


    }
}

