
RE: 19990929: Write/Read of 2D pages



Hi Lyn, I just got back from vacation.

> Hello:
>
> Below is the text of a message I sent about a week ago to
> John Caron, and I
> have yet to receive a reply.  I'm assuming he's busy.  If
> anyone could help
> me, I would really appreciate the advice.
>
> To put this into context, I asked John several months ago
> about the very
> slow performance of our I/O with netCDF.  He replied that it
> shouldn't be
> slow unless we have programmed something dumb.  I can send my original
> query and his reply if you are interested.  He offered to
> reply to future
> queries regarding this problem, and below is my latest posting.
>
>
> I have spent some time delving into the netCDF I/O that we
> are doing, and I
> believe I have found some serious time wasters in our code.
> I will show
> you below what I mean.  Anyway, I am somewhat stuck on retrieving
> information from a netCDF file, and I am hoping you might
> point me in the
> right direction.
>
> The poor performance in our program, as I mentioned earlier,
> is due to the
> use of nested loops to write/read multidimensional data.  I
> have been going
> through the code and replacing this logic with equivalent
> MultiArray calls
> to write the data in one pass.  For example, the original
> code to write a
> 3-dimensional array looked like this:
>
> private void saveTripleArrayDoubleData(NetcdfFile cdfFile,
> String dataName,
> double[][][] data)     {
>     try {
>         Variable storage = cdfFile.get(dataName);
>         if (storage == null) throw new IOException();
>         int[] numberIndex = new int[3];
>         int totalX = data.length;
>         int totalY = data[0].length;
>         int totalZ = data[0][0].length;
>
>         for (int x = 0; x < totalX; x++) {
>             numberIndex[0] = x;
>             for (int y = 0; y < totalY; y++) {
>                 numberIndex[1] = y;
>                 for (int z = 0; z < totalZ; z++) {
>                     numberIndex[2] = z;
>                     storage.setDouble(numberIndex, data[x][y][z]);
>                 }
>             }
>         }
>     } catch(java.io.IOException exception) {}
>     catch(java.lang.NullPointerException exception) {}
> }
>
> As you can see in this method, each element of the 3D array is set
> individually, which as you can imagine, takes forever, because this
> operation can't be buffered.  The original code used similar
> logic to read
> data back in, using nested loops and the getDouble() element accessor.
>
> I have replaced this code with the following method:
>
> private void saveTripleArrayDoubleData(NetcdfFile cdfFile,
> String dataName,
> double[][][] data) {
>     try {
>         Variable storage = cdfFile.get(dataName);
>         if (storage == null) throw new IOException();
>         // use MultiArray to write the data all at once, rather than one element at a time
>         MultiArray dataMa = new ArrayMultiArray(data);
>         int[] origin = {0, 0, 0};
>         storage.copyin(origin, dataMa);
>
>     } catch(java.io.IOException exception) {}
>     catch(java.lang.NullPointerException exception) {}
> }
>
> I have tested this code and it works great.  The problem I'm
> having now is
> with accessing the data I've written.

This looks good to me.

>
> I'm not a seasoned veteran using Java, so please bear with
> me.  To read
> back the MultiArray data, the method copyout() is used.  There are two
> parameters, the origin and shape, both int arrays.  I think I
> understand
> what both of these are doing, the first passing where to
> start reading, and
> the second the actual dimensions of the MultiArray.  So far,
> no problem.
>
> My problem involves converting the returned MultiArray into
> something I can
> use in my program.  When we write the 3D array, what we are doing is
> writing "pages" of 2D arrays, and when we want to read in the
> data, we only
> want to access a single page.  The first dimension controls
> which page.  In
> the example code you have on the web site, the temperature
> array (double
> T(time, lat, lon)), is similar in structure to what we're
> doing, in that
> each time is a page, and we want to access a particular time
> and read back
> in the 2D array corresponding to the (lat, lon) temperatures.
> Right now, I'm
> setting the first value in the origin array to the particular
> page I want
> to retrieve (which matches the organization in the schema).  I haven't
> tested this yet to make sure it is doing what I want.

So you set      origin = {pageno, 0, 0}
                shape  = {1, nx, ny}

and you get a 2D MultiArray of dimension nx by ny.
At this point, the data is memory resident, and you could access the
data using
        data.getDouble(index);
which would be my recommendation, since it avoids further data copying.
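
To make the origin/shape idea concrete, here is a rough sketch in plain Java. It does not use the netCDF classes at all: an ordinary double[][][] stands in for the variable, and PageSlice and readPage are made-up names for illustration only. It just shows which elements an origin of {pageno, 0, 0} and a shape of {1, nx, ny} select.

```java
class PageSlice {
    // Hypothetical helper, not part of the netCDF library: copies the block
    // that starts at origin = {pageNo, 0, 0} and has shape = {1, nx, ny}
    // out of a plain 3D Java array.
    static double[][] readPage(double[][][] data, int pageNo) {
        int nx = data[pageNo].length;
        int ny = data[pageNo][0].length;

        int[] origin = {pageNo, 0, 0}; // where the read starts
        int[] shape = {1, nx, ny};     // how many elements to take along each dimension

        double[][] page = new double[shape[1]][shape[2]];
        for (int i = 0; i < shape[1]; i++) {
            // Copy one row of the selected page at a time.
            System.arraycopy(data[origin[0]][origin[1] + i], origin[2],
                             page[i], 0, shape[2]);
        }
        return page;
    }
}
```

With a real variable the copying is done for you by copyout(); the sketch only illustrates the selection.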

However, if you need to pass a double array to some other routine, you
can use toArray() to get a Java array, but it is a 1D array, not a 2D
one. I think it should in fact construct the 2D array, but currently it
doesn't.

So you would unfortunately have to transfer this into a 2D array
yourself, in which case you might as well loop over the
data.getDouble(index) call rather than use toArray().
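
If you do end up with a flat array in hand, the transfer into a 2D array could look something like the sketch below. It assumes the 1D array is laid out with the last index varying fastest (row-major), which you should check against your own data; Reshape and to2D are hypothetical names, not part of the netCDF library.

```java
class Reshape {
    // Hypothetical helper: rebuilds an nx-by-ny array from a flat array,
    // assuming row-major order (element [i][j] sits at flat[i * ny + j]).
    static double[][] to2D(double[] flat, int nx, int ny) {
        if (flat.length != nx * ny) {
            throw new IllegalArgumentException(
                "expected " + (nx * ny) + " elements, got " + flat.length);
        }
        double[][] out = new double[nx][ny];
        for (int i = 0; i < nx; i++) {
            // Each row is a contiguous run of ny values in the flat array.
            System.arraycopy(flat, i * ny, out[i], 0, ny);
        }
        return out;
    }
}
```

For example, Reshape.to2D(flat, nx, ny) with the nx and ny from your shape array would give you the page as a double[][].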