Re: Data manipulation

Hi-

Tennessee James Leeuwenburg wrote:
Hi guys,

I have a data manipulation question. I have a series of NetCDF files with the same internal structure. When loaded using InputNetcdf, they have the form

(Time) --> ( (X, Y) --> ([variable]*))
--------------------------
For example

(Time) --> ((x_ocean, y_ocean) --> (eta_t, hbl, sst, sss))
--------------------------

And I would like to break that up into

[(Time) --> (( X, Y) --> (var))]*
---------------------------
For example

(Time) --> ((x_ocean, y_ocean) --> (eta_t)),
(Time) --> ((x_ocean, y_ocean) --> (hbl)),
(Time) --> ((x_ocean, y_ocean) --> (sst)),
(Time) --> ((x_ocean, y_ocean) --> (sss))
---------------------------

i.e. factor out the variables so that each has its own data reference.

There is no internal method to get the structure you want, but
you could have:

(Time --> (((x_ocean, y_ocean) --> (eta_t)),
           ((x_ocean, y_ocean) --> (hbl)),
           ((x_ocean, y_ocean) --> (sst)),
           ((x_ocean, y_ocean) --> (sss))))

if you set the import strategy for the NetcdfAdapter to be
UNMERGED_FILE_FLAT_FIELDS:

     NetcdfAdapter.setDefaultStrategy(UNMERGED_FILE_FLAT_FIELDS);

before you use your InputNetcdf.

(hope I got my parens in the right places. ;-))

You could then call FieldImpl.extract() to get each variable
as a separate field.


The problem I'm running into is that FieldImpl has a getDomainSet() but no getRangeSet(); it seems I have to get each indexed range sample individually with getSample(n). This strikes me as rather inefficient, as I will have to iterate over every sample in the time array, then every sample in the x,y array, in order to factor out the variables. That is (a) computationally expensive and (b) likely to need extra memory, which is already a problem for our data sets.

Maybe someone could enlighten me as to a better way?

There's no guarantee that the range of a Field is going to be
a set of the form you have in this case.  In the IDV, we have
a GridUtil class full of static methods for manipulating
grids of the sort you have.  You could write your own set of
grid utilities.   For example, we handle manipulations of
a grid of:

(Time -> (x,y,z) -> variable)

by looping through each of the timesteps and then reconstructing
the new grid.
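
As a rough sketch of that loop-and-rebuild pattern, here is a plain-Java
version that works on raw arrays instead of VisAD objects.  It assumes the
samples have already been pulled out as values[time][variable][gridPoint]
(the [variable][gridPoint] part mirrors the layout FlatField.getFloats()
returns).  The class and method names are made up for illustration; they
are not part of VisAD or the IDV.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative helper only; not a VisAD or IDV class. */
final class VariableFactoring {

    /**
     * Splits samples laid out as values[time][variable][gridPoint] into one
     * time series of grids per variable: result.get(name)[time][gridPoint].
     */
    static Map<String, float[][]> factorVariables(String[] names, float[][][] values) {
        Map<String, float[][]> perVariable = new LinkedHashMap<>();
        for (int v = 0; v < names.length; v++) {
            float[][] series = new float[values.length][];
            for (int t = 0; t < values.length; t++) {
                if (values[t].length != names.length) {
                    throw new IllegalArgumentException(
                        "timestep " + t + " does not have one row per variable");
                }
                // Share the existing per-variable row rather than copying it,
                // so the split costs one array reference per timestep.
                series[t] = values[t][v];
            }
            perVariable.put(names[v], series);
        }
        return perVariable;
    }
}
```

Because the per-variable rows are shared rather than copied, the extra
memory is one small array of references per variable.  Whether the same
sharing is possible with real VisAD fields depends on how the range values
are retrieved, so treat this as the shape of the loop and not a drop-in
replacement for GridUtil.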

The ultimate goal is that I want to toggle the visibility of each variable (i.e. temperature, salinity, U, V) independently, which seems to require having a separate data reference for each and using dataReference.toggle(false).

Yup, that's the way to do it.

I also have some three-dimensional data which includes a z_ocean index, and which I may have to factor out in order to allow people to toggle the levels independently.

You could get the grid and do a domainFactor to factor out the
levels and then extract each level into a separate data reference.
In the IDV, we do this by resampling the grid at a constant level
and putting that in the display.  If we want to display more than
one level at a time, then we have separate DataReferences and
put the appropriate slice in each reference.
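
To illustrate the "one slice per reference" idea without any VisAD calls,
here is a minimal plain-Java sketch that pulls one level out of a flattened
(x,y,z) grid.  It assumes the values are stored with x varying fastest,
then y, then z, and that the level is chosen by index and not by
resampling at a physical height.  The names are hypothetical and this is
not how GridUtil is implemented.

```java
import java.util.Arrays;

/** Illustrative helper only; not a VisAD or IDV class. */
final class LevelSlicing {

    /**
     * Returns a copy of the nx*ny values at the given level index from a
     * flattened grid stored with x fastest, then y, then z.
     */
    static float[] sliceAtLevel(float[] values, int nx, int ny, int nz, int level) {
        if (values.length != nx * ny * nz) {
            throw new IllegalArgumentException("values length does not match nx*ny*nz");
        }
        if (level < 0 || level >= nz) {
            throw new IllegalArgumentException("level index out of range: " + level);
        }
        int pointsPerLevel = nx * ny;
        // With z as the slowest-varying index, each level is one contiguous block.
        return Arrays.copyOfRange(values, level * pointsPerLevel, (level + 1) * pointsPerLevel);
    }
}
```

Each returned array would then back its own 2-D field and data reference,
so the levels can be shown or hidden separately.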

Download the IDV source and take a look at the ucar.unidata.data.grid.GridUtil class.

Don Murray
*************************************************************
Don Murray                               UCAR Unidata Program
dmurray@xxxxxxxxxxxxxxxx                        P.O. Box 3000
(303) 497-8628                              Boulder, CO 80307
http://www.unidata.ucar.edu/staff/donm
"There's someone in my head, but it's not me"    Roger Waters
*************************************************************