Re: Data manipulation

  • To: Tennessee James Leeuwenburg <tjl@xxxxxxxxxx>
  • Subject: Re: Data manipulation
  • From: Bill Hibbard <billh@xxxxxxxxxxxxx>
  • Date: Tue, 23 Nov 2004 06:49:39 -0600 (CST)
Hi Tennessee,

> I have a data manipulation question. I have a series of NetCDF files
> with the same internal structure. When loaded using InputNetcdf, they
> have the form
> (Time) --> ( (X, Y) --> ([variable]*))
> --------------------------
> For example
> (Time) --> ((x_ocean, y_ocean) --> (eta_t, hbl, sst, sss))
> --------------------------
> And I would like to break that up into
> [(Time) --> (( X, Y) --> (var))]*
> ---------------------------
> For example
> (Time) --> ((x_ocean, y_ocean) --> (eta_t),
> (Time) --> ((x_ocean, y_ocean) --> (hbl),
> (Time) --> ((x_ocean, y_ocean) --> (sst),
> (Time) --> ((x_ocean, y_ocean) --> (sss)
> ---------------------------
> i.e. factor out the variables so that each has a unique data reference.
> The problem I'm running into is that FieldImpl has a getDomainSet() but
> no getRangeSet() - it seems like I have to get each indexed range sample
> individually getSample(n). This strikes me as being rather inefficient,
> as I will have to parse over every sample in the time array, then every
> sample in the x,y array in order to factor out the variables. The
> problem is that (a) this is computationally inefficient and (b) that I
> think it will also need to use some extra memory, which is already a
> problem for our data sets.
> Maybe someone could enlighten me as to a better way?

The Field method:

  Field extract(int component)

should help you do this. In:

  (Time -> ((x_ocean, y_ocean) -> (eta_t, hbl, sst, sss)))

you'll need to use getSample() to get the range Fields of Type
((x_ocean, y_ocean) -> (eta_t, hbl, sst, sss)), then four calls to
extract with arguments 0, 1, 2 and 3 will give you four Fields like:
((x_ocean, y_ocean) -> eta_t). You then need to reassemble the time
sequences into Fields with Types like:

  (Time -> ((x_ocean, y_ocean) -> eta_t))
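
As a rough illustration of that bookkeeping, here is a plain-Java
sketch that models the data as arrays instead of VisAD objects. The
class SplitVariables and its methods are hypothetical stand-ins, not
VisAD API: extract() below only mimics what Field.extract(int) does
for one time step, and the loop over t plays the role of calling
getSample(t) on the input and setting sample t on each per-variable
output.

```java
import java.util.Arrays;

// Hypothetical array model of the split; not VisAD classes.
final class SplitVariables {

    // Mimics Field.extract(component) for one time step.
    // grid[p][v] holds variable v at grid point p.
    static double[] extract(double[][] grid, int component) {
        double[] out = new double[grid.length];
        for (int p = 0; p < grid.length; p++) {
            out[p] = grid[p][component];
        }
        return out;
    }

    // series[t][p][v]  ->  result[v][t][p]
    static double[][][] split(double[][][] series, int nVars) {
        double[][][] result = new double[nVars][series.length][];
        for (int t = 0; t < series.length; t++) {      // one getSample(t) per time
            for (int v = 0; v < nVars; v++) {          // one extract(v) per variable
                result[v][t] = extract(series[t], v);  // becomes sample t of output v
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // 2 times, 3 grid points, 4 variables (eta_t, hbl, sst, sss)
        double[][][] series = new double[2][3][4];
        for (int t = 0; t < 2; t++)
            for (int p = 0; p < 3; p++)
                for (int v = 0; v < 4; v++)
                    series[t][p][v] = 100 * t + 10 * p + v;

        double[][][] perVar = split(series, 4);
        // Variable 2 (sst) at time 1:
        System.out.println(Arrays.toString(perVar[2][1])); // prints [102.0, 112.0, 122.0]
    }
}
```

Note that the only explicit loops are over time steps and variables.
In this model the per-grid-point work is hidden inside extract(), so
the caller never iterates over the (x_ocean, y_ocean) samples itself.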

> The ultimate goal is that I want to toggle the visibility of each
> variable (i.e. temperature, salinity, U, V) independently, which seems
> to require having separate data references for each and using
> dataReference.toggle(false).
> I also have some three-dimensional data which includes a z_ocean index,
> and which I may have to factor out in order to allow people to toggle
> the levels independently.

For this, you need to call the Field method:

    Field domainFactor( RealType factor )

If you apply this to a Field with Type:

  ((x_ocean, y_ocean, z_ocean) -> eta_t)

with the factor argument = z_ocean, it should return a Field
with Type:

  (z_ocean -> ((x_ocean, y_ocean) -> eta_t))
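
To picture what that factoring does to the samples, here is a small
plain-Java sketch, again using arrays in place of VisAD objects. The
class FactorLevels and the method factorZ are hypothetical names, and
the flat layout (x varying fastest, then y, then z) is an assumption
made for this illustration only.

```java
// Hypothetical array model of factoring z out of an (x, y, z) grid.
final class FactorLevels {

    // flat[x + nx * (y + ny * z)]  ->  levels[z][x + nx * y]
    static double[][] factorZ(double[] flat, int nx, int ny, int nz) {
        int perLevel = nx * ny;
        if (flat.length != perLevel * nz) {
            throw new IllegalArgumentException("flat.length must equal nx * ny * nz");
        }
        double[][] levels = new double[nz][perLevel];
        for (int z = 0; z < nz; z++) {
            // With x fastest and z slowest, each level is one contiguous block.
            System.arraycopy(flat, z * perLevel, levels[z], 0, perLevel);
        }
        return levels;
    }

    public static void main(String[] args) {
        int nx = 2, ny = 3, nz = 2;
        double[] flat = new double[nx * ny * nz];
        for (int i = 0; i < flat.length; i++) {
            flat[i] = i;
        }
        double[][] levels = factorZ(flat, nx, ny, nz);
        System.out.println(levels.length + " levels of " + levels[0].length + " points");
    }
}
```

Each levels[z] then corresponds to one ((x_ocean, y_ocean) -> eta_t)
sample of the factored Field, which is the unit you would want to
toggle independently.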

By the way, the notation:

  (Time) --> ((x_ocean, y_ocean) --> (eta_t)

is not right. It should be:

  (Time -> ((x_ocean, y_ocean) -> eta_t))

This is what the system produces and parses.