Re: [netcdf-java] Aggregation on an inner time dimension

Martin Price wrote:
Hi John,

On Thu, Aug 13, 2009 at 9:01 PM, John Caron<caron@xxxxxxxxxxxxxxxx> wrote:
Hi Martin:

You currently cannot do an aggregation on an inner dimension.

I thought that might be it!  I guess it's a bit unconventional to have
time as the inner dimension, but it does give a big performance
improvement for long timeseries (of course it correspondingly slows
down map-type extractions, but we do those less often).

Can you give me a use case and perhaps a sample file? Are you extracting a
time series at a single point?

Yes, we extract 10-year-plus time series of hourly hindcast data from
six-hourly model run files, either at one point or, more usually, at a
handful of them.  These are then processed for climatology statistics,
used to look at specific events, or supplied to a user.  Converting them
to e.g. monthly files does improve extraction time somewhat, but not
nearly as much as permuting them so time is the inner dimension.  To
lowest order, permuting reduces extraction time to
1/(length_of_time_dimension) of the original.
If I'm thinking about this correctly, this will be true when you only want 1 point, since with time as the outer dimension each time point will cost you a disk read.

If you want a "handful", for example 6 points along lon (the inner dimension in the current (time, lat, lon) layout), then I'm thinking you would get the same performance.
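
To make the read-count argument concrete, here is a minimal sketch that counts how many contiguous runs of a row-major array a selection touches, taking each run as one disk read. It is an illustration only: the function name `contiguous_runs` and the small dimension sizes are made up for the example, and it ignores chunking, caching, and anything netcdf-java does internally.

```python
from itertools import product

def contiguous_runs(shape, selection):
    """Count contiguous runs of flat offsets touched by a selection.

    shape:     dimension lengths of a row-major (C-order) array
    selection: one range per dimension, giving the indices wanted
    Each run is treated here as one disk read (a simplifying assumption).
    """
    # Row-major strides, in elements: the last dimension varies fastest.
    strides = []
    step = 1
    for n in reversed(shape):
        strides.append(step)
        step *= n
    strides.reverse()

    offsets = sorted(
        sum(i * s for i, s in zip(index, strides))
        for index in product(*selection)
    )
    if not offsets:
        return 0
    runs = 1
    for a, b in zip(offsets, offsets[1:]):
        if b != a + 1:
            runs += 1
    return runs

# Hypothetical sizes, chosen small so the example runs instantly.
NT, NLAT, NLON = 48, 4, 8

# Full time series at a single (lat, lon) point.
point_time_outer = contiguous_runs((NT, NLAT, NLON),
                                   [range(NT), range(2, 3), range(3, 4)])
point_time_inner = contiguous_runs((NLAT, NLON, NT),
                                   [range(2, 3), range(3, 4), range(NT)])

# Full time series at 6 adjacent points along lon.
six_time_outer = contiguous_runs((NT, NLAT, NLON),
                                 [range(NT), range(2, 3), range(1, 7)])
six_time_inner = contiguous_runs((NLAT, NLON, NT),
                                 [range(2, 3), range(1, 7), range(NT)])

print(point_time_outer, point_time_inner, six_time_outer, six_time_inner)
```

With these sizes the single-point extraction needs one run per time step when time is outermost and a single run when time is innermost, which is the 1/(length_of_time_dimension) ratio. Six adjacent lon points in the (time, lat, lon) layout still cost one run per time step, the same as one point in that layout.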

Outer aggregations work nicely since you can read all data for one file, then the next, etc. An inner aggregation would require reading all data into memory, but in your use case that's probably not a problem.
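
The difference can be sketched with plain lists standing in for each file's flat, row-major data. This is only an illustration of the layout issue, not how netcdf-java implements aggregation; `aggregate_outer` and `aggregate_inner` are hypothetical helpers written for this example.

```python
def aggregate_outer(files):
    """Join files along the outermost dimension (time first).

    files: list of flat row-major lists, each of shape (nt_i, n_points).
    The result is a plain concatenation, so files can be streamed
    one after another.
    """
    out = []
    for data in files:
        out.extend(data)
    return out

def aggregate_inner(files, n_points):
    """Join files along the innermost dimension (time last).

    files: list of (flat_data, nt_i) pairs, each of shape (n_points, nt_i).
    Every output row needs a slice from every file, so in this sketch
    all files are held in memory at once.
    """
    out = []
    for p in range(n_points):
        for data, nt in files:
            out.extend(data[p * nt:(p + 1) * nt])
    return out

# Values encode (point, global time step) as point * 100 + t.
N_POINTS = 3
times_a, times_b = [0, 1], [2, 3, 4]

# Time-outer files: shape (nt, n_points).
outer_a = [p * 100 + t for t in times_a for p in range(N_POINTS)]
outer_b = [p * 100 + t for t in times_b for p in range(N_POINTS)]

# Time-inner files: shape (n_points, nt).
inner_a = [p * 100 + t for p in range(N_POINTS) for t in times_a]
inner_b = [p * 100 + t for p in range(N_POINTS) for t in times_b]

outer_result = aggregate_outer([outer_a, outer_b])
inner_result = aggregate_inner(
    [(inner_a, len(times_a)), (inner_b, len(times_b))], N_POINTS)

print(outer_result)
print(inner_result)
```

In the outer case each file contributes one contiguous block of the result. In the inner case each point's row is stitched together from a slice of every file, which is why the data cannot simply be read file by file.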

We are working on an experimental "ncstream" protocol that allows a writer to write data in any order, and the reader rearranges as needed, but it's not ready for use yet.

BTW, who is consuming the output? An internal process that you control, or ???


