
Re: 950612: multiple unlimited dimensions



> >From: address@hidden (Dan Hansen)
> >Organization: NCAR/MMM
> >Keywords: 199506122107.AA25115 netCDF

Hi Dan,

Sorry it's taken me so long to answer your question.  Things have piled up,
and I've let a few things slip from lack of time.

>  We've got a problem we need to ask you about.  We've gotten into
>  the habit of using netcdf, so if we can solve it we would be
>  happy indeed.  
>  
>  The general question is - how do people deal with variables that
>  have more than one "variable" dimension?
>  
>  More specifically, we have image data we want to save.  We have
>  our unlimited dimension TIME, but in addition we have two dims
>  for each image, named ROWS and SLICES.  ROWS is fixed for each
>  probe, but SLICES varies on an image by image basis.  
>  
>  How have other people dealt with this issue?  I tried to hunt
>  down some info in the mail archive but couldn't find anything
>  pertaining to this issue (I'm sure it's buried in there someplace,
>  but I couldn't see it!).

I couldn't find it in the archives either, although I know I've composed at
least one message on this subject in the past.

I know of several ways people deal with this problem, but this may not be a
complete list, because lots of people are using netCDF in unanticipated ways
that I don't learn about until a question comes in to support.

One way to deal with the need for multiple unlimited dimensions is to use
multiple netCDF files instead of a single netCDF file for the data.  Each
file can have its own independent unlimited dimension, so this works when
no single variable requires more than one unlimited dimension.  It sounds
like your image variable uses both TIME and SLICES, though, so this
wouldn't work in your case.

Another less-than-satisfactory solution is to determine an upper limit for
one of the varying dimensions and use that fixed upper limit, wasting space
for all variables that use less than the maximum size of the dimension.
This is OK only if you can live with the wasted space.
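
To get a feel for the cost of the fixed-upper-limit approach, here is a
minimal Python sketch.  It does not call the netCDF library; it only models
padding a ROWS x SLICES image out to a fixed maximum and estimating the
wasted space.  The names pad_image and wasted_fraction and the fill value
are assumptions made up for this illustration.

```python
FILL = -9999.0  # assumed fill value, chosen only for this illustration


def pad_image(image, max_slices, fill=FILL):
    """Pad each row of a ROWS x SLICES image out to max_slices columns."""
    padded = []
    for row in image:
        if len(row) > max_slices:
            raise ValueError("image has more slices than the fixed upper limit")
        padded.append(list(row) + [fill] * (max_slices - len(row)))
    return padded


def wasted_fraction(slice_counts, max_slices):
    """Fraction of the fixed-size storage that holds only fill values."""
    used = sum(slice_counts)
    total = max_slices * len(slice_counts)
    return 1.0 - used / total
```

For example, if two images have 2 and 4 slices and the limit is 4, a
quarter of the space allotted to the slices dimension holds only fill
values.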

A third possibility is to use a fixed size for a dimension you want to have
varying, and whenever you would exceed the fixed size, double it and recopy
the data to a new netCDF file with the bigger dimension.
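
The doubling strategy can be sketched in Python as follows.  This is only a
model of the bookkeeping, with an in-memory list standing in for the netCDF
file; the class FixedSizeStore and its fields are hypothetical names, and a
real version would create a new file with the larger dimension and copy the
variables into it.

```python
class FixedSizeStore:
    """Stand-in for a netCDF file with one fixed-size dimension (hypothetical)."""

    def __init__(self, size, fill=-9999.0):
        if size < 1:
            raise ValueError("initial size must be at least 1")
        self.size = size
        self.fill = fill
        self.data = [fill] * size
        self.copies = 0  # how many times the data has been recopied

    def write(self, index, value):
        if index >= self.size:
            # Double the dimension until the index fits.
            new_size = self.size
            while new_size <= index:
                new_size *= 2
            # "Recopy to a new file": keep the old data, extend with fill values.
            self.data = self.data + [self.fill] * (new_size - self.size)
            self.size = new_size
            self.copies += 1
        self.data[index] = value
```

Because the size doubles each time, the number of recopies grows only
logarithmically with the final size: writing 10 values into a store that
starts at size 4 triggers two recopies (to 8, then to 16).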

Another solution is to use one unlimited dimension to represent ordered
pairs of two unlimited dimensions, and support the mapping from the two
varying dimensions to the single dimension at a layer above the netCDF
library.  We are using something similar to this in some of our netCDF
files, where we would like to have two times associated with model outputs,
a reference time (for the initial time of the model) and a varying number of
valid times (for forecast outputs after an initial model run).  We
support a varying number of reference times and valid times with a record
dimension representing both, as I hope the following CDL excerpt makes
clear:

  dimensions:

          record = UNLIMITED ;  // (reference time, forecast time)
          ...
  variables:

          double        reftime(record);  // reference time of the model
                  reftime:long_name = "reference time";
                  reftime:units = "hours since 1992-1-1";

          double        valtime(record);  // forecast time ("valid" time)
                  valtime:long_name = "valid time";
                  valtime:units = "hours since 1992-1-1";

          :record = "reftime, valtime" ;  // "dimension attribute" -- means
                                          // (reftime, valtime) uniquely
                                          // determine record

          ...
          float T(record, level, y, x) ;
                  T:long_name = "Temperature" ;
                  T:units = "degK" ;
                  T:_FillValue = -9999.f ;
                  T:navigation = "nav" ;
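
The layer above the netCDF library that maps (reftime, valtime) pairs to a
single record index might look roughly like the following Python sketch.
It is not our actual software and makes no netCDF calls; the class
RecordIndex and its methods are names invented for this illustration, with
two lists mirroring the reftime(record) and valtime(record) variables.

```python
class RecordIndex:
    """Map (reftime, valtime) pairs to positions along one record dimension."""

    def __init__(self):
        self._index = {}    # (reftime, valtime) -> record number
        self.reftime = []   # mirrors the reftime(record) variable
        self.valtime = []   # mirrors the valtime(record) variable

    def record_for(self, reftime, valtime):
        """Return the record number for a pair, appending a new record if needed."""
        key = (reftime, valtime)
        if key not in self._index:
            self._index[key] = len(self.reftime)
            self.reftime.append(reftime)
            self.valtime.append(valtime)
        return self._index[key]

    def valtimes_for(self, reftime):
        """List the valid times stored so far for one reference time."""
        return [v for r, v in zip(self.reftime, self.valtime) if r == reftime]
```

The record number returned by record_for would be used as the index along
the record dimension when writing a variable such as T.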

There may be other ways to handle this problem that I can't think of right
now.

Finally, we are currently investigating a solution for a future version
of netCDF that would explicitly support ragged arrays as part of support for
"nested arrays".  You can think of doing this now in a layer above the
netCDF library by using something like pointers in your arrays that really
contain variable IDs for a large number of artificial variables, one for
each row of a ragged array.  These variables don't even need useful names,
since the software would only use the variable id to access the data.  Each
variable could be a different size, using a variable-specific fixed
dimension to represent its shape.  You need to preallocate space for all
these variables and dimensions by defining them first, but once they are
defined and space reserved for them, it's possible to use their ids as
pointers into preallocated space of varying sizes.
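
As a rough model of this idea, the Python sketch below keeps one
preallocated list per row (standing in for the artificial variables) and an
array of variable ids used as pointers.  It makes no netCDF calls, and the
class RaggedStore, its methods, and the fill value are all hypothetical
names chosen for the illustration.

```python
class RaggedStore:
    """Model a ragged array as many fixed-size variables addressed by id."""

    def __init__(self, row_lengths, fill=-9999.0):
        # Define phase: one artificial variable per row, each with its own
        # fixed-size dimension, all preallocated with fill values.
        self._vars = [[fill] * n for n in row_lengths]
        # The "pointer" array: entry i holds the variable id for row i.
        self.row_varid = list(range(len(row_lengths)))

    def put_row(self, row, values):
        """Write a whole row into the variable its id points to."""
        var = self._vars[self.row_varid[row]]
        if len(values) != len(var):
            raise ValueError("row length does not match its preallocated size")
        var[:] = values

    def get_row(self, row):
        """Read back a row through its variable id."""
        return list(self._vars[self.row_varid[row]])
```

Note that the row lengths have to be known up front, which reflects the
need to define all the variables and dimensions before writing any data.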

--Russ