[netCDF #TLB-677315]: FW: Net CDF-4 Issues

Subject: [netCDF #TLB-677315]: FW: Net CDF-4 Issues
Date: Fri, 15 May 2009 11:14:12 -0600
Hi Coy,

> I finally found time to try out your suggestion. Unfortunately, I don't
> think variable-length arrays will work well for my application as you
> need to read the entire array at one time. I could be wrong, but let me
> give you a simple example to see if you know of a better way to go. If
> not, can you explain the best way to use variable-length arrays in this
> case.

You may be right; it depends on how much data you have for each
variable-length array, in particular whether you can accumulate all
the data for a particular mode in memory and write it out only when
you are about to switch modes.  You could hide this buffering behind
an intermediate "write_next" function that you call with each batch of
data and that accumulates the data in allocated memory.  When you
switch modes, that function would write out the resulting
variable-length array(s) with netCDF calls and then free the allocated
memory.
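
A minimal sketch of the "write_next" idea, in Python for brevity. It
shows only the buffering logic, not any netCDF calls: the class name
ModeWriter and the flush callback are hypothetical, and the callback
stands in for whatever code would write one variable-length array for
a mode.

```python
from typing import Callable, List, Optional, Sequence


class ModeWriter:
    """Buffers samples for the current mode; flushes one varlen per mode run."""

    def __init__(self, flush: Callable[[str, List[List[float]]], None]) -> None:
        # `flush` is a hypothetical callback that would make the netCDF
        # calls to write one variable-length array for the given mode.
        self._flush = flush
        self._mode: Optional[str] = None
        self._buffer: List[List[float]] = []

    def write_next(self, mode: str, sample: Sequence[float]) -> None:
        # A mode switch triggers a write of everything accumulated so far.
        if self._mode is not None and mode != self._mode:
            self.close()
        self._mode = mode
        self._buffer.append(list(sample))

    def close(self) -> None:
        # Write out any pending data, then release the buffer.
        if self._buffer:
            self._flush(self._mode, self._buffer)
        self._buffer = []
        self._mode = None


# Stand-in for the file: one (mode, rows) entry per varlen written.
written = []
writer = ModeWriter(lambda mode, rows: written.append((mode, rows)))
writer.write_next("Mode_3", [0.0] * 30)
writer.write_next("Mode_3", [1.0] * 30)
writer.write_next("Mode_6", [2.0] * 60)  # mode switch: Mode_3 data is flushed
writer.close()                           # flush the final Mode_6 run
```

The caller never sees the buffer; it just calls write_next for each
batch and close at the end of the file.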

Reading could work similarly with a "read_next" function that reads a
varlen into memory on the first call and hands back one time's worth
of data for each call, until all the data in the memory buffer has
been returned.
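
The reading side could be sketched the same way, again in Python and
without real netCDF calls. The class name ModeReader and the
load_varlen callback are hypothetical; the callback is assumed to
return the i-th stored variable-length array as a list of per-time
rows, or None when there are no more.

```python
from typing import Callable, List, Optional


class ModeReader:
    """Hands back one time's worth of data per call from buffered varlens."""

    def __init__(
        self, load_varlen: Callable[[int], Optional[List[List[float]]]]
    ) -> None:
        # `load_varlen` is a hypothetical callback that would read one
        # whole variable-length array from the file into memory.
        self._load = load_varlen
        self._index = 0                     # next varlen to load
        self._rows: List[List[float]] = []  # rows of the varlen in memory
        self._pos = 0                       # next row to hand back

    def read_next(self) -> Optional[List[float]]:
        # Refill the buffer whenever it is exhausted; skip empty varlens.
        while self._pos >= len(self._rows):
            rows = self._load(self._index)
            if rows is None:
                return None  # no more data in the file
            self._index += 1
            self._rows, self._pos = rows, 0
        row = self._rows[self._pos]
        self._pos += 1
        return row


# Stand-in for the file: two stored varlens (a 30-height mode, then a 60-height mode).
stored = [[[0.0] * 30, [1.0] * 30], [[2.0] * 60]]
reader = ModeReader(lambda i: stored[i] if i < len(stored) else None)

lengths = []
row = reader.read_next()
while row is not None:
    lengths.append(len(row))
    row = reader.read_next()
```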

> 1) I have an instrument that is acquiring data.
> 2) This instrument can acquire data with different sets of parameters
> (modes)
> 3) The instrument can switch modes at anytime
> 4) The NetCDF data file can hold data acquired from multiple modes
> 5) When the file is first created, it is not known which modes will be
> included in the file.
> 
> For example, let's say there are 10 possible modes. Mode 1 acquires data
> from 10 heights. Thus there is a parameter "NumberOfHeights = 10" along
> with "ModeName = Mode_1". Mode 2 acquires data for 20 heights. Mode
> three acquires data for 30 heights.... Mode 10 acquires data for 100
> heights. It is my understanding that when I create the NetCDF file, I
> have to define all these modes before writing data to the file. Thus
> there would be 10 modes defining 'NumberOfHeights' and 'ModeName' along
> with space to write 100 heights for each mode. But let's say that the
> file holds only 1 hour's worth of data and ends up including only modes
> 1 and 2. If I knew this in the beginning, I would not include values for
> 'NumberOfModes' and 'ModeName' for modes 3 through 10 and space would
> not be reserved for all the extra heights that would not be acquired.
> This makes the file more compact and accurate.

Defining a user-defined type does not allocate any space in the file
for data of that type.  Thus defining types that you don't use incurs
very little space penalty.  In your example above, you could define
all 10 user-defined types, only write data for modes 1 and 2, and no
extra padding would be included for data you didn't write for modes 3
through 10.  The space taken for a data type definition is relatively
small, proportional to the number of characters it takes to define the
type.  As explained below, though, you don't have to define all the
types at the beginning.

> Ideally, what I would like to do is open the file and define information
> for the first mode. Then write the data for that mode. I may continue to
> write data for that mode for some time, but when the mode changes I
> could define the information for the new mode and add data from that
> mode. For example, the file might be created with information for mode 3
> and data for 30 heights is written for the next 10 minutes. Then the
> instrument switches to mode 6. At this point, information about mode 6
> is added to the file and the number of heights is expanded to hold 60
> heights. Then data from mode 6 is added. The instrument may then switch
> back to mode 3 for awhile and then to mode 4. At this point information
> for mode 4 is added but the number of heights does not need to be
> expanded. So data from mode 4 can be added.

In netCDF-4, you don't have to define all the types at the beginning.
You can define a type, write some data of that type, then later define
another type before using it to write data.  There is no penalty or
copying of data needed if you add new types, dimensions, variables,
attributes, or groups after writing some data.  This differs from
netCDF-3, where changing the file schema after writing data can be
expensive because the data may need to be copied.

> Is this possible? Is there a clean, easy way to do this? Any help would
> be greatly appreciated.

When I get time, I'll send an example of CDL that's my interpretation
of the kind of data structure you might use, from your description
above.  I'm not very familiar with this kind of observational data, so
I may have gotten some of the structure wrong, but this may give you a
vague idea of one way to structure the data, or it may indicate a
misunderstanding on my part that you could clarify.

--Russ


Russ Rew                                         UCAR Unidata Program
address@hidden                     http://www.unidata.ucar.edu



Ticket Details
===================
Ticket ID: TLB-677315
Department: Support netCDF
Priority: Normal
Status: Closed