Re: CDF, netCDF and HDF Note from you attached below

NOTE: The netcdf-hdf mailing list is no longer active. The list archives are made available for historical reasons.

To: hdf-netcdf@xxxxxxxxxxxxx, lloydt@xxxxxxxxxxxxxx
Subject: Re: CDF, netCDF and HDF Note from you attached below
From: "Glenn P. Davis" <davis@xxxxxxxxxxxxxxxx>
Date: Mon, 18 May 1992 10:32:45 -0600

> From owner-netcdf-hdf@xxxxxxxxxxxxxxxx Mon May 18 07:18:55 1992
> Date: Mon, 18 May 1992 08:50:13 EDT
> From: "Lloyd A. Treinish" <lloydt@xxxxxxxxxxxxxx>
> To: hdf-netcdf@xxxxxxxxxxxxx
> Subject: Re:  CDF, netCDF and HDF
> Subject: Note from you attached below
> 
> Thanks Glenn for the clarification.  This addresses one of the reasons I
> made the posting to the mailgroup -- to get current status on implementations
> and current thinking from the actual developers.
> 
> I assume by changing shape you mean NOT considering the unlimited dimension.

Correct. The data which varies by the unlimited dimension is at the end of
the file, so it can grow as needed. The constraint is that the "header" info
(attributes and such) and the data which does not vary according to the
unlimited dimension both appear earlier in the file, so if they change size,
stuff needs to get copied.
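
To make the layout constraint concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not the real netCDF file format: the function `layout`, its parameters, and all byte sizes are hypothetical, chosen only to show which offsets move when something changes size.

```python
# Toy model of the layout described above (NOT the real netCDF format).
# The function name, parameters, and sizes are illustrative assumptions.

def layout(header_size, fixed_sizes, record_size, num_records):
    """Return byte offsets for each section of a hypothetical file.

    header_size -- bytes of "header" info (attributes and such)
    fixed_sizes -- sizes of variables that do not use the unlimited dimension
    record_size -- bytes in one record along the unlimited dimension
    num_records -- number of records written so far
    """
    offsets = {"header": 0}
    pos = header_size
    fixed = []
    for size in fixed_sizes:
        fixed.append(pos)      # each fixed-size variable follows the previous one
        pos += size
    offsets["fixed"] = fixed
    offsets["records_start"] = pos   # record data sits at the end of the file
    offsets["end"] = pos + record_size * num_records
    return offsets


before = layout(header_size=100, fixed_sizes=[40, 60], record_size=8, num_records=3)

# Appending a record only moves the end of the file; nothing earlier shifts.
appended = layout(header_size=100, fixed_sizes=[40, 60], record_size=8, num_records=4)

# Growing the header shifts every later offset, so existing data must be copied.
grown = layout(header_size=120, fixed_sizes=[40, 60], record_size=8, num_records=3)
```

In this model, `appended` has the same `fixed` and `records_start` offsets as `before`, while every offset after the header differs in `grown`.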

> Obviously, that dimension could be viewed as being part of the definition of
> shape for a variable.  It would appear that netCDF allows one to change the
> shape along that dimensional axis by adding instances of variables for the
> rest of the shape definition without copying.

True.

> On the other hand, deleting an
> instance (i.e., a record in the conceptual equivalent in the CDF parlance) of
> a variable would also change the shape. Is this supported in netCDF without
> copying?  If so, is it done by just tagging the offending element?  Is any
> space compression/garbage collection done after repeated such operations
> because of the potential of wasted space? 

Strictly speaking, netCDF does not allow the deletion of data.
You can only change the value of data.
Space is allocated in the file system as you leave define mode.
Data which has not yet been "written" has a special
"Fill Value" in its storage as a placeholder.

Of course, space is not allocated up front for data which varies according to
the unlimited dimension. Suppose that the maximum record that has been written
is M (initially M = 0). File system space is allocated (and prefilled with
the Fill Value) for records M to N (N > M) when record N is written for the
first time.
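
The allocate-and-prefill behavior can be sketched in a few lines of Python. This is a toy in-memory model, not the netCDF implementation: the class `RecordStore`, the 0-based record numbering, and the `FILL_VALUE` constant are all assumptions made for illustration.

```python
# Toy in-memory model of record allocation with prefill.
# FILL_VALUE is a hypothetical placeholder, not a real netCDF default.
FILL_VALUE = -999


class RecordStore:
    """Records along the unlimited dimension, allocated on first write."""

    def __init__(self):
        self.records = []  # nothing is allocated until a record is written

    def write(self, n, value):
        # Writing past the current end allocates every record up to n
        # and prefills the new slots with the fill value.
        if n >= len(self.records):
            missing = n + 1 - len(self.records)
            self.records.extend([FILL_VALUE] * missing)
        self.records[n] = value

    def read(self, n):
        return self.records[n]


store = RecordStore()
store.write(0, 10)
store.write(4, 50)  # records 1..3 are allocated and hold FILL_VALUE
```

Note that records 1 through 3 take up space even though nothing was ever written to them.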


Note that none of this behavior is specified by the interface.
A "smarter" implementation could use some sort of linked list storage
on disk (VSETs?) which only contains data that has actually been written.
When a read request came in, it would have to do more complicated seeks
and table manipulation, but this is definitely doable. Indeed, this
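
A rough Python sketch of the "store only what was written" idea follows. It swaps in a dictionary index for the linked list mentioned above, purely for brevity, and it lives in memory rather than on disk; `SparseRecordStore` and `FILL_VALUE` are hypothetical names, not part of any netCDF or HDF interface.

```python
# Toy sparse store: only written records occupy storage.
# A dict index stands in for the on-disk linked list / table (an assumption).
FILL_VALUE = -999  # hypothetical placeholder value


class SparseRecordStore:
    def __init__(self):
        self.index = {}  # record number -> position in self.data
        self.data = []   # holds only values that were actually written

    def write(self, n, value):
        if n in self.index:
            self.data[self.index[n]] = value   # overwrite in place
        else:
            self.index[n] = len(self.data)     # append and remember where
            self.data.append(value)

    def read(self, n):
        # The extra table lookup is the cost of not preallocating.
        pos = self.index.get(n)
        return FILL_VALUE if pos is None else self.data[pos]


sparse = SparseRecordStore()
sparse.write(1000, 7)
sparse.write(2, 3)
```

Here two writes consume two slots regardless of the record numbers, and reading an unwritten record still yields the fill value.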

> because of the potential of wasted space?  Clearly, a netCDF copy operation
> would take care of that, if required.  For large data sets, this could be an
> expensive operation.

You can enter (re)define mode and delete a whole variable. This generally
involves a copy.

-glenn
