
Re: 19990711: extra header space



>To: address@hidden
>From: Michael Nolta <address@hidden>
>Subject: extra header space
>Organization: Princeton
>Keywords: 199907110517.XAA12464 netCDF

Hi Michael,

You wrote:

> I'm having a little trouble creating extra space in the header using the
> '__' functions. The little test program I wrote isn't behaving as I
> expected. It creates a file with an integer array, fills the array, and
> then reenters define mode to add a "title" global attribute. Looking at
> the output file, it seems that adding the "title" attribute causes the
> "x" variable data to be rewritten, overwriting the extra header space in
> the process. What am I doing wrong?
 ...
> Perhaps a few crude diagrams will help.
> If I comment out the section adding the "title" attribute, I get a file
> that looks like this (not to scale ;-):
> 
>       **000000000000000000abcdefghijklmnopqrstuvwxyz
> 
> where * is the header information, 0 is blank, and abc... is the data.
> Writing "title" gives me a file that looks like this:
> 
>       ********abcdefghijklmnopqrstuvwxyzopqrstuvwxyz
> 
> What I expected to get was:
> 
>       ********000000000000abcdefghijklmnopqrstuvwxyz
> 
> i.e., "title" is written into the blank space and the data is left
> untouched. What appears to happen is that the header expands and then the
> data is written a second time immediately preceding the (non-blank) header
> info.
> 
> Basically, what I'm trying to do is add some room for expansion to the
> header, so I can add to it later and avoid copying the data section.

I've appended a patch for netcdf/libsrc/nc.c that I hope will fix the
problem.  I'm still testing it and will then roll it into a beta-test
version of netCDF 3.5, which we will announce soon.  I'm grateful for
your bug report, because I think it uncovered a significant potential
performance problem with netCDF.

In netCDF 3.4, calling nc_close() or nc_enddef() would remove any extra
space in the header, causing all the data values to be moved up
(towards the beginning of the file) if the space needed for the file
header had decreased, either due to deleting an attribute or renaming
something with a shorter name.  This was not the intended behavior and
could be very costly for large datasets.  With the appended patch,
extra space in the header persists and the size of the header will
never shrink as the result of netCDF function calls.  Hence space
"reserved" by calling the nc__enddef() function, or by defining a dummy
variable with a long name and later shortening it, will be preserved
for later efficient additions of new dimensions, variables, or
attributes.  There is still no way, however, to reserve extra space for
the future addition of a record variable; adding a new record variable
will always incur the cost of copying the data.
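
As a rough illustration of the rule described above, here is a small
stand-alone sketch of the patched begin_var decision.  It is not the
library code: round_up() and place_begin_var() are hypothetical names,
round_up() is only assumed to behave like the D_RNDUP macro used in the
patch, and plain long long offsets stand in for off_t.

```c
#include <assert.h>

/* Round x up to the next multiple of align.
   Assumed to play the role of D_RNDUP in the patch. */
static long long round_up(long long x, long long align)
{
    assert(align > 0);
    return ((x + align - 1) / align) * align;
}

/* Hypothetical model of the patched logic.
   begin_var : current offset of the first non-record variable
               (0 if the file has not been laid out yet)
   xsz       : bytes the header needs right now
   h_minfree : free header space requested for this call
   v_align   : requested alignment of the data section
   Returns the offset at which the non-record data should start. */
static long long place_begin_var(long long begin_var, long long xsz,
                                 long long h_minfree, long long v_align)
{
    /* Recalculate only if the header no longer fits (with the requested
       free space) or the current offset is not aligned as requested. */
    if (begin_var < xsz + h_minfree ||
        begin_var != round_up(begin_var, v_align)) {
        begin_var = round_up(xsz, v_align);
        if (begin_var < xsz + h_minfree)
            begin_var = round_up(xsz + h_minfree, v_align);
    }
    return begin_var;
}
```

For instance, place_begin_var(0, 80, 1024, 4) gives 1104.  Passing that
1104 back in with a header that has grown to 120 bytes and an h_minfree
of 0 gives 1104 again, so in this model the data would not need to
move.  Only when the header outgrows the reserved space (say xsz = 2000)
does the offset change.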

At your convenience, please try out the patch and let me know if it
doesn't fix the problem, or if you see any other problems with this.

Thanks!

--Russ

_____________________________________________________________________

Russ Rew                                         UCAR Unidata Program
address@hidden                     http://www.unidata.ucar.edu



diff -c nc.c~ nc.c
*** nc.c~       Fri Jul 16 15:29:05 1999
--- nc.c        Fri Jul 16 15:28:39 1999
***************
*** 205,215 ****
        if(ncp->vars.nelems == 0) 
                return;
  
!       index = (off_t) ncp->xsz;
!       ncp->begin_var = D_RNDUP(index, v_align);
!       if(ncp->begin_var - index < h_minfree)
        {
!               ncp->begin_var = D_RNDUP(index + (off_t)h_minfree, v_align);
        }
        index = ncp->begin_var;
  
--- 205,221 ----
        if(ncp->vars.nelems == 0) 
                return;
  
!       /* only (re)calculate begin_var if there is not sufficient space in header
!          or start of non-record variables is not aligned as requested by valign */
!       if (ncp->begin_var < ncp->xsz + h_minfree ||
!           ncp->begin_var != D_RNDUP(ncp->begin_var, v_align) ) 
        {
!         index = (off_t) ncp->xsz;
!         ncp->begin_var = D_RNDUP(index, v_align);
!         if(ncp->begin_var < index + h_minfree)
!         {
!           ncp->begin_var = D_RNDUP(index + (off_t)h_minfree, v_align);
!         }
        }
        index = ncp->begin_var;
  
***************
*** 229,238 ****
                index += (*vpp)->len;
        }
  
!       ncp->begin_rec = D_RNDUP(index, r_align);
!       if(ncp->begin_rec - index < v_minfree)
        {
!               ncp->begin_rec = D_RNDUP(index + (off_t)v_minfree, r_align);
        }
        index = ncp->begin_rec;
  
--- 235,251 ----
                index += (*vpp)->len;
        }
  
!       /* only (re)calculate begin_rec if there is not sufficient
!          space at end of non-record variables or if start of record
!          variables is not aligned as requested by r_align */
!       if (ncp->begin_rec < index + v_minfree ||
!           ncp->begin_rec != D_RNDUP(ncp->begin_rec, r_align) )
        {
!         ncp->begin_rec = D_RNDUP(index, r_align);
!         if(ncp->begin_rec < index + v_minfree)
!         {
!           ncp->begin_rec = D_RNDUP(index + (off_t)v_minfree, r_align);
!         }
        }
        index = ncp->begin_rec;