Re: 950531: ncrename performance



>Organization: NCAR/CGD
>Keywords: 199506010529.AA07402

Hi Charlie,

> i'm trying to write the ncrename netcdf operator. i'm not sure
> how the various variable, dimension, and attribute renaming
> functions actually work so i'd like some advice. 
> 
> let's assume the new names are longer than the old names, the
> worst case scenario. let's also assume that i want to rename
> more than one thing in the input file. finally, let's assume
> all the old and new names are valid. i see two ways of
> proceeding:
> 
> 1. run through the input file dimension by dimension, then variable by
> variable and attribute by attribute within each variable and copy all the
> metadata to an output file in define mode, changing the names of the
> affected stuff as you go. then quit define mode and copy all the data.

This is possible, but a bit harder than it sounds. A variable may be too
large for you to malloc enough memory to read it all in at once, and a
variable may have up to MAX_NC_DIM (currently 32) dimensions.  We've solved
these problems with something we call "the dreaded and recurring odometer
code".  It's used, with slight variations, in the current netCDF operators
package: in the myvarcpy() function in the ncopers/lib/var.fc file and in
the ncvarcpy() function in the ncopers/ncstat/ncvarcpy.c file.  These took
quite a bit of time to get right, so it might be worth reusing one of them
if you take this approach.
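
To give a rough idea of what "odometer" means here, the following is a
minimal sketch written for illustration only. It is not the myvarcpy() or
ncvarcpy() code. The names odometer_next(), odometer_count() and
SKETCH_MAX_DIMS are hypothetical. The sketch only steps a multidimensional
index; a real copier would presumably read one slab per step with ncvarget()
and write it with ncvarput(), choosing slabs small enough to fit in memory.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_DIMS 32   /* hypothetical bound, mirrors the 32 above */

/*
 * Advance idx[] like an odometer: the last (fastest-varying) dimension
 * ticks first, and a dimension that reaches its length resets to 0 and
 * carries into the dimension to its left.
 * Returns 1 if idx[] names another valid element, 0 after the last one.
 */
int odometer_next(size_t ndims, const long *shape, long *idx)
{
    size_t d = ndims;
    while (d > 0) {
        d--;
        assert(shape[d] > 0);
        if (++idx[d] < shape[d])
            return 1;        /* no carry needed */
        idx[d] = 0;          /* wrap this digit and carry left */
    }
    return 0;                /* every digit wrapped: traversal is done */
}

/* Visit every index of the given shape once and count the visits. */
long odometer_count(size_t ndims, const long *shape)
{
    long idx[SKETCH_MAX_DIMS] = {0};
    long visited = 0;

    assert(ndims <= SKETCH_MAX_DIMS);
    do {
        visited++;           /* a copier would transfer one slab here */
    } while (odometer_next(ndims, shape, idx));
    return visited;
}
```

Because the number of dimensions is only known at run time, a fixed nest of
for loops can't be used, which is why the index is stepped this way.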

> 2. copy the input file to the output file intact (using, say,
> the UNIX copy command). then search the output file 
> dimension by dimension, then variable by variable and
> attribute by attribute within each variable and rename
> the affected stuff as you go. then you're done without ever leaving
> define mode.

This would certainly be easier, and I think it would be considerably more
efficient, since the data wouldn't have to be converted from XDR to native
form and back to XDR by the ncvarget() and ncvarput() calls.

> i'm not sure which method should give better performance
> under the worst case scenario i've outlined, when all is said and
> done. it would help to know if the rename changes are manifested
> after each rename call, (say, ncdimrename) or if netcdf
> accumulates the rename info. and changes everything all at once when you
> leave define mode or exit the program. i hope it's the latter,
> but i'm not sure. i want to avoid excessive disk activity in the
> worst case scenario. any suggestions you have are appreciated.

The rename changes made in define mode are accumulated until you leave
define mode or close the file.  Note that you *can't* just exit the program
without closing the file: ncclose() must be called on files open for
writing, and exit() doesn't call it for you (unless you've registered the
ncclose() calls through atexit(3)).
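
As an illustration of the atexit(3) idea, here is a minimal sketch under
some assumptions. The names track_open(), close_all_open(), close_dataset(),
open_count(), closed_count() and MAX_OPEN are hypothetical, and
close_dataset() is only a stand-in for the place where a real program would
call ncclose() on the id.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_OPEN 8           /* hypothetical limit for this sketch */

static int open_ids[MAX_OPEN];
static int n_open = 0;
static int n_closed = 0;

/* Stand-in for ncclose(id); it only counts how often it was called. */
int close_dataset(int id)
{
    (void)id;
    n_closed++;
    return 0;
}

/* Close every tracked id. Safe to call twice: the list is emptied. */
void close_all_open(void)
{
    while (n_open > 0)
        close_dataset(open_ids[--n_open]);
}

/* Remember an open id; register the exit handler on first use. */
int track_open(int id)
{
    static int registered = 0;

    if (n_open >= MAX_OPEN)
        return -1;
    if (!registered) {
        if (atexit(close_all_open) != 0)
            return -1;       /* handler could not be registered */
        registered = 1;
    }
    open_ids[n_open++] = id;
    return 0;
}

int open_count(void)   { return n_open; }
int closed_count(void) { return n_closed; }
```

Handlers registered with atexit() run on exit() or on return from main(),
but not on abort() or _exit(), so an explicit close is still the safer
habit.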

--Russ

______________________________________________________________________________

Russ Rew                                           UCAR Unidata Program
address@hidden                              http://www.unidata.ucar.edu