
Re: 970916: changing value of units attribute



>To: address@hidden
>From: James Boyle <address@hidden>
>Subject: changing value of units attribute
>Organization: .
>Keywords: 199709161736.LAA11195

Hi Jim,

[By the way, are you the same James Boyle who is one of the authors of
 EISPACK?  If so, Hi again.  If not, well, never mind ...]

> I have a number of very large files with dimension 'lev' which
> has attribute of units:
>   lev:units="level"
> 
> I want to change this to be:
> 
>   lev:units="hybrid_sigma_pressure"
> 
> Is there any easy way to do this? 

There is an easy but inefficient way: write a program that opens the
netCDF file for writing, puts it in define mode, rewrites the attribute,
and then closes the file.  It's inefficient because the new value needs
more space than the old one, so the header has to grow, and the library
will essentially copy the data in the file to make room for the expanded
attribute when you leave define mode.

> Doing an ncdump and editing the CDL is not a viable
> option, the files are too large.
> This would appear to be a common enough situation that a
> utility should be available to perform this operation.

There are utilities that permit you to overwrite attributes; for
example, see ncks in the NCO package described in "Software for
Manipulating or Displaying NetCDF Data" at

    http://www.unidata.ucar.edu/packages/netcdf/software.html

Unfortunately, these share the same inefficiency of copying all the
data to a new file.  This is a basic limitation of the current netCDF
data model: schema information stored in the "header" cannot expand
without copying the data.  Here's a relevant excerpt from the User's
Guide:

    This header has no usable extra space; it is only as large as it
    needs to be for the dimensions, variables, and attributes (including
    all the attribute values) in the netCDF dataset.  This has the
    advantage that netCDF files are compact, requiring very little
    overhead to store the ancillary data that makes the datasets
    self-describing.  A disadvantage of this organization is that any
    operation on a netCDF dataset that requires the header to grow (or,
    less likely, to shrink), for example adding new dimensions or new
    variables, requires moving the data by copying it.  This expense is
    incurred when nc_enddef is called, after a previous call to
    nc_redef.  If you create all necessary dimensions, variables, and
    attributes before writing data, and avoid later additions and
    renamings of netCDF components that require more space in the header
    part of the file, you avoid the cost associated with later changing
    the header.
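
To make the header growth concrete for this case, the following sketch
estimates how many bytes a text attribute occupies in the header.  It
assumes the classic file format layout (4-byte counts, with names and
values padded to 4-byte boundaries); the function names are
hypothetical and the arithmetic is only an illustration.

```c
#include <stddef.h>
#include <string.h>

/* Round n up to the next multiple of 4 (classic-format padding). */
static size_t pad4(size_t n) { return (n + 3) & ~(size_t)3; }

/* Bytes one text attribute occupies in a classic-format header:
 *   name length (4) + padded name + type tag (4)
 *   + value count (4) + padded value */
size_t text_att_bytes(const char *name, const char *value)
{
    return 4 + pad4(strlen(name)) + 4 + 4 + pad4(strlen(value));
}

/* Bytes the header grows by when the attribute value is replaced. */
long header_growth(const char *name, const char *old_value,
                   const char *new_value)
{
    return (long)text_att_bytes(name, new_value)
         - (long)text_att_bytes(name, old_value);
}
```

Under these assumptions, lev:units takes 28 bytes with the value
"level" and 44 bytes with "hybrid_sigma_pressure", so the header would
have to grow by 16 bytes, and all the data after it has to move.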

Our current prototype Java interface makes this cost more obvious by a
simplification of the netCDF data model, making attributes immutable.
This means changing an attribute value will require creating a new
dataset.  One of the arguments in favor of this simplification is that
we've seen very few other examples of anyone changing netCDF attributes
after they are defined.

--Russ

_____________________________________________________________________

Russ Rew                                         UCAR Unidata Program
address@hidden                     http://www.unidata.ucar.edu