NetCDF operators (NCO) version 4.4.8

Version 4.4.8 of the netCDF Operators (NCO) has been released. NCO is an open source package of a dozen standalone command-line programs that take netCDF files as input, operate on them (e.g., derive new data, average, print, hyperslab, manipulate metadata), and write the results to the screen or to files in text, binary, or netCDF formats.

The NCO project is coordinated by Professor Charlie Zender of the Department of Earth System Science, University of California, Irvine. More information about the project, along with binary and source downloads, is available on the SourceForge project page.

From the release message:

NCO now implements a lossy compression feature distinct from the packing (scale_factor+add_offset) that NCO has long supported. The new feature is activated by specifying the desired level of precision in terms of either the total number of significant digits or the number of significant digits after (or before) the decimal point. These precision features are lumped together under the generic name Precision-Preserving Compression (PPC), summarized below.
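To make the two precision types concrete, here is a minimal Python sketch of the idea. It is not NCO's code: the function names `quantize_dsd` and `quantize_nsd` are hypothetical, the DSD version uses plain decimal rounding, and the NSD version assumes roughly log2(10) mantissa bits per decimal digit plus one guard bit, which may differ from what NCO actually keeps.

```python
import math
import struct

FLOAT32_MANTISSA_BITS = 23  # explicit mantissa bits in IEEE-754 single precision


def quantize_dsd(value, dsd):
    """Round to `dsd` digits after the decimal point (negative dsd: before it)."""
    scale = 10.0 ** dsd
    return round(value * scale) / scale


def quantize_nsd(value, nsd):
    """Zero the trailing float32 mantissa bits not needed for `nsd` digits."""
    # Assumption: about log2(10) bits per decimal digit, plus one guard bit.
    keep_bits = min(FLOAT32_MANTISSA_BITS, math.ceil(nsd * math.log2(10)) + 1)
    drop_bits = FLOAT32_MANTISSA_BITS - keep_bits
    mask = (0xFFFFFFFF << drop_bits) & 0xFFFFFFFF
    # Reinterpret the float32 as an unsigned integer, mask it, and convert back.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    (quantized,) = struct.unpack("<f", struct.pack("<I", bits & mask))
    return quantized


print(quantize_dsd(3.14159, 3))      # nearest thousandth
print(quantize_dsd(1234.5678, -2))   # nearest hundred
print(quantize_nsd(3.14159265, 3))   # about 3 significant digits survive
```

In both cases the result is still an ordinary IEEE float, which is why downstream software needs no special support to read it.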

Specifying more reasonable and optimized chunking maps has been made easier by the addition of a new "best practices" policy which implements Rew's balanced chunking for three-dimensional variables, and LeFter-Product (lfp) chunking for all others.

New ncwa/ncra/nces arithmetic operators mabs(), mebs(), and mibs() simplify statistical analysis.
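As a rough illustration of what these three operations compute, the following Python sketch applies them to a plain list of numbers. The function bodies are an assumption based only on the description in this announcement (maximum, mean, and minimum of absolute values); NCO itself also deals with weights, missing values, and multidimensional reductions, none of which are modeled here.

```python
def mabs(values):
    """Maximum absolute value."""
    return max(abs(v) for v in values)


def mebs(values):
    """Mean absolute value."""
    magnitudes = [abs(v) for v in values]
    return sum(magnitudes) / len(magnitudes)


def mibs(values):
    """Minimum absolute value."""
    return min(abs(v) for v in values)


sample = [-3.0, 1.0, 2.0]
print(mabs(sample), mebs(sample), mibs(sample))  # prints 3.0 2.0 1.0
```

Note that the plain maximum of this sample is 2.0, while mabs() reports 3.0, which is the difference that makes these operators useful for error statistics.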

New Features
  1. NCO will now store data at a per-variable precision level. We call this Precision-Preserving Compression (PPC). PPC currently understands two types of precision. Users can specify either the total Number of Significant Digits (NSD) or the Decimal Significant Digits (DSD), meaning the number of significant digits after (or before) the decimal point. For example, NSD=5 tells NCO to retain 5 significant digits. Specifying DSD=3 or DSD=-2 causes NCO to preserve the number rounded to the nearest thousandth or hundred, respectively.

    Under the hood, NSD uses bitmasking for quantization, while DSD utilizes rounding. The bitmasking/rounding results in consecutive zero-bits ending the IEEE-754 storage of each floating point number. Standard byte-stream compression techniques, such as the DEFLATE compression used by gzip (and in HDF5), compress these zero-bits more efficiently than unrounded numbers. The net result is PPC makes netCDF files skinnier when compressed. Compression is internal with netCDF4 and external (e.g., gzip or bzip2) with netCDF3. Space savings can be large.

    And face it, how often does your precision exceed 3 digits? And don't worry, coordinate variables are not rounded :) An advantage of PPC is that, unlike packing, it needs no explicit support in other software because data stays in IEEE format. Thanks to Rich Signell for suggesting DSD compression for NCO.

    ncks --ppc default=5 --ppc temperature=3
    ncks --ppc AER.?,AOD.?,ARE.?,AW.?,BURDEN.?=3
    ncpdq --ppc default=4 --ppc grid_area=15

    The NCO User Guide has extensive documentation on PPC.
  2. New "nco" chunking policy and modified "rew" chunking map: Policy "nco" is a virtual option that implements the best (in the subjective opinion of the authors) policy and map for typical usage. This combination will evolve with time. As of NCO version 4.4.8, this virtual policy implements map_rew for 3-D variables and map_lfp for all other variables. For the time being, map_rew does the same, i.e., it also calls map_lfp when variables are not 3-D. This ensures that Rew's balanced chunking is used on variables for which it applies, and another sensible default (lfp = Lefter Product) is used on all other variables big enough to chunk.

    ncks --cnk_plc=nco
    ncks --cnk_map=rew
  3. NCO dimension-reducing operators (ncra, ncwa, nces) now support three new arithmetic operations to facilitate statistics: mabs(), mebs(), and mibs(). These compute the maximum, mean, and minimum absolute value, respectively. They are invoked with the -y or --op_typ switch in the same manner as max/min/avg:

    ncwa -y mabs # Maximum absolute value
    ncra -y mebs # Mean absolute value
    nces -y mibs # Minimum absolute value
  4. NCO warns when appended output type differs from input type. Previously NCO would not warn or die when the user (usually inadvertently) wrote data of one type into a destination meant for a different type. These commands would therefore complete without warning:

    ncks -C -O -v double_var ~/nco/data/ ~/
    ncrename -O -v double_var,float_var ~/
    ncks -C -A -v float_var ~/nco/data/ ~/

    Now the user is warned though the operation is still permitted.
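The claim in item 1 that trailing zero-bits help DEFLATE can be checked with a small experiment. The sketch below is an illustration under stated assumptions, not a reproduction of NCO: it uses Python's `zlib` module as the DEFLATE implementation, synthetic random values in a temperature-like range, and a hypothetical choice of 12 masked mantissa bits.

```python
import random
import struct
import zlib

DROP_BITS = 12  # hypothetical number of trailing mantissa bits to zero
MASK = (0xFFFFFFFF << DROP_BITS) & 0xFFFFFFFF


def pack_floats(values, mask=0xFFFFFFFF):
    """Serialize values as little-endian float32, applying a bit mask to each."""
    out = bytearray()
    for v in values:
        (bits,) = struct.unpack("<I", struct.pack("<f", v))
        out += struct.pack("<I", bits & mask)
    return bytes(out)


random.seed(0)  # fixed seed so repeated runs see the same synthetic data
samples = [random.uniform(250.0, 310.0) for _ in range(10000)]

raw_size = len(zlib.compress(pack_floats(samples)))
masked_size = len(zlib.compress(pack_floats(samples, MASK)))
print("compressed bytes, full precision:", raw_size)
print("compressed bytes, masked:", masked_size)
```

Both buffers are 40000 bytes before compression; only the masked one contains the runs of zero-bits that DEFLATE can exploit, so its compressed size comes out smaller. Actual savings on real netCDF data depend on the variable and the precision chosen.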

Additional details are available in the ChangeLog.


News and information from the Unidata Program Center