NetCDF operators (NCO) version 5.1.1

Version 5.1.1 of the netCDF Operators (NCO) has been released. NCO is an Open Source package that consists of a dozen standalone, command-line programs that take netCDF files as input, then operate (e.g., derive new data, average, print, hyperslab, manipulate metadata) and output the results to screen or files in text, binary, or netCDF formats.

The NCO project is coordinated by Professor Charlie Zender of the Department of Earth System Science, University of California, Irvine. More information about the project, along with binary and source downloads, is available on the SourceForge project page.

From the release message:

Version 5.1.1 adds features for NCZarr, regridding, and interpolation. All operators now support NCZarr I/O and input filenames via stdin. ncremap supports two new vertical extrapolation methods, 1D files, and allows flexible masking based on external fields such as sub-gridscale extent. ncclimo outputs regional averages. Numerous minor fixes improve codec support and regridding control. All users are encouraged to upgrade to this feature-rich release.

New Features
  1. All operators now support specifying input files via stdin. This capability was implemented with NCZarr in mind, though it can also be used with traditional POSIX files. The ncap2, ncks, ncrename, and ncatted operators accept one or two filenames as positional arguments. If the input file is provided via stdin, then the output file, if any, must be specified with -o so the operators know whether to check stdin. Multi-file operators (ncra, ncrcat, ncecat) will continue to identify the last positional argument as the output file unless -o is used. The best practice is to use -o fl_out to specify output filenames when stdin is used for input filenames:
    echo | ncks
    echo | ncks -o
    echo "" | ncbo -o
    echo "" | ncflint -o
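    The stdin convention described above can be sketched as follows. This is a minimal illustration, not part of NCO: the helper name list_inputs, the pattern in*.nc, and the output name out.nc are all assumptions chosen for the example. The NCO call is guarded so the sketch does nothing when ncra is not installed or no inputs match.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of NCO): print the files in a directory that
# match a name pattern, one per line, sorted by name.
list_inputs() {
    local dir="$1" pattern="$2"
    find "$dir" -maxdepth 1 -type f -name "$pattern" | sort
}

# Run ncra only when it is installed and at least one input file matches.
if command -v ncra >/dev/null 2>&1 && [ -n "$(list_inputs . 'in*.nc')" ]; then
    # Input filenames arrive on stdin, so the output file is named with -o
    # (out.nc is an illustrative name).
    list_inputs . 'in*.nc' | ncra -o out.nc
fi
```

    Naming the output with -o keeps the command unambiguous: every positional slot is free, and the operator knows to read its inputs from stdin.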
  2. All NCO operators support NCZarr I/O. This support is currently limited to the "file://" scheme. Support for the S3 scheme is next. All NCO commands should work as expected independent of the back-end storage format of the I/O. Operators can ingest and output POSIX, Zarr, or a mixture of these two file formats.
    ncks ${in_ncz} # Print contents of Zarr file
    ncks -O -v var ${in_psx} ${out_psx} # POSIX input to POSIX output
    ncks -O -v var ${in_psx} ${out_ncz} # POSIX input to Zarr output
    ncks -O -v var ${in_ncz} ${out_psx} # Zarr input to POSIX output
    ncks -O -v var ${in_ncz} ${out_ncz} # Zarr input to Zarr output
    ncks -O --cmp='gbr|shf|zst' ${in_psx} ${out_ncz} # Quantize/Compress
    ncks -O --cmp='gbr|shf|zst' ${in_ncz} ${out_ncz} # Quantize/Compress
    Commands with Zarr I/O behave mostly as expected. NCO treats Zarr and POSIX files identically once they are "opened" via the netCDF API. Hence the main difference between Zarr and POSIX, from the viewpoint of NCO, is in handling the filenames. By default NCO performs operations in temporary files that it moves to a final destination once the rest of the command succeeds. Supporting Zarr in NCO therefore means applying the correct procedures to create, copy, move/rename, and delete files and directories for each backend format.

    Many NCO users rely on POSIX filename globbing for multi-file operations, e.g., 'ncra in*.nc'. POSIX globbing returns matches in POSIX format (e.g., '') which lacks the "scheme://" indicator and the "#mode=..." fragment that the netCDF API needs to open a Zarr store. There is no perfect solution to this.

    A partial solution is to use NCO's new stdin capability. Use the 'ls' command (instead of globbing) to identify the desired Zarr stores, pipe the (POSIX-style) results through the newly supplied NCO filter-script, which prepends the desired scheme and appends the desired fragment to each matched Zarr store, and then pipe those results to the NCO operator:
    ncra in*.nc      # POSIX input files via globbing
    ls in*.nc | ncra # POSIX input files via stdin
    ls in*.nc | ncz2psx | ncra # Zarr input via stdin
    ls in*.nc | ncz2psx --scheme=file --mode=nczarr,file | ncra
    Thanks to Dennis Heimbigner of Unidata for implementing NCZarr.
  3. Since 2019 the --glb_avg switch has caused the ncclimo splitter to output global-mean timeseries files. This switch, now also available under the name --rgn_avg, causes the splitter to output three horizontally averaged timeseries. First is the global average (as before), next is the northern hemisphere average, followed by the southern hemisphere average. The three timeseries are saved in a two-dimensional (time by region) array with a "region dimension" named rgn. Region names are stored in the variable named region_name:
    ncclimo --split --rgn_avg # Produce regional and global averages
    ncclimo --split --glb_avg # Same (deprecated switch name)
    Thanks to Chris Golaz of LLNL for suggesting this feature.
  4. ncremap has long been able to re-normalize and/or mask-out fields in partially unmapped destination gridcells. The --rnr_thr option sets the threshold value for valid cell coverage. However, the implementation considered only the fraction of each gridcell left unmapped due to explicit missing values (i.e., _FillValue). Now the implementation can also mask by the value of a specified sub-gridscale (SGS) variable, e.g., landfrac. The --add_fll switch now sets to _FillValue any gridcell whose sgs_frc < rnr_thr. The --add_fll switch is currently opt-in, except for datasets produced by MPAS and identified as such by the -P option. The new --no_add_fll overrides and turns off any automatic --add_fll behavior:
    ncremap ...           # No renormalization/masking
    ncremap --rnr=0.1 ... # Mask cells missing > 10%
    ncremap --rnr=0.1 --sgs_frc=sgs ... # Mask missing > 10%
    ncremap --rnr=0.1 --sgs_frc=sgs --add_fll ... # Mask missing > 90% or sgs < 10%
    ncremap -P mpas... # --add_fll implicit, mask where sgs=0.0
    ncremap -P mpas... --no_add_fll # --add_fll explicitly turned-off, no masking
    ncremap -P mpas... --rnr=0.1 # Mask missing > 90% or sgs < 10%
    ncremap -P elm...  # --add_fll not implicit, no masking
    Thanks to Jill Zhang of LLNL for suggesting this capability.
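    The masking rule stated above (set a gridcell to _FillValue when sgs_frc < rnr_thr) can be illustrated on plain text data. This sketch only demonstrates the comparison and is not how ncremap is implemented. The function name mask_by_sgs, the two-column "value sgs_frc" input layout, and the "_" placeholder for the fill value are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Illustration only (not ncremap's implementation).
# Usage: mask_by_sgs RNR_THR [FILL]
# Reads rows of "value sgs_frc" on stdin and prints FILL for any row whose
# sgs_frc is below RNR_THR; otherwise prints the value unchanged.
mask_by_sgs() {
    local thr="$1" fill="${2:-_}"
    awk -v thr="$thr" -v fill="$fill" '
        {
            # Force numeric comparison of the SGS fraction and the threshold
            if ($2 + 0 < thr + 0) print fill
            else print $1
        }'
}
```

    With a threshold of 0.1, a cell whose SGS fraction is 0.05 is replaced by the fill placeholder, while cells at 0.10 or 0.75 keep their values because the comparison is strictly less-than.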
  5. The map checker diagnoses from the global attributes map_method, no_conserve, or noconserve (if present) whether the mapping weights are intended to be conservative (as opposed to, e.g., bilinear). Weights deemed non-conservative by design are no longer flagged with dire WARNING messages. Thanks to Mark Taylor of SNL for this suggestion.
    ncks --chk_map
  6. ncremap vertical interpolation supports two new extrapolation methods: linear and zero. Linear extrapolation does exactly what you think: Values outside the input domain are linearly extrapolated from the nearest two values inside the input domain. Invoke this with --vrt_xtr=lnr or --vrt_xtr=linear. Zero extrapolation sets values outside the extrapolation domain to 0.0. Invoke this with --vrt_xtr=zero.
    ncremap --vrt_xtr=zero
    ncremap --vrt_xtr=linear
    ncks --rgr xtr_mth=linear
    ncks --rgr xtr_mth=zero
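    The arithmetic behind the two methods can be sketched in a few lines of awk. This is only an illustration of the definitions given above, not ncremap's code; in particular, the vertical coordinate ncremap actually interpolates in is not specified here. The function name extrapolate and its argument order are hypothetical.

```shell
#!/usr/bin/env bash
# Illustration only (not ncremap's implementation).
# Usage: extrapolate METHOD X1 Y1 X2 Y2 X
# (X1,Y1) and (X2,Y2) are the two nearest points inside the input domain;
# X is a coordinate outside that domain. METHOD is linear, lnr, or zero.
extrapolate() {
    awk -v m="$1" -v x1="$2" -v y1="$3" -v x2="$4" -v y2="$5" -v x="$6" 'BEGIN {
        if (m == "zero") {
            # Zero extrapolation: every value outside the domain is 0
            print 0
            exit 0
        }
        if (m != "linear" && m != "lnr") {
            # Unknown method name
            exit 2
        }
        # Linear extrapolation: extend the line through the two inner points
        slope = (y2 - y1) / (x2 - x1)
        print y1 + slope * (x - x1)
    }'
}
```

    For instance, with inner points (800, 270) and (900, 280), extrapolating to 1000 continues the slope of 0.1 per unit and gives 290, whereas the zero method returns 0 regardless of the inner values.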
  7. All numerical operators offer robust support for Blosc codecs when linked to netCDF 4.9.1+. This includes Blosc Zstandard, LZ, LZ4, and Zlib. Thanks to Dennis Heimbigner of Unidata for upstream fixes.

Additional details are available in the ChangeLog.


News and information from the Unidata Program Center