News

Latest news from NSF Unidata and the community

NCZarr Support for Zarr Filters

May 23, 2021

Note: See github issue 2006 for additional comments.

To date, filters in the netcdf-c library referred to HDF5 style filters. The inclusion of Zarr support in the netcdf-c library (called NCZarr) creates the need to provide a new representation consistent with the way that Zarr files store filter information. For Zarr, filters are represented using the JSON notation. Each filter is defined by a JSON dictionary, and each such filter dictionary is guaranteed to have a key named "id" whose value is a unique string defining the filter algorithm: "lz4" or "bzip2", for example.

This document outlines the proposed process by which NCZarr will be able to utilize existing HDF5 filters. At the same time, it provides mechanisms to support storing filter metadata in the NCZarr container using the Zarr compliant Codec style representation of filters and their parameters.

NetCDF ZARR Data Model Specification

Jul 2, 2019

This document defines the initial netcdf Zarr (NCZarr) data model to be implemented. As the Zarr version 3 specification progresses, this model will be extended to include new data types.

NCZarr Overview

Jul 2, 2019

The Unidata NetCDF group is proposing to provide access to cloud storage (e.g. Amazon S3) by providing a mapping from a subset of the full netCDF Enhanced (aka netCDF-4) data model to one or more existing data models that already have mappings to key-value pair cloud storage systems.

The initial target is to map that subset of netCDF-4 to the Zarr data model [1]. As part of that effort, we intend to produce a set of related documents that provide a semi-formal definition of the following.

Part 5: Converting GRIB to NetCDF-4

Sep 13, 2014

Trying compression algorithms other than deflate on GRIB data.

Part 4: Converting GRIB to NetCDF-4

Aug 25, 2014

Some details on using deflate for compressing GRIB-2 data, with a comparision to JPEG-2000 wavelet compression.

Part 3: Converting GRIB to NetCDF-4

Aug 13, 2014

Trying out deflate compression on GRIB-1 data.

Part 2: Converting GRIB to NetCDF-4

Aug 12, 2014

May I have the envelope, please?

Converting GRIB to NetCDF-4

Aug 7, 2014

Rewriting GRIB files into NetCDF/CF, part 1.

Chunking Data: Choosing Shapes

Mar 28, 2013

In part 1, we explained what data chunking is about in the context of scientific data access libraries such as netCDF-4 and HDF5, presented a 38 GB 3-dimensional dataset as a motivating example, discussed benefits of chunking, and showed with some benchmarks what a huge difference chunk shapes can make in balancing read times for data that will be accessed in multiple ways.

In this post, I'll continue looking at that example dataset to see how we can derive good chunk shapes, generalize to other datasets, look at how long it can take to rechunk a multidimensional dataset, and look at the use of Solid State Disk (SSD) for both accessing and rechunking data.

Chunking Data: Why it Matters

Jan 29, 2013

What is data chunking? How can chunking help to organize large multidimensional datasets for both fast and flexible data access? How should chunk shapes and sizes be chosen? Can software such as netCDF-4 or HDF5 provide better defaults for chunking? If you're interested in those questions and some of the issues they raise, read on ...