From our support questions, it appears that the major feature of netCDF-4 attracting users to upgrade their libraries from netCDF-3 is compression. The netCDF-4 libraries inherit the capability for data compression from the HDF5 storage layer underneath the netCDF-4 interface. Relinking a netCDF program against a netCDF-4 library lets it read compressed data without changing a single line of source code. Writing compressed netCDF data requires only a few extra statements. And the nccopy utility program supports converting classic netCDF format data to or from compressed data without any programming.
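To make "a few extra statements" concrete, here is a minimal C sketch of writing a compressed variable. The file, dimension, and variable names (and sizes) are invented for illustration, and error checking is omitted; the one call that actually turns on compression is nc_def_var_deflate.

```c
#include <netcdf.h>

/* Illustrative sketch only: names and sizes are made up, errors unchecked. */
int main(void) {
    int ncid, dimids[3], varid;

    /* Compression requires the netCDF-4 (HDF5-based) format. */
    nc_create("example.nc", NC_NETCDF4 | NC_CLOBBER, &ncid);
    nc_def_dim(ncid, "time", 1000, &dimids[0]);
    nc_def_dim(ncid, "lat",   277, &dimids[1]);
    nc_def_dim(ncid, "lon",   349, &dimids[2]);
    nc_def_var(ncid, "temp", NC_FLOAT, 3, dimids, &varid);

    /* The extra statement: enable shuffle and zlib deflation (level 1)
     * for this variable before leaving define mode. */
    nc_def_var_deflate(ncid, varid, 1 /* shuffle */, 1 /* deflate */, 1 /* level */);

    nc_enddef(ncid);
    /* ... write data with nc_put_var_float() or similar ... */
    nc_close(ncid);
    return 0;
}
```

The no-programming route looks like this (filenames are again just placeholders): `nccopy -d1 -s classic.nc compressed.nc`, where -d sets the deflation level and -s enables the shuffle filter.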
In part 1, we explained what data chunking is about in the context of scientific data access libraries such as netCDF-4 and HDF5, presented a 38 GB 3-dimensional dataset as a motivating example, discussed the benefits of chunking, and showed with some benchmarks what a huge difference chunk shapes can make in balancing read times for data that will be accessed in multiple ways.
In this post, I'll continue with that example dataset, looking at how to derive good chunk shapes, how the approach generalizes to other datasets, how long it can take to rechunk a multidimensional dataset, and how Solid State Disk (SSD) can help with both accessing and rechunking data.
The questions driving this series remain the same: What is data chunking? How can chunking help organize large multidimensional datasets for both fast and flexible data access? How should chunk shapes and sizes be chosen? Can software such as netCDF-4 or HDF5 provide better defaults for chunking? If you're interested in those questions and some of the issues they raise, read on ...