NOTE: This article was published back in 2012 as an April Fool's Day joke. Six years later, folks searching for information about netCDF see the part about version 5 and assume that the article is real, without reading to the end of the post where it says "P.S. April Fools." So here's your warning — this is a joke!
Most people know that the netCDF-4 format uses HDF-5 as the underlying file format. With chunking and compression, large datasets may be 2-10 times smaller than the same data stored in the netCDF-3 format. However, we have not been able to reach the compression efficiency of GRIB-2, which uses dynamic scale/offsets to turn floating point numbers into integers, and JPEG2000 wavelet compression to store the integers very efficiently. Carefully tuned GRIB-2 may be 40 times smaller than netCDF-3.
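To make the scale/offset idea concrete, here is a minimal sketch in Python. It is not the GRIB-2 encoder itself: the function names `pack` and `unpack`, the 16-bit width, and the sample temperatures are assumptions chosen for illustration, and real GRIB-2 packing adds binary and decimal scale factors plus the JPEG2000 stage on top.

```python
def pack(values, nbits=16):
    """Map floats onto unsigned integers using one scale and offset per field.

    Hypothetical helper for illustration; not part of any netCDF or GRIB library.
    """
    lo, hi = min(values), max(values)
    max_int = (1 << nbits) - 1
    # A constant field has no range, so fall back to a scale of 1.0
    # to avoid dividing by zero.
    scale = (hi - lo) / max_int if hi > lo else 1.0
    packed = [round((v - lo) / scale) for v in values]
    return packed, scale, lo


def unpack(packed, scale, offset):
    """Recover approximate floats from the packed integers."""
    return [p * scale + offset for p in packed]


# Made-up temperatures in kelvin, purely as sample input.
values = [271.3, 273.15, 280.0, 295.7]
packed, scale, offset = pack(values)
restored = unpack(packed, scale, offset)
```

The packing is lossy: each restored value can differ from the original by up to half of `scale`, which is the price paid for storing small integers instead of full floating point numbers.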
After a careful study of the options for competing with GRIB-2 on storage size efficiency, working in collaboration with NCAR's Research Application Laboratory, we are glad to announce that the next version of netCDF, which we call netCDF-5, will be based on the GRIB-2 format.
GRIB-2 also has the tremendous advantage of not needing to store metadata directly in the files, instead storing just numeric references to controlled vocabulary in external tables. These external tables are controlled by appropriate governing authorities, so that uniform metadata and naming conventions are always assured. NetCDF-5 will also take this superior approach.
Finally, the GRIB-2 data model is an unordered collection of 2D data slices, instead of the much more complex multidimensional arrays from netCDF. One advantage of this is that data can be stored in any order, across different files. This allows users to store important information in the file name, so that applications know exactly what is in the file without having to open it. We expect netCDF-5 to follow this tried-and-true method, and we will be developing a set of translators to rewrite older netCDF formats into GRIB-2/netCDF-5, with the CF metadata stored directly in the filenames.
Stay tuned to this blog, where we will be releasing more details as we implement this important new advance in scientific data formats.
(P.S. April Fools . . . here's a discussion of the realities of GRIB: GRIB and BUFR as Archival Data Formats?)