netCDF library

Bill Noon noon at snow.nrcc.cornell.edu
Tue Aug 1 17:46:57 MDT 2006


Nilesh -- As Dave Allured pointed out, if you want to use a standard  
netCDF format, your options are limited.

We faced a similar dilemma: we needed to store large amounts of climate
data efficiently, so we created a netCDF variant that keeps the data
compressed and uses an index to uncompress small blocks of the data on
request.  We have used it successfully for over a decade, and most of
the Regional Climate Centers' data is stored as 'compressed netCDF'.
We generally see a 90-97% reduction in file size, with benchmarked
access equal to or slightly faster than standard netCDF files
(especially over a network or from slower storage devices).
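
To make the idea concrete, here is a minimal sketch of block-wise
compression with an index.  It is not the code behind our library and
it does not touch netCDF at all; it only uses Python's standard zlib
and struct modules.  The names write_blocks, read_range and
BLOCK_VALUES are hypothetical, chosen for this illustration.

```python
import struct
import zlib

BLOCK_VALUES = 1024  # values per compressed block (arbitrary, hypothetical choice)


def write_blocks(values, block_values=BLOCK_VALUES):
    """Compress float32 values block by block.

    Returns (payload, index) where index[i] is the (offset, length)
    of compressed block i inside payload.
    """
    payload = bytearray()
    index = []
    for start in range(0, len(values), block_values):
        chunk = values[start:start + block_values]
        raw = struct.pack("<%df" % len(chunk), *chunk)
        packed = zlib.compress(raw)
        index.append((len(payload), len(packed)))
        payload += packed
    return bytes(payload), index


def read_range(payload, index, first, count, block_values=BLOCK_VALUES):
    """Return values[first:first + count], inflating only the blocks needed."""
    if count <= 0:
        return []
    first_block = first // block_values
    last_block = (first + count - 1) // block_values
    out = []
    for b in range(first_block, last_block + 1):
        offset, length = index[b]
        raw = zlib.decompress(payload[offset:offset + length])
        out.extend(struct.unpack("<%df" % (len(raw) // 4), raw))
    skip = first - first_block * block_values
    return out[skip:skip + count]


# A mostly-zero field, similar in spirit to sparse emissions data.
values = [0.0] * 100_000
values[5000] = 1.5
values[70_000] = 2.25

payload, index = write_blocks(values)
raw_size = len(values) * 4  # bytes if stored as uncompressed float32
print("raw bytes:", raw_size, "compressed bytes:", len(payload))
print(read_range(payload, index, 4998, 5))
```

The block size is the main trade-off in a scheme like this: larger
blocks usually compress better, but every small read has to inflate
more data.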

If you do go this route, you have to realize that you are on your own  
and you will have to uncompress any files you want to send to other  
researchers.

Given your particular situation, you may be more interested in looking  
at some other options:
	1. HDF5 is supposed to have a compressed format and an interface  
similar to netCDF.

	2. If you are using Linux, you may be able to use a compressed file
system in loopback mode to keep the netCDF files compressed but access
them using the standard netCDF libraries.  This is effectively what my
library modifications do, on a per-file rather than per-filesystem
basis.  This is probably most effective in a read-only situation.

Just some thoughts.

--Bill Noon
Northeast Regional Climate Center
Cornell University



On Aug 1, 2006, at 12:29 PM, Nilesh Lahoti wrote:

> Dear Sir,
>
> We are an air quality modeling group at Rutgers University, New
> Jersey. We process emissions and run simulation models to study the
> long-range transport of ozone and particulate matter, both for our
> research and for regulatory work.
>
> The netCDF library works great for us. However, I came across one
> particular issue with netCDF and would like to ask whether there is a
> solution to it or something we can do to improve performance. When we
> process emissions for our three-dimensional grid of size (172 x 172 x
> 22) over a 24-hour period with hourly data, the file size is around
> 1 gigabyte (GB). Several cells have zero values, so the floating-point
> values for pollutants in the netCDF file are zero there. When we use
> the gzip utility on Unix to compress these files, the file size drops
> to about 10 MB, which saves us 99% of disk space. So the question is:
> if netCDF is the most compressed scientific format, is it possible to
> suppress these zero values of the floating-point variables, or is
> there any switch that can be used to handle zero values and reduce
> the file size?
>
> Looking forward to hearing from you.
>
> from,
>
> Nilesh Lahoti
> Research Specialist
> CCL, EOHSI,
> Rutgers University
> Email: nilesh at fidelio.rutgers.edu
> Phone: 732-445-1416
>
> ==============================================================================
> To unsubscribe netcdfgroup, visit:
> http://www.unidata.ucar.edu/mailing-list-delete-form.html
> ==============================================================================
>
