Hi Jessica,

> My question is: Has there been any type of study or testing which
> documents the conversion of array based data to NetCDF4 and what
> this conversion does to the file size?  For example, if I have a 40
> GB array of data and I want to put it into NetCDF4, how large would
> the NetCDF4 file be?

Yes, we've used the nccopy utility to convert netCDF classic format
files to netCDF-4 files and noticed that there is a relatively high
overhead for metadata (the file schema, names, and attribute values),
but for large files that are mostly data, the netCDF-4 files are very
close in size to the netCDF classic format files.

If you take advantage of the compression available in netCDF-4 files,
they can be significantly smaller, depending on the data.  Getting
optimum compression can be tricky, because it can be improved by
configuring "chunking" parameters in ways that exploit characteristics
of the data, but most data that isn't just random numbers can be
compressed.  Whether the time it takes to compress the data on writing
and uncompress it on reading is worth the storage savings depends on
how the data will be used.  If you know something about how the data
will be accessed (e.g., in horizontal slices of a 4D variable, or as
time series for a set of grid points), you can configure the chunking
parameters to minimize the number of times data is uncompressed and to
ensure that only the data actually accessed (or a little bit more) is
uncompressed when it is read.

I'm just now adding the ability for nccopy to write compressed copies,
so it will be easier to experiment with compression to determine
whether it's worth the trouble.  The new nccopy utility should be
available in the upcoming 4.1.2 release.
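To illustrate the point that most non-random data compresses well while
random numbers do not, here is a small Python sketch that mimics the
deflate (zlib) compression netCDF-4/HDF5 uses, applied to raw arrays of
doubles.  The data values and the compression level are illustrative
assumptions, not taken from any particular dataset:

```python
import random
import struct
import zlib

random.seed(42)
N = 100_000  # number of 8-byte doubles in each test array

# Array of random doubles: nearly incompressible.
random_bytes = struct.pack(f"{N}d", *(random.random() for _ in range(N)))

# Array with a repeating pattern, standing in for redundant
# geophysical data: highly compressible.
patterned_bytes = struct.pack(f"{N}d", *(float(i % 10) for i in range(N)))

for label, data in [("random", random_bytes), ("patterned", patterned_bytes)]:
    # Deflate level 1, the fastest setting; higher levels trade
    # CPU time for (usually modest) extra savings.
    compressed = zlib.compress(data, 1)
    print(f"{label}: {len(data)} -> {len(compressed)} bytes "
          f"({len(compressed) / len(data):.2%} of original)")
```

Running this shows the random array staying close to its original size
while the patterned array shrinks to a small fraction of it, which is
why whether compression pays off depends so strongly on the data.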
In the meantime, you can try this out yourself by using one of the
contributed utilities based on nccopy that were described in these two
user posts to the netcdfgroup mailing list:

  http://www.unidata.ucar.edu/mailing_lists/archives/netcdfgroup/2010/msg00270.html
  http://www.unidata.ucar.edu/mailing_lists/archives/netcdfgroup/2010/msg00271.html

or you could just try using the new library compression APIs documented
here:

  http://www.unidata.ucar.edu/netcdf/docs/netcdf-c.html#nc_005fdef_005fvar_005fdeflate

--Russ

Russ Rew                                     UCAR Unidata Program
address@hidden                               http://www.unidata.ucar.edu

Ticket Details
===================
Ticket ID: HPH-893418
Department: Support netCDF
Priority: Normal
Status: Closed
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata inquiry tracking system and then made publicly available through the web. If you do not want to have your interactions made available in this way, you must let us know in each email you send to us.