overcoming netcdf3 limits

Quincey Koziol koziol at hdfgroup.org
Wed Apr 25 20:00:58 MDT 2007


On Apr 24, 2007, at 10:25 PM, ncdigest wrote:

> Date: Tue, 24 Apr 2007 15:54:43 -0600
> From: "Greg Sjaardema" <gdsjaar at sandia.gov>
> Subject: Re: overcoming netcdf3 limits
>
> Ed Hartnett wrote:
>> robl at mcs.anl.gov (Robert Latham) writes:
>>
>>
>>> Hi
>>>
>>> Over in Parallel-NetCDF land we're running into users who find even
>>> the CDF-2 file format limitations, well, limiting.
>>>
>>> http://www.unidata.ucar.edu/software/netcdf/docs/netcdf/NetCDF-64-bit-Offset-Format-Limitations.html
>>>
>>> http://www.unidata.ucar.edu/software/netcdf/docs/faq.html#Large%20File%20Support10
>>>
>>> If we worked up a CDF-3 file format for parallel-netcdf (off the top
>>> of my head, maybe a 64 bit integer instead of an unsigned 32 bit
>>> integer could be used to describe variables), would the serial
>>> netcdf folks be interested, or are you looking to the new netcdf-4
>>> format to take care of these limits?
>>>
>>> Thanks
>>> ==rob
>>>
>>>
>>
>> Howdy Rob!
>>
>> Your email has generated a lot of discussion here, and we are
>> formulating our response.
>>
>> However, another question: have you considered using netCDF-4? It
>> does not have the limits of the 64-bit offset format, and it supports
>> parallel I/O, as well as a number of other features (groups, compound
>> data types) which might be helpful in organizing really large data
>> sets.
>>
>> Since it uses the netcdf-3 API (with some extensions) it should be
>> possible to easily convert code to use netCDF-4...
>>
>> Thanks,
>>
>> Ed
>>
> I have been following the netcdf-4 development very closely.  It has
> some good points, especially the elimination of the dataset limits.
> I've generated a 300-million element mesh with the latest release that
> wouldn't be possible with the netcdf-3 format.
>
> However, there is concern about the robustness of the underlying HDF5
> format.  It is possible to corrupt the entire file if there is a crash
> at the wrong time.  We cannot build our production system on a library
> that has this behavior.  Some of the systems we run on are not known
> for their stability and if a job that has been running for a few days
> crashes and loses all data, that is not acceptable.  With the netcdf-3
> library, we would lose all or a portion of the last "time dump"
> written, but not previous data that had been synced to disk.  I was
> also a little concerned with the long time that it took for
> hdf5-1.8.0 to make it to the beta phase...
>
> We are definitely looking at the netcdf-4 effort, but are also looking
> at other solutions...
> --Greg
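
The durability property described in the quoted message (data synced after each time dump survives a later crash, and only the trailing partial dump is lost) can be illustrated with a minimal sketch. This is plain Python writing a made-up fixed-size record file; it does not use the netCDF library or its file format, and `append_dump`, `read_complete_dumps`, `VALUES_PER_DUMP`, and the record layout are hypothetical names chosen only for illustration.

```python
import os
import struct
import tempfile

VALUES_PER_DUMP = 4  # hypothetical: number of doubles in one "time dump"
RECORD = struct.Struct(f"<{VALUES_PER_DUMP}d")  # fixed-size record layout


def append_dump(path, values):
    """Append one time dump and force it to disk before returning."""
    with open(path, "ab") as f:
        f.write(RECORD.pack(*values))
        f.flush()
        os.fsync(f.fileno())


def read_complete_dumps(path):
    """Return every fully written dump, ignoring a torn trailing record."""
    with open(path, "rb") as f:
        data = f.read()
    n_complete = len(data) // RECORD.size
    return [RECORD.unpack_from(data, i * RECORD.size) for i in range(n_complete)]


def demo():
    path = os.path.join(tempfile.mkdtemp(), "dumps.bin")
    append_dump(path, [1.0, 2.0, 3.0, 4.0])
    append_dump(path, [5.0, 6.0, 7.0, 8.0])
    # Simulate a crash midway through the third dump: only 10 bytes land.
    with open(path, "ab") as f:
        f.write(RECORD.pack(9.0, 10.0, 11.0, 12.0)[:10])
    return read_complete_dumps(path)
```

Because each record sits at a fixed offset and no shared index has to be updated, the reader can discard the incomplete tail and still trust everything written before it.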

Hi all,
	Speaking from the HDF5 side of things - we are working on
implementing a solution to the problem of HDF5 files getting
corrupted when a process crashes.  Essentially, we are adding
journaling to HDF5 metadata operations (similar to how file systems
operate).  This will allow a file that may have been corrupted by an
application crash to be recovered by "replaying" the journal, which
restores all the changes to the metadata in the file, up to the point
of the crash.  At the same time, we are also making metadata I/O
operations asynchronous (starting with serial I/O, then working on
parallel I/O), which should speed up I/O quite a bit.

	Quincey Koziol
	The HDF Group
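
For readers unfamiliar with journaling, the general idea behind "replaying" a journal can be sketched as follows. This is a generic write-ahead journal in plain Python, not the HDF5 design or API, which the message above does not describe in detail; `journal_append`, `replay`, and the JSON-lines journal format are assumptions made purely for illustration.

```python
import json
import os
import tempfile


def journal_append(journal_path, key, value):
    """Record an intended metadata change durably before applying it."""
    entry = json.dumps({"key": key, "value": value})
    with open(journal_path, "a", encoding="utf-8") as j:
        j.write(entry + "\n")
        j.flush()
        os.fsync(j.fileno())


def replay(journal_path, metadata):
    """Reapply every complete journal entry, in order, to a metadata dict."""
    if not os.path.exists(journal_path):
        return metadata
    with open(journal_path, "r", encoding="utf-8") as j:
        for line in j:
            if not line.endswith("\n"):
                break  # torn final entry left by a crash; ignore it
            entry = json.loads(line)
            metadata[entry["key"]] = entry["value"]
    return metadata


def demo():
    journal = os.path.join(tempfile.mkdtemp(), "meta.journal")
    journal_append(journal, "dims", {"time": 3})
    journal_append(journal, "units", "K")
    # Simulate a crash in the middle of writing a third entry.
    with open(journal, "a", encoding="utf-8") as j:
        j.write('{"key": "dims", "val')
    # The in-memory metadata is gone after the crash; rebuild it from the journal.
    return replay(journal, {})
```

In this sketch, every change that reached the journal completely is recovered, and the half-written entry is dropped, which mirrors the "up to the point of the crash" behaviour described above.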
