
Re: 950926: netCDF: Writing to a tape?



>Organization: MIT Lincoln Laboratory
>Keywords: 199509261903.AA27182

Hi Jennifer,

> I've been searching the archives and haven't found a definitive answer
> to the following question.  My apologies if it's been asked before.
> 
> Is it possible to create and write a netCDF file directly to a mag tape?

In general, no.  netCDF is a direct-access I/O library, which means it must
be able to seek to fairly arbitrary locations in a file to write or read
data.  In particular, netCDF permits you to read an array in a different
order from the one in which it was written.  For example, you can write a
matrix by columns and read it back by rows.  The direct-access (sometimes
called "random-access") features of the interface permit you to efficiently
access small subsets of data from very large datasets, and in particular to
read the n-th record of a record variable without first reading through the
first n-1 records.
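
To make the direct-access point concrete, here is a minimal sketch in Python.
It does not use the netCDF library or the netCDF file format; the record
layout (four doubles per record) and the names write_records, read_record and
demo are assumptions made up for illustration.  It only shows how a seek lets
a reader jump to record n without reading the records in front of it.

```python
import os
import struct
import tempfile

# Hypothetical toy layout (NOT the netCDF format): each record is
# four little-endian doubles, stored back to back with no header.
RECORD_FMT = "<4d"
RECORD_SIZE = struct.calcsize(RECORD_FMT)  # 32 bytes


def write_records(path, records):
    """Write fixed-size records sequentially."""
    with open(path, "wb") as f:
        for rec in records:
            f.write(struct.pack(RECORD_FMT, *rec))


def read_record(path, n):
    """Read record n (0-based) by seeking straight to its byte offset."""
    with open(path, "rb") as f:
        f.seek(n * RECORD_SIZE)  # direct access: skip the first n records
        return struct.unpack(RECORD_FMT, f.read(RECORD_SIZE))


def demo():
    """Write 1000 records to a temporary file, then fetch record 750."""
    records = [(float(i), i + 0.5, i * 2.0, -float(i)) for i in range(1000)]
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_records(path, records)
        return read_record(path, 750)
    finally:
        os.remove(path)
```

On a disk the seek costs the same whichever record is requested.  On a
sequential device, reaching record 750 would mean passing over the 750
records before it.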

If you have a tape device that permits using the "lseek()" call to position
to an arbitrary byte boundary (subject to alignment constraints, which
really means seeks are restricted to 4-byte boundaries), then netCDF can be
implemented to read and write it directly.  Most tape drives cannot do
anything like this; they can only seek to some sort of tape record
boundary, and tape records are typically much larger than 4 bytes.
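
As a rough illustration, the Python sketch below probes whether a file
descriptor accepts lseek() at all, and rounds an offset up to a 4-byte
boundary.  The helper names can_seek, align4 and demo are hypothetical, and
a pipe stands in for a non-seekable device; it assumes a POSIX system, where
lseek() on a pipe fails with ESPIPE.  Passing the probe does not show that a
device can position to arbitrary byte offsets, only that the call itself is
accepted.

```python
import errno
import os
import tempfile


def can_seek(fd):
    """Return True if lseek() succeeds on fd, False if the OS reports ESPIPE."""
    try:
        os.lseek(fd, 0, os.SEEK_CUR)  # ask for the current offset; moves nothing
        return True
    except OSError as exc:
        if exc.errno == errno.ESPIPE:  # "Illegal seek": fd is not seekable
            return False
        raise


def align4(offset):
    """Round a byte offset up to the next 4-byte boundary."""
    return (offset + 3) & ~3


def demo():
    """Probe a regular temporary file and the read end of a pipe."""
    r, w = os.pipe()
    try:
        with tempfile.TemporaryFile() as f:
            return can_seek(f.fileno()), can_seek(r)
    finally:
        os.close(r)
        os.close(w)
```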

> If so, is it generally a bad idea?  The general consensus around here is
> that the header isn't necessarily written before the data begins, so it'd
> be difficult, if not impossible, but we'd like to hear the answer from
> the horse's mouth, as it were...

That's another consideration that makes it a bad idea.  NetCDF files are
appendable, which means you can open an existing file and add some more data
to variables that use the unlimited dimension without recopying the entire
file.  However, when you do this, the library not only lengthens the file but
also updates the current size of the unlimited dimension, which is located in the
header.  Since the header is at the front of the file, this would involve
rewinding the tape to update this number in the header after each write of a
record at the end, which would make for fairly inefficient tape access.
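
The access pattern described above can be sketched with a toy file layout in
Python.  This is not the netCDF format or API: the 4-byte record count at
the front of the file, the two-double records, and the names create,
append_record, record_count and demo are all assumptions chosen for
illustration.  The point is only that each append touches both ends of the
file.

```python
import os
import struct
import tempfile

# Hypothetical toy layout (NOT the netCDF format):
# a 4-byte big-endian record count, followed by fixed-size records.
HEADER_FMT = ">I"
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 4 bytes
RECORD_FMT = ">2d"
RECORD_SIZE = struct.calcsize(RECORD_FMT)  # 16 bytes


def create(path):
    """Create an empty file whose header says it holds zero records."""
    with open(path, "wb") as f:
        f.write(struct.pack(HEADER_FMT, 0))


def append_record(path, rec):
    """Append one record, then go back and update the count in the header."""
    with open(path, "r+b") as f:
        (count,) = struct.unpack(HEADER_FMT, f.read(HEADER_SIZE))
        f.seek(HEADER_SIZE + count * RECORD_SIZE)  # jump to the end of the data
        f.write(struct.pack(RECORD_FMT, *rec))
        f.seek(0)  # back to the front; on a tape this would be a rewind
        f.write(struct.pack(HEADER_FMT, count + 1))
        return count + 1


def record_count(path):
    """Read the record count stored in the header."""
    with open(path, "rb") as f:
        return struct.unpack(HEADER_FMT, f.read(HEADER_SIZE))[0]


def demo():
    """Append three records and report the header count and the file size."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        create(path)
        for i in range(3):
            append_record(path, (float(i), float(i) * 10))
        return record_count(path), os.path.getsize(path)
    finally:
        os.remove(path)
```

Every call to append_record seeks to the end of the data and then back to
offset 0.  That is cheap on a disk, but on a tape it would mean a rewind for
each record written.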

It's better to read and write netCDF files on a device capable of direct
access, like a disk, and then stage them from and to magnetic tape before or
after you access the data.

--Russ "The Horse's Mouth" Rew

______________________________________________________________________________

Russ Rew                                           UCAR Unidata Program
address@hidden                              http://www.unidata.ucar.edu