Re: 20010808: netcdf 3.4 help

To: address@hidden
From: "Alan S. Dawes" <address@hidden>
Subject: netcdf 3.4 help
>Organization: UCAR/Unidata
Keywords: 200108082235.f78MZM110902, huge files, large file support, record

Hi Alan,

Here's the CDL for a small file that has two "record variables", x and
y, of different shapes:

 netcdf big2 {
 dimensions:
         m = 4 ;
         n = 5 ;
         r = UNLIMITED ; // (3 currently)
 variables:
         float x(r, m) ;
         float y(r, n) ;
 data:

  x =
   2, 3, 4, 5,
   3, 4, 5, 6,
   4, 5, 6, 7 ;

  y =
   1, 2, 3, 4, 5,
   2, 4, 6, 8, 10,
   3, 6, 9, 12, 15 ;
 }

A small Fortran program that will write the netCDF file corresponding
to the above CDL is appended.  This program was mostly generated by
the "ncgen -f" utility, except I edited the output from that utility a
bit for this example.

In this example, if r were an ordinary dimension declared to be of
length 3, then all the values for x would be stored in the file
followed by all the values of y.  However, since r is declared to be
the unlimited dimension, the first slice of x (corresponding to r=1)
is followed by the first slice of y, then the second slices of x and
y, and so on.  But all your netCDF data access calls for reading and
writing the data are the same as if r were a fixed-size dimension.
It's just that with r an UNLIMITED dimension, the data is organized
differently in the file and it's possible to append more data in the r
direction efficiently.
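
As a rough illustration of that interleaving, here is a small sketch
(separate from the appended program) that prints where each slice of x
and y would start for the small example.  It assumes 4-byte floats and
measures offsets from the start of the record data, ignoring the file
header; the names reclay, fsize, xsz, ysz, and recsz are made up for
this sketch and are not part of the netCDF interface.

```fortran
      program reclay
* Sketch: byte offsets of record slices, relative to the start of
* the record data (header ignored).  All names are illustrative.
      integer m_len, n_len, numrecs, fsize
      parameter (m_len = 4, n_len = 5, numrecs = 3, fsize = 4)
      integer irec, xsz, ysz, recsz
* bytes in one slice of x, one slice of y, and one whole record
      xsz = m_len * fsize
      ysz = n_len * fsize
      recsz = xsz + ysz
      do irec = 1, numrecs
* within each record the x slice comes first, then the y slice
         print *, 'rec', irec, ' x at', (irec-1)*recsz,
     +            ' y at', (irec-1)*recsz + xsz
      enddo
      end
```

With these numbers each record is 36 bytes, so the x slices start at
offsets 0, 36, and 72 and the y slices at 16, 52, and 88.  If r were a
fixed dimension of length 3 instead, all 48 bytes of x would come
first and y would start at offset 48.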

To have this program generate a 6 Gbyte file, corresponding to the
similar CDL:

 netcdf big2 {
 dimensions:
         m = 400000 ;
         n = 600000 ;
         r = UNLIMITED ; // (1500 currently)
 variables:
         float x(r, m) ;
         float y(r, n) ;
 data:

  x =
   ...  // long list of values

  y =
   ...  // long list of values
 }

it's only necessary to change the three parameters in the Fortran
program (MFIXED, NFIXED, and NUMRECS) to 400000, 600000, and 1500,
respectively,
and link the resulting program against the netCDF library compiled
with large file support.  So even though x is a 1500 x 400000 array of
600,000,000 floats (requiring 2.4 Gbytes to store) and y is a 1500 x
600000 array of 900,000,000 floats (requiring 3.6 Gbytes to store),
both variables can be written into and read from the netCDF file,
because they are record variables, stored only a slice at a time, with
the x slice for r=1 followed by the y slice for r=1, and so on.  To
simplify this example, I haven't included any fixed-size variables,
but they don't really change anything as long as the total size of all
fixed-size variables is less than 2 Gbytes.
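
To double-check the arithmetic above, here is a short sketch that
computes the sizes with an integer kind wide enough to hold byte
counts beyond 2**31 - 1.  It again assumes 4-byte floats and counts
only the record data, not the header; the program name bigsiz and the
variable names are invented for this sketch.

```fortran
      program bigsiz
* Sketch: sizes for the large example, using an integer kind wide
* enough for byte counts above 2**31 - 1.  Names are illustrative.
      integer i8
      parameter (i8 = selected_int_kind(18))
      integer(kind=i8) m_len, n_len, numrecs, fsize
      integer(kind=i8) xsz, ysz, recsz, total, lim32
      parameter (m_len = 400000, n_len = 600000)
      parameter (numrecs = 1500, fsize = 4)
* bytes in one record slice of each variable, and in one record
      xsz = m_len * fsize
      ysz = n_len * fsize
      recsz = xsz + ysz
      total = numrecs * recsz
* largest value a signed 32-bit integer can hold
      lim32 = 2147483647
      print *, 'bytes per record:   ', recsz
      print *, 'bytes for all of x: ', numrecs * xsz
      print *, 'bytes for all of y: ', numrecs * ysz
      print *, 'total record bytes: ', total
      if (total .gt. lim32) then
         print *, 'needs large file support'
      endif
      end
```

Each record comes to 4,000,000 bytes, and 1500 of them total
6,000,000,000 bytes, which is well past what a 32-bit file offset can
address.  That is why the library needs large file support, even
though no single write in the program handles more than one slice.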

I hope this clarifies one way to write very large netCDF files on a
32-bit platform with large file support.  I don't think any special
Fortran flags are required for this, but the C library had to be built



      program fgennc
      include 'netcdf.inc'
* error status return
      integer  iret
* netCDF id
      integer  ncid
* dimension ids
      integer  m_dim
      integer  n_dim
      integer  r_dim
* dimension lengths
* MFIXED, NFIXED, and NUMRECS must be supplied at build time (for
* example as preprocessor definitions); the small CDL example
* corresponds to 4, 5, and 3.
      integer  m_len
      integer  n_len
      integer  r_len
      parameter (m_len = MFIXED)
      parameter (n_len = NFIXED)
      parameter (r_len = NF_UNLIMITED)
* variable ids
      integer  x_id
      integer  y_id
* rank (number of dimensions) for each variable
      integer  x_rank
      integer  y_rank
      parameter (x_rank = 2)
      parameter (y_rank = 2)
* variable shapes
      integer  x_dims(x_rank)
      integer  y_dims(y_rank)
* data variables
      real  x(m_len)
      real  y(n_len)
* starts and counts for array sections of record variables
      integer  x_start(x_rank), x_count(x_rank)
      integer  y_start(y_rank), y_count(y_rank)
* loop indices
      integer  irec, ix, iy

* enter define mode
      iret = nf_create('big2.nc', NF_CLOBBER, ncid)
      call check_err(iret)
* define dimensions
      iret = nf_def_dim(ncid, 'm', MFIXED, m_dim)
      call check_err(iret)
      iret = nf_def_dim(ncid, 'n', NFIXED, n_dim)
      call check_err(iret)
      iret = nf_def_dim(ncid, 'r', NF_UNLIMITED, r_dim)
      call check_err(iret)
* define variables
      x_dims(2) = r_dim
      x_dims(1) = m_dim
      iret = nf_def_var(ncid, 'x', NF_REAL, x_rank, x_dims, x_id)
      call check_err(iret)
      y_dims(2) = r_dim
      y_dims(1) = n_dim
      iret = nf_def_var(ncid, 'y', NF_REAL, y_rank, y_dims, y_id)
      call check_err(iret)
* leave define mode
      iret = nf_enddef(ncid)
      call check_err(iret)
* Write record variables one record at a time
      do irec=1, NUMRECS
*     store some arbitrary values in data variable slices
         do ix = 1, m_len
            x(ix) = ix + irec
         enddo
         do iy = 1, n_len
            y(iy) = iy * irec
         enddo

*     store x slice
         x_start(1) = 1
         x_start(2) = irec
         x_count(1) = m_len
         x_count(2) = 1
         iret = nf_put_vara_real(ncid, x_id, x_start, x_count, x)
         call check_err(iret)
*     store y slice
         y_start(1) = 1
         y_start(2) = irec
         y_count(1) = n_len
         y_count(2) = 1
         iret = nf_put_vara_real(ncid, y_id, y_start, y_count, y)
         call check_err(iret)
      enddo
* close the file
      iret = nf_close(ncid)
      call check_err(iret)
      end

      subroutine check_err(iret)
      integer iret
      include 'netcdf.inc'
      if (iret .ne. NF_NOERR) then
         print *, nf_strerror(iret)
         stop
      endif
      end