
[netCDF #QUN-641037]: dimension ID ordering assumptions



Hi Charlie,

> Sorry to write all this on a Friday. Is the file
> 
> http://dust.ess.uci.edu/tmp/foo.nc
> 
> a valid netCDF file?

Good question.  It's an interesting example, and I'm curious how you
created it.  As you know, the C API indicates that the dimension IDs
of its 3 dimensions are 2, 6, and 20.

> I ask because NCO (ncks) cannot read it, yet ncdump can.

The nccopy utility can also read it and copy it.  Although the C API
sees the dimension IDs in foo.nc as 2, 6, and 20, the corresponding
dimension IDs in the copy are 0, 1, and 2.

> nc_inq() tells ncks that the file has three dimensions (which is true)
> and so ncks _assumes_ those dimensions have dimension IDs 0..2.
> Then the nc_inq_dimname() call for dimension ID = 0 fails with
> "-46 NetCDF: Invalid dimension ID or name".
> 
> Is the assumption that dimension IDs are numbered from 0..N-1 correct
> for netCDF4 files, at least for those containing only data in the
> default (top-level) group? If not, what is the new algorithm?

As you know, the values of dimension IDs depend on which API is in
use.  For classic format data, C IDs start at 0 and Fortran IDs
start at 1.

The C Users Guide says (in section 4.1):

   In the C interface, dimension IDs are 0, 1, 2, ..., in the order in
   which the dimensions were defined.

which I think is how the library does it, even with groups.  So I
don't think you could create a file with only 3 dimension IDs of 2, 6,
and 20 using our reference implementation of the netCDF APIs.
However, if you had a file with multiple groups, you could create 21
dimensions in the two groups in an order such that the dimension IDs
in one of the groups were 2, 6, and 20.
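
To make that scenario concrete, here is a minimal sketch in plain C.  It
is a toy model, not netCDF library code: it only assumes, as described
above, that IDs are handed out file-wide in definition order.  The names
toy_file, toy_def_dim, toy_group_dimids, and toy_build_example are
hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Toy model (NOT the netCDF library): one file-wide counter hands out
 * dimension IDs in definition order, whichever group defines them. */
enum { TOY_MAX_DIMS = 64 };

typedef struct {
    int next_id;               /* next ID to hand out, file-wide */
    int owner[TOY_MAX_DIMS];   /* owner[id] = group that defined dimension id */
} toy_file;

/* Define one dimension in `group`; returns the new ID, or -1 if full. */
static int toy_def_dim(toy_file *f, int group)
{
    if (f->next_id >= TOY_MAX_DIMS)
        return -1;
    f->owner[f->next_id] = group;
    return f->next_id++;
}

/* Collect the IDs of dimensions defined directly in `group`.
 * Returns how many IDs were written to ids[] (at most cap). */
static size_t toy_group_dimids(const toy_file *f, int group,
                               int *ids, size_t cap)
{
    size_t n = 0;
    for (int id = 0; id < f->next_id; id++)
        if (f->owner[id] == group && n < cap)
            ids[n++] = id;
    return n;
}

/* Define 21 dimensions (IDs 0..20) across two groups.  The 3rd, 7th,
 * and 21st definitions go to group 1; all others go to group 0.
 * Returns the number of dimension IDs found in group 1. */
static size_t toy_build_example(toy_file *f, int *ids_in_group1, size_t cap)
{
    f->next_id = 0;
    for (int i = 0; i < 21; i++) {
        int group = (i == 2 || i == 6 || i == 20) ? 1 : 0;
        toy_def_dim(f, group);
    }
    return toy_group_dimids(f, 1, ids_in_group1, cap);
}
```

In this model, group 1 holds three dimensions whose IDs are 2, 6, and
20, while the file as a whole still has IDs 0 through 20.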

The statement in the C Users Guide in section 4.3:

   If ndims is the number of dimensions defined for a netCDF dataset,
   each dimension has an ID between 0 and ndims-1.

is also true, as far as I know, for any files created using our
reference implementation of the API, as long as it's understood that
the statement is for a whole netCDF file, and not just one group.

When I wrote nccopy, I accounted for the possibility within a group that
some of the dimensions might be inherited from an ancestor group, so the
dimension IDs might not be contiguous, with this code fragment:

    stat = nc_inq_ndims(igrp, &ndims);
    CHECK(stat, nc_inq_ndims);
    ...
    /* In netCDF-4 files, dimids may not be sequential because they
     * may be defined in various groups, and we are only looking at one
     * group at a time. */
    /* Find the dimension ids in this group, don't include parents. */
    dimids = (int *) emalloc((ndims + 1) * sizeof(int));
    stat = nc_inq_dimids(igrp, NULL, dimids, 0);
    CHECK(stat, nc_inq_dimids);
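
The difference between the two lookup strategies (looping over 0..N-1
versus asking for the ID list first) can be sketched without the
library.  The block below is a stand-alone toy, not netCDF code: the
toy_* functions are hypothetical stand-ins for inquiries such as
nc_inq_dimids() and nc_inq_dimname(), and the pairing of the IDs 2, 6,
and 20 with the names lat, lon, and time is an assumption made only for
illustration.

```c
#include <stddef.h>
#include <string.h>

#define TOY_NOERR   0
#define TOY_EBADDIM (-46)   /* same numeric code as in the ncks error log */

typedef struct { int id; const char *name; } toy_dim;

/* Hypothetical group contents: the IDs come from the thread, the
 * ID-to-name pairing is assumed. */
static const toy_dim toy_group[] = { {2, "lat"}, {6, "lon"}, {20, "time"} };
static const size_t toy_ndims = sizeof toy_group / sizeof toy_group[0];

/* Stand-in for a name inquiry: succeeds only for IDs present in the group. */
static int toy_inq_dimname(int dimid, char *name, size_t cap)
{
    for (size_t i = 0; i < toy_ndims; i++) {
        if (toy_group[i].id == dimid) {
            strncpy(name, toy_group[i].name, cap - 1);
            name[cap - 1] = '\0';
            return TOY_NOERR;
        }
    }
    return TOY_EBADDIM;
}

/* Stand-in for an ID-list inquiry: copies the group's IDs into ids[]
 * and returns how many were copied (at most cap). */
static size_t toy_inq_dimids(int *ids, size_t cap)
{
    size_t n = 0;
    for (; n < toy_ndims && n < cap; n++)
        ids[n] = toy_group[n].id;
    return n;
}

/* Strategy 1: assume the IDs are 0..ndims-1.
 * Returns the number of lookups that succeeded. */
static size_t toy_count_ok_assuming_contiguous(void)
{
    char name[32];
    size_t ok = 0;
    for (int id = 0; id < (int)toy_ndims; id++)
        if (toy_inq_dimname(id, name, sizeof name) == TOY_NOERR)
            ok++;
    return ok;
}

/* Strategy 2: fetch the ID list first, then look up each listed ID.
 * Returns the number of lookups that succeeded. */
static size_t toy_count_ok_using_id_list(void)
{
    int ids[8];
    char name[32];
    size_t n = toy_inq_dimids(ids, 8);
    size_t ok = 0;
    for (size_t i = 0; i < n; i++)
        if (toy_inq_dimname(ids[i], name, sizeof name) == TOY_NOERR)
            ok++;
    return ok;
}
```

With these toy contents, the first strategy fails on its very first
lookup (ID 0), as ncks did, while the second resolves all three
dimensions because it never guesses an ID.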

I propose that we clarify this and explicitly permit files such as your
foo.nc, with non-contiguous dimension IDs, as valid netCDF-4 files.
However, netCDF-4 classic model files should obey the classic model
rule of contiguous IDs.  Developers of software that handles netCDF-4
files will have to deal with the possibility of non-contiguous
dimension IDs within a group anyway, so I don't think permitting
non-contiguous IDs in netCDF-4 files imposes any extra burden.

I'll have to see if Ed and Dennis agree.  And I'd also like to know if
you think it would be more difficult to handle files like foo.nc in
NCO, given that you need to handle non-contiguous dimension IDs in
groups?

--Russ

> zender@givre:~$ ncks foo.nc
> nco_err_exit(): ERROR Short NCO-generated message (usually name of
> function that triggered error): nco_inq_dimname()
> nco_err_exit(): ERROR Error code is -46. Translation into English with
> nc_strerror(-46) is "NetCDF: Invalid dimension ID or name"
> nco_err_exit(): ERROR NCO will now exit with system call abort()
> Abandon
> zender@givre:~$ ncdump foo.nc
> netcdf foo {
> dimensions:
> lat = 2 ;
> lon = 4 ;
> time = UNLIMITED ; // (2 currently)
> variables:
> float lat(lat) ;
> lat:long_name = "Latitude (typically midpoints)" ;
> lat:units = "degrees_north" ;
> float lon(lon) ;
> lon:long_name = "Longitude (typically midpoints)" ;
> lon:units = "degrees_east" ;
> float three_dmn_rec_var(time, lat, lon) ;
> three_dmn_rec_var:long_name = "three dimensional record
> variable" ;
> three_dmn_rec_var:units = "watt meter-2" ;
> double time(time) ;
> time:long_name = "time" ;
> time:units = "days since 1964-03-12 12:09:00 -9:00" ;
> time:calendar = "gregorian" ;
> 
> // global attributes:
> :Conventions = "CF-1.0" ;
> :history = "Fri Oct  8 13:08:15 2010: ncks -4 -O -F -D 9
> -v three_dmn_rec_var -d time,1,10,5
> http://motherlode.ucar.edu:8080/thredds/dodsC/testdods/in_4.nc
> /home/zender/foo.nc\nHistory global attribute.\n" ;
> :julian_day = 200000.04 ;
> :RCS_Header = "$Header$" ;
> :NCO = "20101008" ;
> data:
> 
> lat = -90, 90 ;
> 
> lon = 0, 90, 180, 270 ;
> 
> three_dmn_rec_var =
> 1, 2, 3, 4,
> 5, 6, 7, 8,
> 41, 42, 43, 44,
> 45, 46, 47, 48 ;
> 
> time = 1, 6 ;
> }
> 
> 

Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu



Ticket Details
===================
Ticket ID: QUN-641037
Department: Support netCDF
Priority: Normal
Status: Closed