*To*: netcdfgroup@xxxxxxxxxxxxxxxx
*Subject*: Discussion of NetCDF limitations
*From*: PEPKE@xxxxxxxxxxxx
*Date*: Mon, 23 Aug 1993 16:40:31 -0400 (EDT)

We have gotten NetCDF and the 2.3 document and have implemented a file reader for our visualization package (SciAn), based on my understanding of the documentation. I found that it could do a few things, but that it couldn't do a lot of things that I need from a self-descriptive scientific data format. Some of the things that it can't do could be fixed with ad-hoc attributes, but that approach is only useful if one has complete control over both the writer and the reader of the data.

So, I'd like to start a discussion, first of all to figure out if my interpretation is correct, and second to try to figure out ways of improving NetCDF to do more things of interest. I place a high value on not saying "let the user determine it at run-time," because the more things that can be self-descriptive, the better.

Here is my interpretation of what it does and doesn't do. The "does" list is not meant to be all-inclusive, just to include the things which I currently think are important, which list is guaranteed to change over time. :-)

NetCDF does:

A) Scalar fields with any number of fixed dimensions, anything that can be written var = f(i, j, k, ...). (NetCDF calls this multidimensional data, but the fact is that there are three uses of the term "dimension" that are important here. The first is topological or computational dimension, which is what "multidimensional" variables seem to do. The second is degrees of freedom in the data points, which is like "at this point i, j, k it's a scalar giving density or it's a 3-vector giving velocity.")
B) Rectilinear grids with separable axes, anything that can be written x = f(i), y = g(k), ... (The third meaning of "dimension" is spatial dimension. There is a mapping between topological or computational dimensions and spatial dimensions.)
C) Missing data
D) Axis names and units, also names and units of scalar fields
E) Range of scalar values
F) Flexible time-dependency using one unlimited dimension
G) A variety of data formats including byte

NetCDF doesn't do:

1) Vector fields
2) Tensor fields
3) Handedness of coordinate system (e.g. lat, long, elev is left-handed)
4) Choice of mapping of dimensions onto spatial dimensions
5) Spatially transformed coordinate systems (e.g. polar coordinates, crystal axis coordinates)
6) Curvilinear coordinates (i.e. x = f(i, j, k, ...), y = g(i, j, k, ...))
7) Unstructured grids (e.g. finite element)
8) Scatter data
9) Modulo data elements (e.g. 357, 358, 359, 0, 1, 2)
10) Arbitrary, automatic mapping of byte elements onto real numbers. This is useful because some data formats, such as NEXRAD, give you 256 real numbers and then a bunch of bytes. NetCDF does the bytes; what is needed is a way to do the mapping.

Finally, here are the ways that we are approaching these problems (or not, as the case may be). I hope this can spark some discussion. If you see (=HDF), that means we're using basically the same strategy as we do for similar limitations in HDF.

1) If the first or last non-unlimited dimension has size 2 or 3, and there is no scale associated with it, assume that the dimension chooses the vector component rather than an additional topological or computational dimension. Allow this to be defeated using a check box in the file reader control panel. (=HDF)
2) We don't do tensors yet, but if we did, it would be something like 1).
3) and 4) Use an external file to map particular dimension names onto spatial dimensions (e.g. latitude = y, longitude = x for Mercator projection). Handedness is implied by the axes chosen. (=HDF)
5) This could be done by defining new attributes. In HDF, it's done using the coordinate system for easy stuff, but there is no way to do skewed axes.
6) I'm not sure how to do this.
However, a curvilinear grid can be defined completely by a vector field defined over a set of topological or computational dimensions. Assuming a way to do this, all that is needed is a way to link a field with the vector field that defines the grid. This could be done with an attribute of the field giving the name of the grid field, which could be searched for in the file.

There's also the issue of mixed curvilinear coordinates. In meteorology we often get a case where x = f(i), y = g(j), and z = h(i, j, k). This is to do a grid that matches the terrain at the bottom and is a flat elevation above sea level at the top. This requires the flexibility of a curvilinear grid, but there are optimizations that make a search of the grid as fast as for a rectilinear grid. It would be nice to take advantage of this. It would, at least, require different variables for the three spatial dimensions and a way of linking the variables into one complete variable.

7) No idea. The positions of numbered grid points could be represented easily enough with a 1-D vector sequence, although this kind of variable definition might confuse the heuristic in 1) if there were, for example, a grid with 3 vertices. However, the problem of how to do the connectivity for edges, faces, cells, and hypercells is open.
8) Essentially the same problem as an unstructured grid with a topological dimension of 0, i.e., no connectivity.
9) In HDF, we do the wrapping based on our knowledge of the coordinate system and also the name of the units (degrees, radians, gradians). In NetCDF we could do the same on the unit names, but we wouldn't have the coordinate system information to help the heuristic.
10) One way to do this is to do the mapping as just another variable with dimension 256 and find some attribute way of attaching the two together. Another way would be to extend NetCDF so that *a variable could be used as a data type*.
For example, something like:

    dimensions:
        x = 10, y = 10, funcMap = 256;
    variables:
        float funcMap(funcMap);
        funcMap field(x, y);
    data:
        funcMap = 1.0, 2.0, 3.0, ...;
        field = 4, 6, 8, 1, 37, ...;

I have no idea if this is possible to represent internally or not, but it drops right out of the syntax of CDL. It is, however, evil.

Eric Pepke                                    INTERNET: pepke@xxxxxxxxxxxx
Supercomputer Computations Research Institute MFENET:   pepke@fsu
Florida State University                      SPAN:     scri::pepke
Tallahassee, FL 32306-4052                    BITNET:   pepke@fsu

Disclaimer: My employers seldom even LISTEN to my opinions.
Meta-disclaimer: Any society that needs disclaimers has too many lawyers.
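
The byte-to-real mapping described in item 10) of the message amounts to a table lookup on the reader side. The following is only a minimal sketch of that idea in plain Python, not anything NetCDF itself provides. The names `build_ramp_table` and `apply_lookup` are hypothetical, and the table contents beyond the message's first three entries (1.0, 2.0, 3.0) are an assumption: the sketch simply continues the ramp so that entry i holds i + 1.0.

```python
def build_ramp_table(size=256, start=1.0, step=1.0):
    """Hypothetical 256-entry table: entry i holds start + i * step.

    Assumption: the message only gives the first three entries
    (1.0, 2.0, 3.0); continuing the ramp is purely illustrative.
    """
    return [start + i * step for i in range(size)]


def apply_lookup(table, codes):
    """Replace each integer code with the real number stored at that index."""
    for code in codes:
        if not 0 <= code < len(table):
            raise ValueError(
                f"code {code} is outside the table (0..{len(table) - 1})"
            )
    return [table[code] for code in codes]


func_map = build_ramp_table()       # plays the role of the funcMap variable
field_codes = [4, 6, 8, 1, 37]      # first byte values of field in the message
field_values = apply_lookup(func_map, field_codes)
print(field_values)  # [5.0, 7.0, 9.0, 2.0, 38.0]
```

The open question from the message remains: the file still needs some attribute or type convention that tells a reader which table belongs to which byte variable.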

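The vector-component heuristic in item 1) of the message can be sketched roughly as follows. This is a guess at the logic from the description alone, not SciAn's actual code. The function name, the `(name, size)` representation of dimensions, the set of dimension names that have a scale, and the choice to test the first fixed dimension before the last are all assumptions made for illustration.

```python
def vector_component_axis(dims, scaled_dims, unlimited=None):
    """Return the index of the dimension guessed to select a vector component.

    dims        -- list of (name, size) pairs in declaration order
    scaled_dims -- set of dimension names that have an associated scale
    unlimited   -- name of the unlimited dimension, if any

    Returns None when neither the first nor the last fixed dimension
    has size 2 or 3 without a scale.
    """
    fixed = [(i, name, size)
             for i, (name, size) in enumerate(dims)
             if name != unlimited]
    if not fixed:
        return None
    # Assumption: check the first fixed dimension before the last one.
    for i, name, size in (fixed[0], fixed[-1]):
        if size in (2, 3) and name not in scaled_dims:
            return i
    return None


# A velocity-like variable: time-dependent 3-D grid plus a 3-wide axis
# that has no scale, so it is guessed to be the vector component.
velocity_dims = [("time", 0), ("z", 5), ("y", 10), ("x", 10), ("comp", 3)]
print(vector_component_axis(velocity_dims, {"time", "z", "y", "x"}, "time"))

# A grid with three levels that do have a scale is left alone.
layered_dims = [("lev", 3), ("y", 10), ("x", 10)]
print(vector_component_axis(layered_dims, {"lev", "y", "x"}))
```

As the message notes under item 7), a guess like this can misfire, for example on a 1-D sequence of three vertices, which is why the reader offers a check box to override it.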