This is Info file conventions.info, produced by Makeinfo-1.43 from the
input file conventions.texinfo.

Unidata NetCDF Conventions -- Draft (9 August 1993)

Introduction
============

This is a preliminary, incomplete, and evolving draft of Unidata
conventions for netCDF files.  It describes conventions for netCDF
names, types, and values for variables, dimensions, and attributes.

In general, programs that read and write netCDF files should be made as
generic as practical, without hard-coding specific names of variables,
dimensions, or attributes.  One way to accomplish this is through the
use of "resources" (as the concept is used in the MIT X Window System).
Resources may be used for specifying persistent defaults that are
suitable for multiple invocations of an application, for defaults that
are used by multiple applications, or for customizations that are too
minor or too infrequently used to justify a command-line parameter.

For example, the names of the netCDF dimensions that will represent
latitude and longitude for earth-referenced data are candidates for
resources, since they will be shared by multiple applications and
rarely changed, but they need to be customizable for international use.
Of course, for applications to work together well, the names of
resources for common defaults and application parameters will have to
be agreed upon.

What is the advantage of not hard-coding the name of the global
`history' attribute in a netCDF program if the name to use for this
attribute must be retrieved through a hard-coded resource name
`netcdf.histname'?  One advantage is that resource values can be
defined in one place that is an editable text file, so that changing
resource values requires only a text editor or resource editor rather
than recompiling or relinking application programs.
It is also possible to use a customizable hierarchy of shared resource
files, so that site-specific resource specifications override
application-specific defaults, and are in turn overridden by
user-specific resource specifications or command-line parameters.

Unidata has developed and documented an application programming
interface, `udres', and a library implementing the `udres' interface,
to support the use of resources in combination with mechanisms for
handling command-line arguments to applications.  This library is used
to handle command-line parsing and getting defaults from resource files
in the netCDF operators, currently under development.

Names
=====

The netCDF library places no restrictions on the length or the
characters used in netCDF names, but netCDF utilities and operators
assume that names begin with an alphabetic character, are no longer
than `MAX_NC_NAME' characters (currently defined in `netcdf.h' as 128),
and use only the alphanumeric characters and the underscore `_' and
hyphen `-' characters.  Names that begin with an underscore are
reserved for special interpretation by netCDF utilities and operators.

Case of alphabetic characters is significant in all dimension,
attribute, and variable names.  Names that differ only in case can lead
to confusion, however, so they should be avoided unless there is a good
reason for them.  There is a weak convention to use all caps for names
that represent observed or measured parameters and lower case for
derived parameters or modifiers of observed parameters.  For example,
`reftime_PRECIP' might be a name for the reference time for a
precipitation measurement.

It is good practice not to encode information in variable names that is
better represented as attributes.  For example, `TEMPC' is not a good
name for temperature in Celsius, since a netCDF operator cannot deal
with units unless they are specified in the units attribute.
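The naming assumptions described above can be checked mechanically.  The
following is a minimal sketch in Python, not part of any netCDF utility:
the function names `check_name' and `case_collisions' are hypothetical,
"alphabetic" is assumed to mean ASCII letters, and the limit of 128 is
copied from the value quoted above for `MAX_NC_NAME'.

```python
import re

# Assumption: mirrors the MAX_NC_NAME value quoted above from netcdf.h.
MAX_NC_NAME = 128

# A leading ASCII letter, then letters, digits, underscores, or hyphens.
_CONVENTIONAL = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def check_name(name):
    """Return a list of convention problems for a netCDF name (empty if none)."""
    problems = []
    if len(name) > MAX_NC_NAME:
        problems.append("longer than MAX_NC_NAME")
    if name.startswith("_"):
        problems.append("leading underscore is reserved")
    elif not _CONVENTIONAL.fullmatch(name):
        problems.append("must start with a letter and use only "
                        "letters, digits, '_' and '-'")
    return problems


def case_collisions(names):
    """Return groups of distinct names that differ only in case."""
    groups = {}
    for n in names:
        groups.setdefault(n.lower(), []).append(n)
    return [g for g in groups.values() if len(set(g)) > 1]
```

A tool might run `check_name' on every dimension, variable, and
attribute name before writing a file, and report `case_collisions' as
warnings rather than errors, since the text above only discourages such
names.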
No strict guidelines are possible for variable names, but the use of
conventional names will make some operations simpler, e.g. the
comparison or merging of variables in different data sets.
Discipline-specific tables of synonyms might also facilitate such
operations.

The following list contains some suggested types, names, and brief
descriptions of variables for atmospheric and oceanographic parameters.
These names are currently in use in Unidata netCDF files.  This table
does not include dimensions for variables, since these vary depending
on use.  In particular, variables whose type is given as `char' are
actually arrays of characters, i.e. strings.

Dimension names follow the same guidelines as variable names.  An
additional convention is to use the suffix `_len' for dimension names
that are really used as maximum character-string lengths for character
variables, for example `id_len'.

     long  sday                actual image start date (GOES)
     long  stime               actual image start time (GOES)
     long  sscan               actual starting scan (GOES)
     float ALTIM               altimeter setting
     float height              altitude associated with an echo-object (MDR)
     float z_sigw              altitude - Significant Levels wrt Winds (UA)
     long  iseang              angle between VISSR & sun sensor (GOES)
     long  perigee             argument of perigee (GOES)
     float ave_power_low       average transmitted power, low mode (profiler)
     float ave_power_high      average transmitted power, high mode (profiler)
     float azim_n              azimuth of north beam (profiler)
     float azim_e              azimuth of east beam (profiler)
     float azim_v              azimuth of vertical beam (profiler)
     float azimuth             bearing associated with an echo-object (MDR)
     char  ctype               calibration type: "BRIT", "RAW " ... (GOES)
     int   grib_center         center ID from Stackpole's Table 1
     char  cloudtype           cloud type
     float ZCL                 cloudbase
     byte  CC                  cloudcover
     int   config              configuration pattern of the echo system (MDR)
     float cosmic_noise        cosmic background noise temperature (profiler)
     char  country             country
     int   day                 day of month, 1-31
     long  declin              declination of satellite axis
     float reftime_delP        delP reference period
     float snow                depth of snow cover
     float TD                  dew point
     float td_man              dew Point - Mandatory Levels
     float td_sigt             dew Point - Significant Levels wrt T or RH
     float u                   eastward wind component
     byte  object              echo-object (MDR)
     float elev                elevation of station
     float elev_n              elevation of north beam (profiler)
     float elev_e              elevation of east beam (profiler)
     float elev_v              elevation of vertical beam (profiler)
     long  etimy               epoch date
     long  etimh               epoch time
     float equiv_noise         equivalent noise temperature (profiler)
     char  id                  five-character NWS name
     int   frtime              forecast time
     long  pitch               forward-leaning
     float frequency           frequency of radar (profiler)
     int   year                full year, e.g. 1991
     float gain                gain, decibels relative to isotropic (profiler)
     float Z                   geopotential height
     float z_man               geopotential - Mandatory Levels
     float alt                 height above mean sea level
     float low_level           height above station for low mode (profiler)
     float high_level          height above station, high mode (profiler)
     float high_vgw            high mode vertical gate width (profiler)
     int   hour                hour of day, 0-23
     long  yres                image line resolution
     long  xres                image element resolution
     int   trend               intensity trend of the precipitation (MDR)
     byte  intensity           intensity of the precipitation, echo intensity (MDR)
     float lat                 latitude (north of equator)
     float lat                 latitude
     float LI                  lifted index
     float lon                 longitude (east of Greenwich)
     float lon                 longitude
     float low_vgw             low mode vertical gate width (profiler)
     byte  mdr                 manually digitized radar summary (MDR)
     float Tmax                maximum temperature
     long  meana               mean anomaly
     float Tmin                minimum temperature
     int   minute              minute of hour, 0-59
     int   grib_model          model ID from Stackpole's Table 2
     int   month               month of year, 1-12
     float speed               movement speed associated with an echo-object (MDR)
     float from                movement direction associated with an echo-object (MDR)
     char  navtype             navigation type: "GOES"
     byte  nazran              no. of "azimuth/range" slots used by each echo-object (MDR)
     byte  nheight             no. of "height" slots used by each echo-object (MDR)
     byte  nmove               no. of "movement" slots used by each echo-object (MDR)
     byte  nwidth              no. of "width" slots used by each echo-object (MDR)
     long  navtime             nominal start of image
     long  ndate               nominal year and day-of-year of area
     long  ntime               nominal time of area
     float v                   northward wind component
     long  lintot              number of scan lines (NN = number
     long  eletot              number of elements in a scan line
     long  ysize               number of lines in area image
     long  xsize               number of elements in area image
     long  nbyte               number of bytes per data element
     long  num_mant            number of Mandatory Levels
     long  num_sigt            number of Significant Levels wrt T
     long  num_sigw            number of Significant Levels wrt W
     int   opstat              operational status of the radar (MDR)
     long  orbtype             orbit type (always 1)
     long  eccen               orbital eccentricity
     long  orbinc              orbital inclination
     byte  W1,W2,...           past weather
     float reftime_Wn          past weather (Wn) reference time
     long  piclin              picture center line number
     byte  image               pixel image
     float reftime_PRECIP      precip reference period
     byte  precip              precipitation types in a coverage group (MDR)
     float PRECIP              precipitation amount
     float delP                pressure change
     byte  Ptend               pressure tendency
     float p_man               pressure - Mandatory Levels
     float p_sigt              pressure - Significant Levels wrt T or RH
     float wave_dir            primary wave direction
     float wave_per            primary wave period
     byte  coverage            proportion of precipitation type in configuration (MDR)
     byte  low_qual_indicator  quality indicator, low mode (profiler)
     byte  low_qual_summary    quality summary, low mode (profiler)
     byte  high_qual_indicator quality indicator, high mode (profiler)
     byte  high_qual_summary   quality summary, high mode (profiler)
     char  radar_type          radar type ("WSR_57", "WSR-74C", ..., "CPS-9") (MDR)
     float range               range associated with an echo-object (MDR)
     char  reftime             reference time of forecasts
     char  region              region ID
     char  region              region identifier
     float RH                  relative humidity
     char  remarks             remarks
     long  rpt_time            report time (MDR)
     char  type                report Origination
     long  asnode              right ascension of ascending node
     long  rascen              right ascension of satellite axis
     long  roll                rotation
     long  satid               satellite identification number.
     float SST                 sea surface temperature
     float PSL                 sealevel pressure
     long  semima              semimajor axis
     char  call_sign           ship call sign or alphabetic station identifier
     long  yaw                 sideways-leaning
     float wave_hgt            significant wave height
     char  stype               source type: "VISR", "VAS ", "AVHR", "ERBE" ...
     long  spinp               spin period: either the period of a
     float level               standard atmospheric level
     char  state               state or province
     float stn_lon             station longitude (MDR)
     float stn_lat             station latitude (MDR)
     float P                   station pressure
     char  id                  station id, a string containing call sign or WMO number
     char  id                  station ID
     float lat                 station latitude
     float lon                 station longitude
     float elev                station elevation above MSL
     long  mdr_time            summary time (MDR)
     float sfc_p               surface pressure (profiler)
     float sfc_t               surface temperature (profiler)
     float sfc_td              surface dew point (profiler)
     float sfc_spd             surface wind speed (profiler)
     float sfc_dir             surface wind direction (profiler)
     float sfc_rain            surface rain accumulation (profiler)
     float T                   temperature
     float t_man               temperature - Mandatory Levels
     float t_sigt              temperature - Significant Levels wrt T or RH
     float sunshine            time of equivalent solar radiation
     char  time                time (in text form, yyyy mm dd hh:mm zzz)
     long  deglin              total sweep angle, line direction
     long  degele              total sweep angle, element direction
     float total_noise         total noise temperature (profiler)
     char  rpt_type            type of report: "SPL", "COR" (MDR)
     float low_u_wind          u wind for low mode (profiler)
     float high_u_wind         u wind, high mode (profiler)
     long  ycoor               upper left hand line in satellite coords
     long  xcoor               upper left hand element in satellite coords
     long  zcoor               upper left hand z-coordinate
     float low_v_wind          v wind for low mode (profiler)
     float high_v_wind         v wind, high mode (profiler)
     float omega               vertical velocity
     float Tv                  virtual temperature
     float VIS                 visibility
     float low_w_wind          w wind for low mode (profiler)
     float high_w_wind         w wind, high mode (profiler)
     byte  WX                  weather
     float width               width associated with an echo-object (MDR)
     float GUST                wind gusts
     float DIR                 wind direction
     float SPD                 wind speed
     float dir_man             wind Direction - Mandatory Levels
     float spd_man             wind Speed - Mandatory Levels
     float dir_sigw            wind Direction - Significant Levels wrt Winds
     float spd_sigw            wind Speed - Significant Levels wrt Winds
     long  idn                 wmo Numeric Station ID
     long  zres                z resolution (number of bands/channels)
     float minus20_BW_low      -20 dB bandwidth, low mode (profiler)
     float minus20_BW_high     -20 dB bandwidth, high mode (profiler)
     float low_moment0_n       0th moment, low mode, N beam (profiler)
     float low_moment0_e       0th moment, low mode, E beam (profiler)
     float low_moment0_v       0th moment, low mode, vertical beam (profiler)
     float high_moment0_n      0th moment, N beam, high mode (profiler)
     float high_moment0_e      0th moment, E beam, high mode (profiler)
     float high_moment0_v      0th moment, vertical beam, high mode (profiler)
     float low_moment2_n       2nd moment, low mode, N beam (profiler)
     float low_moment2_e       2nd moment, low mode, E beam (profiler)
     float low_moment2_v       2nd moment, low mode, vertical beam (profiler)
     float high_moment2_n      2nd moment, N beam, high mode (profiler)
     float high_moment2_e      2nd moment, E beam, high mode (profiler)
     float high_moment2_v      2nd moment, vertical beam, high mode (profiler)
     char  wmo_no              WMO 5-digit station number

Data Types
==========

If there is a choice between floats and integers, it is generally
better to use floats, all other things being equal.  NetCDF operators
will convert integer data to floats for some operations (e.g. taking
the mean over a specified dimension).

Variable Attributes
===================

Generic applications that take netCDF files as input will, by
convention, expect certain variable and global attributes.  A few other
attributes are handled in special ways by the netCDF library (these
reserved attributes will have names that begin with a leading
underscore character `_').
If you want to be able to use generic applications with your files, you
should use the following conventional names for these commonly used
attributes:

`units'
     A character array that specifies the units used for the variable's
     data.  A standard for conventional ways to name units in each
     specific discipline should be used, if available.  Unidata has
     developed a freely-available library of routines to convert
     between character string and binary forms of unit specifications
     and to perform various useful operations on the binary forms.
     This library is used in some netCDF applications.  Using the
     recommended units syntax permits data represented in conformable
     units to be automatically converted to common units for algebraic
     operations.

`long_name'
     A long descriptive name.  This could be used for labelling plots,
     for example.  If a variable has no `long_name' attribute assigned,
     the variable name will be used as a default.

`valid_range'
     An array of two numbers specifying the minimum and maximum valid
     values for this variable.  The type of each `valid_range'
     attribute should match the type of its variable.

`valid_min'
`valid_max'
     One or both of these may be used instead of `valid_range'; this
     handles the case where it only makes sense to bound the data below
     or above.

`scale_factor'
     If present for a variable, the data are to be multiplied by this
     factor after the data are read by the application that accesses
     the data.

`add_offset'
     If present for a variable, this number is to be added to the data
     after it is read by the application that accesses the data.  If
     both `scale_factor' and `add_offset' attributes are present, the
     data are first scaled before the offset is added.  The attributes
     `scale_factor' and `add_offset' can be used together to provide
     simple data compression to store low-resolution floating-point
     data as small integers in a netCDF file.  When scaled data are
     written, the application should first subtract the offset and then
     divide by the scale factor.
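The order of operations for `scale_factor' and `add_offset' described
above can be sketched as follows.  This is an illustrative sketch only,
not netCDF library code: the function names `pack' and `unpack', the
example parameter values, and the choice to round to the nearest
integer when packing are all assumptions made for the example.

```python
def unpack(packed, scale_factor=1.0, add_offset=0.0):
    """Reader side: scale first, then add the offset."""
    return packed * scale_factor + add_offset


def pack(value, scale_factor=1.0, add_offset=0.0):
    """Writer side: subtract the offset, then divide by the scale factor.

    Rounding to the nearest integer is an assumption of this sketch.
    """
    return int(round((value - add_offset) / scale_factor))


# Hypothetical packing parameters: temperatures in kelvin stored as
# small integers with 0.5-degree resolution.
SCALE = 0.5
OFFSET = 273.0

stored = pack(288.5, SCALE, OFFSET)        # (288.5 - 273.0) / 0.5
restored = unpack(stored, SCALE, OFFSET)   # stored * 0.5 + 273.0
```

A value that is not a multiple of the scale factor comes back changed by
at most half of `scale_factor', which is the resolution given up in
exchange for the smaller storage type.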
     When `scale_factor' and `add_offset' are used for packing, the
     associated variable (containing the packed data) is typically of
     type byte or short, whereas the unpacked values are intended to be
     of type float or double.  The attributes `scale_factor' and
     `add_offset' should both be of the type intended for the unpacked
     data, e.g. float or double.

`_FillValue'
     If a scalar attribute with this name is defined for a variable and
     is of the same type as the variable, it will be subsequently used
     as the *fill value* for that variable.  The purpose of this
     attribute is to save the applications programmer the work of
     prefilling the data and also to eliminate the duplicate writes
     that result from netCDF filling in missing data with its default
     fill value, only to be immediately overwritten by the programmer's
     preferred value.  This value is considered to be a special value
     that indicates missing data, and is returned when reading values
     that were not written.  The missing value should be outside the
     range specified by `valid_range' for a variable.  It is not
     necessary to define your own `_FillValue' attribute for a variable
     if the default "fill value" for the type of the variable is
     adequate.  Note that if you change the value of this attribute,
     the changed value only applies to subsequent writes; previously
     written data are not changed.

`missing_value'
     `missing_value' is a conventional name for a missing value that
     will not be treated in any special way by the library, as the
     `_FillValue' attribute is.  It is also useful when it is necessary
     to distinguish between two kinds of missing values.  For example,
     `_FillValue' might be useful to indicate data that were expected
     but did not appear, whereas `missing_value' might be used to
     indicate grid regions that are not intended to contain data.

`signedness'
     Used to indicate a nondefault interpretation of the signedness of
     integer values.
     By default, applications that deal with values should treat netCDF
     byte data as unsigned and netCDF short and long integer data as
     signed.  If you declare a netCDF variable for storing bytes, and
     you intend that the values represent signed quantities, you should
     declare the variable attribute `signedness' with value `"signed"'.
     Similarly, if you define a variable for an array of short or long
     integers and you intend that the values be interpreted as
     unsigned, it would be appropriate to define the variable attribute
     `signedness = "unsigned"'.  This attribute is ignored by the
     netCDF library, but applications may use it.  Since there are no
     standard FORTRAN types corresponding to unsigned integers, FORTRAN
     programs that compute with or use the ordering of data values may
     need to handle this attribute.

`C_format'
     A character array for the format that should be used to print
     values for this variable by C applications.  For example, if you
     know a variable is only accurate to three significant digits, it
     would be appropriate to define the `C_format' attribute as
     `"%.3g"'.  The `ncdump' utility program uses this attribute for
     variables for which it is defined.

`FORTRAN_format'
     A character array for the format that should be used to print
     values for this variable by FORTRAN applications.

`title'
     A global attribute that is a character array providing a succinct
     description of what is in the data set.

`history'
     A global attribute that is a character array with a line for each
     invocation of a program and arguments that were used to derive the
     file.  Well-behaved generic netCDF filters (programs that take
     netCDF files as input and produce netCDF files as output) will
     automatically append their name and the parameters with which they
     were invoked to the global history attribute of an input netCDF
     file.

`Conventions'
     If present, `Conventions' is a global attribute that is a
     character array for the name of the conventions followed by the
     file, in the form of a string that is interpreted as a directory
     name relative to a directory that is a repository of documents
     describing sets of discipline-specific conventions.  This permits
     a hierarchical structure for conventions and provides a place
     where descriptions and examples of the conventions may be
     maintained by the defining institutions and groups.  The
     conventions path name is currently interpreted relative to the
     directory `pub/netcdf/Conventions/' on the host machine
     `ftp.unidata.ucar.edu'.

     For example, if a group named NUWG agrees upon a set of
     conventions for dimension names, variable names, required
     attributes, and netCDF representations for certain
     discipline-specific data structures, they may store a document
     describing the agreed-upon conventions in a file in the `NUWG/'
     subdirectory of the Conventions directory, and files that followed
     these conventions would contain a global `Conventions' attribute
     with value `"NUWG"'.

     Later, if the group agrees upon some additional conventions for a
     specific subset of NUWG data, for example time series data, the
     description of the additional conventions might be stored in the
     `NUWG/Time_series/' subdirectory, and files that adhered to these
     additional conventions would use the global `Conventions'
     attribute with value `"NUWG/Time_series"', implying that this file
     adheres to the NUWG conventions and also to the additional NUWG
     time-series conventions.

Units conventions
=================

At Unidata, we have developed a units library to convert between
formatted and binary forms of units specifications and perform unit
algebra on the binary form.  A compressed tar file for the library is
available from the file `pub/netcdf/udunits.tar.Z' in the anonymous FTP
directory of `ftp.unidata.ucar.edu'.

The following are examples of units strings that can be interpreted by
the `utScan()' function in the Unidata units library:

     10 kilogram.meters/seconds2
     10 kg-m/sec2
     10 kg m/s^2
     10 kilogram meter second-2
     (PI radian)2
     degF
     100rpm
     geopotential meters
     33 feet water
     milliseconds since 1992-12-31 12:34:0.1 -7:00

A unit is specified as an arbitrary product of constants and unit-names
raised to arbitrary integral powers.  Division is indicated by a slash
`/'.  Multiplication is indicated by whitespace, a period `.', or a
hyphen `-'.  Exponentiation is indicated by an integer suffix or by the
exponentiation operators `^' and `**'.  Parentheses may be used for
grouping and disambiguation.  The timestamp in the last example is
handled as a special case.

Arbitrary Galilean transformations (i.e. *y = ax + b*) are allowed.  In
particular, temperature conversions are correctly handled.  The
specification:

     degF @ 32

indicates a Fahrenheit scale with the origin shifted to thirty-two
degrees Fahrenheit (i.e. to zero Celsius).  Thus, the Celsius scale is
equivalent to the following unit:

     1.8 degF @ 32

Note that the origin-shift operation takes precedence over
multiplication.  In order of increasing precedence, the operations are
division, multiplication, origin-shift, and exponentiation.

`utScan()' understands all the SI prefixes (e.g. "mega" and "milli")
plus their abbreviations (e.g. "M" and "m").

The function `utPrint()' always encodes a unit specification one way.
To reduce misunderstandings, it is recommended that this encoding style
be used as the default.  In general, a unit is encoded in terms of
basic units, factors, and exponents.  Basic units are separated by
spaces, and any exponent directly appends its associated unit.
The above examples would be encoded as follows:

     10 kilogram meter second-2
     9.8696044 radian2
     0.555556 kelvin @ 255.372
     10.471976 radian second-1
     9.80665 meter2 second-2
     98636.5 kilogram meter-1 second-2
     0.001 seconds since 1992-12-31 19:34:0.1000 UTC

(Note that the Fahrenheit unit is encoded as a deviation, in fractional
kelvins, from an origin at 255.372 kelvin, and that the time in the
last example has been referenced to UTC.)

The database for the units library is a formatted file containing unit
definitions and is used to initialize this package.  It is the first
place to look to discover the set of valid names and symbols.

The format for the units-file is documented internally and the file may
be modified by the user as necessary.  In particular, additional units
and constants may be easily added (including variant spellings of
existing units or constants).

`utScan()' is case-sensitive.  If this causes difficulties, you might
try making appropriate additional entries to the units-file.

Some unit abbreviations in the default units-file might seem
counter-intuitive.  In particular, note the following:

     For       Use               Not     Which Instead Means

     Celsius   `Celsius'         `C'     coulomb
     gram      `gram'            `g'
     gallon    `gallon'          `gal'
     radian    `radian'          `rad'
     Newton    `newton' or `N'   `nt'    nit (unit of photometry)

If there is a choice of which units to use when writing a netCDF file,
we recommend using full names of SI units, where applicable.

If a code table must be referenced, e.g. WMO table 4677 for weather,
there are several possibilities.  One is to provide a reference to the
code table as the value of the units attribute, e.g. `WX:units = "WMO
table 4677"'.  Alternatively, if the table is short, you can include it
in the file as a variable and reference the variable for the units.  A
third possibility is to refer to a header file that contains the code
table, e.g. `config:units = "enumerated in rarep.h"'.
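Several of the encoded forms listed earlier in this section can be
reproduced with plain arithmetic.  The sketch below does not call the
units library; it only recomputes the factors for `degF', `100rpm',
`(PI radian)2', and the timestamp example, using the standard
definitions of those units.  The variable names are illustrative.

```python
import math
from datetime import datetime, timedelta, timezone

# degF: one Fahrenheit degree is 5/9 kelvin, and 0 degF lies
# 459.67 Fahrenheit degrees above absolute zero.
degF_scale = 5.0 / 9.0
degF_origin = 459.67 * 5.0 / 9.0

# 100rpm: one revolution is 2*pi radians, one minute is 60 seconds.
rpm100 = 100 * 2 * math.pi / 60

# (PI radian)2
pi_squared = math.pi ** 2

# Timestamp: 12:34:00.1 on 1992-12-31 in a zone seven hours behind UTC,
# re-expressed in UTC.
local = datetime(1992, 12, 31, 12, 34, 0, 100000,
                 tzinfo=timezone(timedelta(hours=-7)))
utc = local.astimezone(timezone.utc)

print(f"{degF_scale:.6f} kelvin @ {degF_origin:.3f}")
print(f"{rpm100:.6f} radian second-1")
print(f"{pi_squared:.7f} radian2")
print(utc.isoformat())
```

The printed factors agree with the `utPrint()' encodings shown earlier,
which is a quick way to convince oneself that the origin-shift and
scaling rules are being read correctly.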

The handling of time:
=====================

The handling of time can be tricky -- especially if one wants generic
programs to handle time as just another dimension.  This section
details the Unidata Program Center's current thinking on the use of
time in netCDF databases and programs.

The UPC has enhanced the udunits(3) library so that it understands
temporal unit specifications like the following:

     variables:
         double time(nobs);
             time:units = "milliseconds since 1992-9-16 10:09:55.3 -600";

The `-600' in the above specifies the time zone that is six hours
earlier than UTC on the given date (i.e. Mountain Daylight Time); this
is a common UNIX and Internet RFC convention.

You might recall that the udunits(3) library already had the concept of
origin; thus, the above is just a special syntax for specifying a
temporal origin.

This enhancement of the udunits(3) library allows unit-manipulating,
generic netCDF programs to be completely unaware of the temporal nature
of any variable -- unless, of course, they want to know (such as when
plotting a time axis, for example).

The primary objections to this scheme, as discussed on the netcdf
mailing-list, were concerns about accuracy, resolution, and range.
Fortunately, the udunits(3) library contains sufficient flexibility to
satisfy these requirements through the judicious choice of units.  For
example, a time-series of temperature measurements taken once every 20
minutes can have the following:

     double time(nobs);
         time:units = "20minutes since 1992-9-16 10:00 -600";

This would cause the double-precision `time' variable to contain
integral values, each of which would represent the number of 20-minute
intervals since the beginning of the data.
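The 20-minute encoding above can be sketched without the udunits(3)
library.  In this illustration the origin is transcribed by hand from
the units string rather than parsed, and the names `ORIGIN', `STEP',
`encode', and `decode', as well as the observation time, are
hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Origin transcribed from "20minutes since 1992-9-16 10:00 -600":
# 10:00 in a zone six hours behind UTC.
ORIGIN = datetime(1992, 9, 16, 10, 0, tzinfo=timezone(timedelta(hours=-6)))
STEP = timedelta(minutes=20)


def encode(when):
    """Number of 20-minute intervals between the origin and `when`."""
    return (when - ORIGIN) / STEP   # timedelta / timedelta gives a float


def decode(count):
    """Inverse of encode: the instant `count` intervals after the origin."""
    return ORIGIN + count * STEP


# Hypothetical observation taken at 17:40 UTC on the same day.
obs = datetime(1992, 9, 16, 17, 40, tzinfo=timezone.utc)
n = encode(obs)
```

Because the observations fall on whole 20-minute steps, the stored
values are whole numbers, and decoding them recovers the original
instants exactly.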

If you recall that floating-point variables represent integral values
exactly on all platforms (up to the limit of their precision), and that
double-precision netCDF values have at least 10 decimal digits of
precision, then you should see how the above example can exactly
represent 10^10 observations, or approximately 380,000 years' worth of
data.

Coordinate Systems
==================

Conventions for representing coordinate systems and transformed
coordinate systems, e.g., satellite navigations and map projections,
are under design, and will eventually be part of a Unidata `udgeoref'
library.  For discussion of a preliminary design, see `Fulker, D. W.,
Unidata Strawman for Storing Earth-Referencing Data, Seventh
International Conference on Interactive Information and Processing
Systems for Meteorology, Oceanography, and Hydrology, January 1991'.

Conventions for representing trajectories and irregularly positioned
soundings so as to distinguish them from one another and from more
regular data arrangements, such as grids and images, are also discussed
in the above referenced paper.