NetCDF Subset Service Reference


Please note that the interface described here is still a prototype, and subject to change.

Please send comments to the THREDDS email group (preferred) or to John Caron.

Contents:

  1. Overview
  2. Subsetting Parameters (summary)
  3. Dataset Descriptions
  4. Reference
  5. REST Design

Overview

The NetCDF Subset Service is a web service for subsetting CDM scientific datasets. The subsetting is specified using earth coordinates, such as lat/lon bounding boxes and date ranges, rather than index ranges that refer to the underlying data arrays. The data arrays are subsetted but not resampled or reprojected, and preserve the resolution and accuracy of the original dataset.

A Dataset is described by a Dataset Description XML document, which describes the dataset in enough detail to enable a programmatic client to form valid data requests.

The NetCDF Subset Service may return netCDF binary files (using CF-1.0 when possible), XML, or ASCII, depending on the request and the dataset.

The NetCDF Subset Service uses HTTP GET with key-value pairs (KVP). The service interface tries to follow REST design, as well as Google/KML and W3C XML Schema Datatypes when applicable.

Currently both Grids and Station data can be used with this service. See:

 

Subsetting Parameters (summary)

A. Specify variables

The list of valid variables is available from the Dataset Description.

Examples:

Variable names with spaces or other illegal characters must be escaped.
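As a minimal sketch of the escaping rule above, Python's standard library can percent-encode a name before it is placed in a query string. The variable name used here is hypothetical and does not come from any particular dataset.

```python
from urllib.parse import quote

def escape_name(name: str) -> str:
    """Percent-encode a variable or station name for use in a query string."""
    # safe="" also encodes "/", which quote() would otherwise leave untouched.
    return quote(name, safe="")

# Hypothetical variable name containing spaces.
escaped = escape_name("Temperature at surface")
```

The same helper applies to station names, which are subject to the same rule.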

B. Specify spatial extent

Latitude and longitude values are specified in decimal degrees north and east, respectively.

1. Specify lat/lon bounding box

Specify all of these parameters (order does not matter):

The bounding box has west as its west edge and includes all points going east until the east edge. Longitudes must be given in degrees east, may be positive or negative, and are interpreted modulo 360. Therefore, when the box crosses the dateline, the west edge may be greater than the east edge. Examples:
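The modulo-360 rule can be illustrated with a short Python sketch. This is not the server's implementation; it only shows one way to test whether a longitude falls in a box that starts at the west edge and runs eastward to the east edge. The function name is hypothetical.

```python
def lon_in_bbox(lon: float, west: float, east: float) -> bool:
    """Return True if lon lies between west and east, going eastward from west."""
    # Measure both the box width and the point's offset eastward from the
    # west edge, reducing each modulo 360.
    span = (east - west) % 360.0
    offset = (lon - west) % 360.0
    # Assumption: edges that coincide modulo 360 give a zero-width box here;
    # how the service treats that case is not stated in this document.
    return offset <= span
```

With west=170 and east=-170 the box crosses the dateline and is 20 degrees wide, so a longitude of 180 is inside it and a longitude of 0 is not.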

2. Specify lat/lon point

The requested point must lie within the dataset spatial range. For observations, the station closest to the requested point will be used. For grids, the grid cell containing the requested point will be used.
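For station data, the selection of the closest station might look like the following sketch. The distance measure (great-circle distance on a sphere) is an assumption, since this document does not say how closeness is computed, and the station identifiers and coordinates are made up for illustration.

```python
import math

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometers on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def closest_station(lat, lon, stations):
    """stations is a list of (id, lat, lon) tuples; return the nearest id."""
    return min(stations, key=lambda s: great_circle_km(lat, lon, s[1], s[2]))[0]

# Hypothetical stations, not taken from a real dataset.
stations = [("AAA", 40.0, -105.0), ("BBB", 35.0, -100.0)]
nearest = closest_station(39.5, -104.5, stations)
```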

Examples:

3. Specify station list

This can only be used for station datasets. The list of valid stations is available from the Dataset Description. Station names with spaces or other illegal characters must be escaped.

Examples:

4. Specify horizontal stride (Grid Subsetting only)

You can optionally take only every nth point in both the x and y dimensions.

Example:
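The effect of a horizontal stride on the data can be sketched in Python with plain lists. The grid below is synthetic, and the function name is hypothetical; it only illustrates what "every nth point in x and y" selects.

```python
def apply_stride(grid, n):
    """Keep every nth row and every nth column of a 2D grid (a list of rows)."""
    if n < 1:
        raise ValueError("stride must be a positive integer")
    return [row[::n] for row in grid[::n]]

# Synthetic 4 x 6 grid whose values encode their (y, x) index as 10*y + x.
grid = [[10 * y + x for x in range(6)] for y in range(4)]
strided = apply_stride(grid, 2)
```

A stride of 2 keeps rows 0 and 2 and columns 0, 2 and 4, so the result has one quarter of the original points.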

 

C. Specify time

Use one of the following methods:

1. Time range

Specify 2 of these 3 parameters (order does not matter):

The intersection of the requested time range with the dataset time range will be returned.
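The "two of three" rule and the intersection with the dataset time range can be sketched as follows. The argument names start, end and duration are placeholders for the three request parameters, and the dataset range is invented for the example.

```python
from datetime import datetime, timedelta

def resolve_range(start=None, end=None, duration=None):
    """Derive (start, end) from exactly two of start, end, duration."""
    given = sum(v is not None for v in (start, end, duration))
    if given != 2:
        raise ValueError("specify exactly two of start, end, duration")
    if start is None:
        start = end - duration
    elif end is None:
        end = start + duration
    return start, end

def intersect(requested, dataset):
    """Return the overlap of two (start, end) ranges, or None if disjoint."""
    lo = max(requested[0], dataset[0])
    hi = min(requested[1], dataset[1])
    return (lo, hi) if lo <= hi else None

# Hypothetical dataset range and a request that runs past its end.
dataset_range = (datetime(2007, 3, 1), datetime(2007, 3, 31))
requested = resolve_range(start=datetime(2007, 3, 29), duration=timedelta(days=5))
returned = intersect(requested, dataset_range)
```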

Examples:

2. Time point

The requested time point must lie within the dataset time range. The time slice/point closest to the requested time will be returned.
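Selecting the closest time slice can be sketched as below. The list of available times is hypothetical and assumed to be sorted; how the service breaks ties between two equally close times is not stated here, so this sketch simply keeps the earlier one.

```python
from datetime import datetime

def closest_time(requested, available):
    """Return the entry of available (sorted datetimes) nearest to requested."""
    if not (available[0] <= requested <= available[-1]):
        raise ValueError("requested time is outside the dataset time range")
    # min() returns the first minimal element, so ties go to the earlier time.
    return min(available, key=lambda t: abs(t - requested))

# Hypothetical six-hourly time axis.
times = [datetime(2007, 3, 29, h) for h in (0, 6, 12, 18)]
chosen = closest_time(datetime(2007, 3, 29, 7, 30), times)
```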

Examples:

D. Specify the return format

Specify the return format(s) that you want by using the accept parameter:

accept=mime-type[,mime-type][,mime-type]

The list of possible return formats varies depending on the dataset, and can be found in the Dataset Description Document. Your request specifies the list of acceptable types; if none of them is valid for the dataset, a 400 "Bad Request" HTTP status is returned. If you specify multiple mime-types, the server will choose one of them.

The server returns the actual return format in the Content-Type header. Examples:

Query examples:

The possible mime-types and aliases:

Mime Type               Synonyms
text/plain              raw, ascii
application/xml         xml
text/csv                csv
text/html               html
application/x-netcdf    netcdf

The list of actual return formats depends on the dataset, and can be found in the Dataset Description Document.
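The alias table and the selection rule can be sketched in Python as follows. This is an illustration, not the server's code: the document only says the server will choose one of the requested types, so picking the first supported one is an assumption, and the function name is hypothetical.

```python
# Synonyms from the table above, mapped to their mime-types.
ALIASES = {
    "raw": "text/plain",
    "ascii": "text/plain",
    "xml": "application/xml",
    "csv": "text/csv",
    "html": "text/html",
    "netcdf": "application/x-netcdf",
}

def choose_format(accept, supported):
    """Return the first requested mime-type the dataset supports, else None."""
    for token in accept.split(","):
        token = token.strip()
        mime = ALIASES.get(token, token)
        if mime in supported:
            return mime
    # No acceptable type: the service would answer 400 "Bad Request".
    return None

# Hypothetical list of formats advertised by a Dataset Description.
supported = ["text/csv", "application/x-netcdf"]
selected = choose_format("xml,csv", supported)
```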

 

E. Specify the vertical coordinate

You may specify a vertical coordinate. Example:

F. Adding Lat/Lon arrays to the file

If the grid is a lat/lon grid, the lat and lon coordinates will be automatically included (as 1D coordinate variables). When the grid is on a projection, the lat/lon information will not be included unless the query parameter addLatLon is present. In that case, the lat and lon coordinates will be calculated and included in the file (as 2D variables).

The 4 corners of the lat/lon bounding box are converted into projection coordinates, then the smallest rectangle including those 4 points is used.
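The corner conversion can be sketched as follows. The projection is passed in as a function, and the one used in the example is a made-up toy mapping chosen only so the arithmetic is easy to follow; it is not a real map projection and not what the server uses.

```python
def projected_rect(north, south, east, west, project):
    """Project the 4 lat/lon corners and return (min_x, min_y, max_x, max_y)."""
    corners = [(north, west), (north, east), (south, west), (south, east)]
    xs, ys = zip(*(project(lat, lon) for lat, lon in corners))
    return min(xs), min(ys), max(xs), max(ys)

def toy_project(lat, lon):
    """Hypothetical projection: x shrinks with latitude, y equals latitude."""
    return lon * (100 - lat) / 100, lat

rect = projected_rect(north=40, south=30, east=-90, west=-110, project=toy_project)
```

Because the projected corners no longer form an axis-aligned box, the enclosing rectangle is generally larger than the region originally requested.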

 

Dataset Descriptions

Each dataset has an XML document called the Dataset Description Document. These are intended to perform the same function as OGC GetCapabilities or Atom Introspection, that is, provide clients with the necessary information to formulate a valid request and send it to the server. The content of these documents is still evolving.

Station Observation Dataset

A Station Observation Dataset is a collection of time series of observations at named locations called stations.

The dataset is described by a stationObsDataset document, which in turn points to the list of valid stations in a separate stationCollection document. The stationCollection document can be quite large, and caching on the client (e.g., using the If-Modified-Since header) is an important optimization.
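A client-side conditional request can be sketched with the Python standard library. The URL below is a placeholder, since the actual location of a stationCollection document comes from the stationObsDataset document, and the function name is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request

def conditional_request(url, last_fetched):
    """Build a GET request that asks for the body only if it has changed."""
    # HTTP dates are expressed in GMT, e.g. "Thu, 01 Mar 2012 12:00:00 GMT".
    stamp = format_datetime(last_fetched.astimezone(timezone.utc), usegmt=True)
    return Request(url, headers={"If-Modified-Since": stamp})

# Placeholder URL and the time the cached copy was fetched.
req = conditional_request(
    "http://servername:8080/thredds/ncss/path/dataset/stations.xml",
    datetime(2012, 3, 1, 12, 0, tzinfo=timezone.utc),
)
```

When such a request is sent with urllib.request.urlopen, a 304 "Not Modified" response is raised as an HTTPError, which the client can catch in order to reuse its cached copy.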

Grid Dataset

A Grid Dataset is a collection of Grids which have horizontal (x,y) coordinates, and optional vertical and time coordinates. Grid data points next to each other in index space are next to each other in coordinate space.


Reference

W3C Time Duration

The lexical representation for duration is the [ISO 8601] extended format PnYnMnDTnHnMnS, where nY represents the number of years, nM the number of months, nD the number of days, 'T' is the date/time separator, nH the number of hours, nM the number of minutes and nS the number of seconds. The number of seconds can include decimal digits to arbitrary precision.

The values of the Year, Month, Day, Hour and Minutes components are not restricted but allow an arbitrary unsigned integer, i.e., an integer that conforms to the pattern [0-9]+. Similarly, the value of the Seconds component allows an arbitrary unsigned decimal. Following [ISO 8601], at least one digit must follow the decimal point if it appears. That is, the value of the Seconds component must conform to the pattern [0-9]+(\.[0-9]+)?. Thus, the lexical representation of duration does not follow the alternative format of § 5.5.3.2.1 of [ISO 8601].

An optional preceding minus sign ('-') is allowed, to indicate a negative duration. If the sign is omitted a positive duration is indicated. See also ISO 8601 Date and Time Formats (§D).

For example, to indicate a duration of 1 year, 2 months, 3 days, 10 hours, and 30 minutes, one would write: P1Y2M3DT10H30M. One could also indicate a duration of minus 120 days as: -P120D.

Reduced precision and truncated representations of this format are allowed provided they conform to the following:

For example, P1347Y, P1347M and P1Y2MT2H are all allowed; P0Y1347M and P0Y1347M0D are allowed. P-1347M is not allowed although -P1347M is allowed. P1Y2MT is not allowed.
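The lexical form described above can be checked with a regular expression. The following Python sketch is not the service's parser; it assumes two rules inferred from the examples above, namely that at least one component must be present and that 'T' may appear only when a time component follows it. The function name is hypothetical.

```python
import re

# Extended format PnYnMnDTnHnMnS with an optional leading minus sign.
_DURATION = re.compile(
    r"(?P<sign>-)?P"
    r"(?:(?P<years>[0-9]+)Y)?"
    r"(?:(?P<months>[0-9]+)M)?"
    r"(?:(?P<days>[0-9]+)D)?"
    r"(?P<time>T"
    r"(?:(?P<hours>[0-9]+)H)?"
    r"(?:(?P<minutes>[0-9]+)M)?"
    r"(?:(?P<seconds>[0-9]+(?:\.[0-9]+)?)S)?"
    r")?"
)
DATE_ITEMS = ("years", "months", "days")
TIME_ITEMS = ("hours", "minutes", "seconds")

def parse_duration(text):
    """Return a dict of the components present, or None if text is invalid."""
    match = _DURATION.fullmatch(text)
    if match is None:
        return None
    found = {k: v for k, v in match.groupdict().items()
             if k in DATE_ITEMS + TIME_ITEMS and v is not None}
    if not found:
        return None  # assumed rule: at least one number and designator
    if match["time"] is not None and not any(k in found for k in TIME_ITEMS):
        return None  # assumed rule: 'T' requires a time component after it
    parsed = {k: (float(v) if k == "seconds" else int(v)) for k, v in found.items()}
    parsed["negative"] = match["sign"] is not None
    return parsed
```

Under these assumptions, P1Y2MT2H and -P1347M are accepted, while P-1347M and P1Y2MT are rejected, matching the examples above.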

See XML Schema duration for full details.

W3C Dates

For our purposes, an ISO Date can be a dateTime or a date:

A dateTime has the form: '-'? yyyy '-' mm '-' dd 'T' hh ':' mm ':' ss ('.' s+)? (zzzzzz)?

where

For example, 2002-10-10T12:00:00-05:00 (noon on 10 October 2002, Central Daylight Savings Time as well as Eastern Standard Time in the U.S.) is 2002-10-10T17:00:00Z, five hours later than 2002-10-10T12:00:00Z.
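The time zone arithmetic in this example can be reproduced with Python's datetime module, which reads the offset form of a dateTime directly. This is only a client-side illustration of the conversion.

```python
from datetime import datetime, timezone

# Noon with a -05:00 offset, as in the example above.
local_noon = datetime.fromisoformat("2002-10-10T12:00:00-05:00")

# The same instant expressed in UTC.
as_utc = local_noon.astimezone(timezone.utc)

# Compare with noon UTC on the same date.
utc_noon = datetime(2002, 10, 10, 12, 0, tzinfo=timezone.utc)
hours_later = (as_utc - utc_noon).total_seconds() / 3600
```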

A date is the same as a dateTime without the time part: '-'? yyyy '-' mm '-' dd zzzzzz?

See XML Schema dateTime and date for full details.


REST Design

1. What are the resources/URIs?

The resources are THREDDS datasets. The resource URIs can be discovered in a THREDDS catalog, by looking for datasets that use the NetcdfSubset Service type. Generally these resource URLs look like:

http://servername:8080/thredds/ncss/{path/dataset}

http://servername:8080/thredds/ncss/grid/{path/dataset}

Typically the user wants a subset of the dataset. This is considered a view of a resource, rather than a separate resource:

http://servername:8080/thredds/ncss/{path/dataset}?{subset}

A desired representation of the resource is specified using the accept parameter. Again, different representations are not considered separate resources. Following the Accept HTTP header, accept takes a comma-delimited list of mime-types (or aliases), but does not allow wildcards (*) or q parameters.

http://servername:8080/thredds/ncss/{path/dataset}?{subset}&accept={mime-type}
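A request URL of this shape can be assembled with the Python standard library, as in the sketch below. The server name and dataset path are the placeholders used above. The parameter names west, east and accept appear in this document; north and south are assumed names for the remaining bounding box edges.

```python
from urllib.parse import urlencode

def subset_url(base, params):
    """Append key-value pairs to a dataset URL as an HTTP GET query string."""
    # safe="," keeps the comma-delimited accept list readable; other
    # characters that are illegal in a query string are percent-encoded.
    return base + "?" + urlencode(params, safe=",")

url = subset_url(
    "http://servername:8080/thredds/ncss/path/dataset",
    {"north": 40, "south": 30, "west": -110, "east": -90, "accept": "xml,csv"},
)
```

Because the order of the key-value pairs does not matter to the service, the dictionary can be built in whatever order is convenient.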

2. What's the format/representation?

The dataset itself has two representations:

Results of a subset request can be:

  1. The netCDF binary file will be encoded using CF conventions when possible, and when not possible, the encoding will be submitted to CF for approval.
  2. The XML, ASCII, and CSV files are intended for use only for small extractions of data, and are generally missing some or all of the metadata of the dataset.
  3. Multiple accept values can be specified, e.g., accept=xml,csv (comma-delimited, no spaces). The server will select from that list.

Representation types

Mime Type               Synonyms
text/plain              raw, ascii
application/xml         xml
text/csv                csv
text/html               html
application/x-netcdf    netcdf

 

3. What are the Methods?

Only the GET method is allowed.

4. What Status codes can be returned?

REST Resources:


This document is maintained by John Caron and was last updated in March 2012.