UNIDATA/UCAR

the THREDDS project


THREDDS: The Service model

Web services technology for data discovery, access, processing and visualisation

 

 

 

Stefano Nativi

 

 

 

Boulder, August 2002

 

 


rationale

why should we do that?

✓      If you are not interoperable, you are out of the market (or you are Microsoft);

✓      In ICT everything changes very fast; you must be ready to change at the same pace without redoing everything;

✓      The most important data source is always the one your system does not support…

✓      Users often don’t know what they want, but they certainly know how to extend your system…

✓      To decouple means to maintain better (“divide et impera”);

✓      People ask for standards (so they can decide not to follow them);

✓      Accessibility matters not only for disabled people but also for disabled software…

✓      Very soon users will ask to access your software via wireless terminals… do not panic…

✓      The Web is ubiquitous, and so will web services be.

 

 

 

 


The Service Matrix

One of the main THREDDS objectives is to serve heterogeneous Communities so that they can discover and access UNIDATA/UCAR legacy datasets.

Among such Communities it is possible to distinguish:

1.       the Digital Library Community;

2.       the GIS-enabled Community;

3.       the Research and Educational Community;

4.       the Decision Makers Community.

Another important THREDDS objective is to provide these Communities’ applications with datasets along with the metadata they need.

 

Extending these objectives into a general framework, it is possible to introduce the following matrix of Services and Communities’ applications; it suggests which services each kind of application is mainly interested in.

 

Community application                 Visualisation   Metadata      Catalogue   Data/protocol   Data access
(rows) / Service category (columns)   services        abstraction   services    processing      services
                                                      services                  services

Educational & Research (SINOTS)       X               -             X           X               X
Digital Library (DLESE)               -               X             X           X               X
Decision Maker (Flash Flood)          -               -             -           X               X
Any Inexpert Web-based application    X               -             X           X               X
GIS-based application                 -               X             X           X               X

 

At least one other important category of services is not reported in the matrix: the Security services, which are useful to both requesters and providers of services.

 

The Web Service Approach

One possible approach to delivering the services reported in the previous matrix is to utilise Web Services technology.

The web service approach is very simple, as depicted in the following schema:

The service requesters are heterogeneous software applications (e.g. the software applications of the heterogeneous THREDDS User Community) which use the “Discovery & Access services” to discover and access the services provided by one or more service providers.

A service provider may access datasets stored on local servers or external systems, or may itself become a service requester, acting as a service broker.
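The roles just described can be sketched in a few lines. The following is only an illustrative, in-memory model under stated assumptions: the class and function names (`DiscoveryRegistry`, `make_provider`, `make_broker`), the category labels and the sample values are all hypothetical, and no real web service technology is involved.

```python
from typing import Callable, Dict, List

# In this sketch a "service" is any callable that maps a request to a response.
Service = Callable[[dict], dict]


class DiscoveryRegistry:
    """Hypothetical stand-in for the "Discovery & Access services"."""

    def __init__(self) -> None:
        self._services: Dict[str, List[Service]] = {}

    def publish(self, category: str, service: Service) -> None:
        self._services.setdefault(category, []).append(service)

    def discover(self, category: str) -> List[Service]:
        return list(self._services.get(category, []))


def make_provider(source: str, datasets: Dict[str, list]) -> Service:
    """Build a provider that answers only from its own datasets."""
    def provider(request: dict) -> dict:
        name = request["dataset"]
        if name in datasets:
            return {"found": True, "source": source, "data": datasets[name]}
        return {"found": False}
    return provider


def make_broker(own: Service, registry: DiscoveryRegistry, upstream: str) -> Service:
    """Build a provider that becomes a requester when it cannot answer itself."""
    def broker(request: dict) -> dict:
        answer = own(request)
        if answer["found"]:
            return answer
        for service in registry.discover(upstream):
            answer = service(request)
            if answer["found"]:
                return answer
        return {"found": False}
    return broker


registry = DiscoveryRegistry()
registry.publish("external", make_provider("external", {"pressure": [1013.2, 1009.8]}))
local = make_provider("local", {"temperature": [280.1, 281.4]})
registry.publish("data-access", make_broker(local, registry, "external"))

# The requester only needs to know the registry and a service category.
data_access = registry.discover("data-access")[0]
print(data_access({"dataset": "temperature"})["source"])  # local
print(data_access({"dataset": "pressure"})["source"])     # external
```

The requester never learns whether the answer came from a local server or was brokered from another provider, which is the decoupling the approach aims at.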

 

The THREDDS SERVICES ARCHITECTURE

According to the previous general architecture and the introduced matrix of services, the THREDDS service architecture can be conceived as depicted in the following schema.

The pale blue services are the existing services.

 



Service Discovery & access services

THREDDS Catalogue

The THREDDS Catalogue service implements both “Service Discovery & Access” and “Data Catalogue” functionalities.

Client applications may utilise the THREDDS Catalogue service to discover the existence of UNIDATA/UCAR WCS (Web Coverage Service) instances (i.e. service discovery) and/or to get catalogues of accessible UNIDATA/UCAR datasets (i.e. catalogue services).

UDDI registry

To allow any web-based application to discover and access UNIDATA/UCAR public web services, including clients which don’t know how to interact with the THREDDS Catalogue, a UDDI registry could be implemented: UDDI is the most popular protocol used to discover and access web services.

 

data catalogue services

THREDDS Catalogue

The THREDDS Catalogue service implements both “Service Discovery & Access” functionalities and “Data Catalogue” ones.

Clients may utilise the THREDDS Catalogue service to discover the existence of UNIDATA/UCAR WCSs (i.e. service discovery) and to get catalogues of accessible UNIDATA/UCAR datasets (i.e. catalogue services).

ISO Catalogue Service

To allow any client application that complies with the ISO and OpenGIS Catalogue interface specifications to access UNIDATA/UCAR datasets, an ISO Catalogue service interface could be realised and deployed.

Gazetteers service

The Gazetteers service is in charge of mapping a location name (e.g. “Boulder”, “Italy”, “America”, etc.) into a coverage request to be submitted to the WCS. It is particularly useful for Digital Library (DL) client applications.

Such a service is generally based on a dictionary service; the dictionary service can be implemented internally, or an external dictionary service can be accessed through the Web.
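A minimal sketch of the name-to-request mapping follows. It assumes an internal dictionary; the `GAZETTEER` table, its approximate bounding boxes, the function name and the layout of the returned request are all hypothetical and only illustrate the idea.

```python
from typing import Dict, NamedTuple, Optional


class BoundingBox(NamedTuple):
    west: float
    south: float
    east: float
    north: float


# Hypothetical internal dictionary; the boxes are rough, illustrative values.
GAZETTEER: Dict[str, BoundingBox] = {
    "boulder": BoundingBox(-105.3, 39.96, -105.18, 40.09),
    "italy": BoundingBox(6.6, 35.5, 18.5, 47.1),
}


def coverage_request_for(place: str, coverage: str) -> Optional[dict]:
    """Map a location name to the parameters of a coverage request.

    Returns None when the name is unknown, so a caller could fall back
    to an external dictionary service.
    """
    box = GAZETTEER.get(place.strip().lower())
    if box is None:
        return None
    return {
        "request": "GetCoverage",
        "coverage": coverage,
        "bbox": ",".join(f"{value:g}" for value in box),
    }


print(coverage_request_for("Boulder", "surface_temperature"))
print(coverage_request_for("Atlantis", "surface_temperature"))  # None
```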

 

data visualisation services

IDV services

It is possible to envisage the IDV visualiser being deployed as a set of high-level web services, which could then be discovered and accessed like any other web-based service.

Such a solution would be particularly useful for other UNIDATA/UCAR services, which could leverage the IDV visualisation services instead of rewriting the same functionalities.

In the present approach to IDV utilisation, a client installs the IDV application and then uses it to visualise and navigate datasets. In the proposed web service approach, a client application discovers a dataset (or a set of datasets) through the UNIDATA/UCAR service providers, and then asks the same service providers to visualise it.

The visualisation can be as simple as receiving a GIF image or as complex as receiving a WebStart-enabled application.

WMS

Another useful service for visualising datasets can be realised by implementing the OpenGIS WMS (Web Map Service) specification. Such a service returns a pictorial image (e.g. GIF, PNG, JPG) representing the combination of a set of previously selected datasets.
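As an illustration, the sketch below builds a map request URL using the key-value parameters of the WMS 1.1.1 GetMap operation. The endpoint `http://example.invalid/wms` and the layer names are hypothetical, and nothing is sent over the network.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def wms_getmap_url(base_url, layers, bbox, width, height, image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL combining the selected layers in one image."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),             # drawn in the order given
        "STYLES": "",                           # empty value = default styles
        "SRS": "EPSG:4326",                     # geographic lon/lat coordinates
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and layer names.
url = wms_getmap_url(
    "http://example.invalid/wms",
    ["temperature", "coastlines"],
    (-110, 35, -100, 45),
    512,
    512,
)
query = parse_qs(urlsplit(url).query, keep_blank_values=True)
print(query["LAYERS"])  # ['temperature,coastlines']
```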

 

Data access services

THREDDS WCS

As far as web services are concerned, the WCS is the most useful way to access UNIDATA/UCAR datasets. Coverages are datasets with one to five dimensions.

A client application can query the WCS and obtain a description of all the Coverage datasets managed by the service; it can then select a dataset, filter it in the space, time and field dimensions, and finally retrieve it. Naturally, the client can get as many datasets as it likes (but only one per request).

The WCS supports the following formats for returned datasets: NcML, NetCDF, and GeoTIFF. Semantically speaking, the first is the most complete and the last is the least complete.

It is possible to envisage UNIDATA/UCAR implementing more than one WCS; they should all be listed and advertised through the “UNIDATA/UCAR service Discovery & Access” services (i.e. THREDDS Catalogue or UDDI registry service).
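The describe, filter and retrieve cycle can be sketched with an in-memory stand-in for the service. `ToyCoverageService`, `Sample` and the sample values are hypothetical and are not the THREDDS implementation; the point is only that each request returns one coverage, subset in space, time and field.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Sample:
    lon: float
    lat: float
    time: str                 # ISO 8601 strings in one format sort chronologically
    values: Dict[str, float]  # field name -> value


class ToyCoverageService:
    """In-memory stand-in for a WCS (hypothetical, for illustration only)."""

    def __init__(self, coverages: Dict[str, List[Sample]]) -> None:
        self._coverages = coverages

    def get_capabilities(self) -> List[str]:
        """Describe the coverages managed by the service (names only here)."""
        return sorted(self._coverages)

    def get_coverage(self, name, bbox, time_range, fields) -> List[Sample]:
        """Return ONE coverage, filtered in space, time and field dimensions."""
        west, south, east, north = bbox
        start, end = time_range
        return [
            Sample(s.lon, s.lat, s.time, {f: s.values[f] for f in fields})
            for s in self._coverages[name]
            if west <= s.lon <= east
            and south <= s.lat <= north
            and start <= s.time <= end
        ]


service = ToyCoverageService({
    "surface": [
        Sample(-105.3, 40.0, "2002-08-01T00:00Z", {"temp": 290.0, "wind": 3.0}),
        Sample(-105.3, 40.0, "2002-08-02T00:00Z", {"temp": 292.5, "wind": 5.5}),
        Sample(12.5, 41.9, "2002-08-01T00:00Z", {"temp": 301.0, "wind": 2.0}),
    ],
    "upper_air": [Sample(-105.3, 40.0, "2002-08-01T00:00Z", {"temp": 250.0})],
})

# One request per coverage: a client loops to retrieve several datasets.
for name in service.get_capabilities():
    subset = service.get_coverage(
        name,
        bbox=(-110.0, 35.0, -100.0, 45.0),
        time_range=("2002-08-01T00:00Z", "2002-08-01T23:59Z"),
        fields=["temp"],
    )
    print(name, [(s.time, s.values) for s in subset])
```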

THREDDS WFS

If UNIDATA/UCAR needs to provide access to feature-based datasets (i.e. datasets characterised by geometric elements, as is typical of GIS-based datasets), a good open solution could be to realise a WFS (Web Feature Service) interface. It acts like a WCS interface but naturally returns different metadata and data formats.

RT Source Connection Service

The Real Time Source Connection service allows a client application which needs near real-time dataset access to subscribe to and receive such data through web technologies (e.g. POST/HTTP, SOAP/HTTP, SOAP/FTP, etc.). The metadata accessed by the client are encoded in XML; the data can be encoded in a legacy format or in XML.

An example of a client application is a flash flood early warning system which accesses near real-time datasets to provide information that supports decision makers.
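The subscribe-and-receive pattern can be sketched as follows. This is a hypothetical in-process model (`ToyRealTimeSource` and the dataset type names are invented for illustration); in a deployed service the delivery callback would instead send the data to the subscriber over one of the web technologies listed above.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A delivery callback receives the XML metadata and the (legacy or XML) payload.
Deliver = Callable[[str, bytes], None]


class ToyRealTimeSource:
    """Hypothetical stand-in for the RT Source Connection service."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Deliver]] = defaultdict(list)

    def subscribe(self, dataset_type: str, deliver: Deliver) -> None:
        self._subscribers[dataset_type].append(deliver)

    def publish(self, dataset_type: str, metadata_xml: str, payload: bytes) -> int:
        """Push new data to every subscriber; return how many were notified."""
        targets = self._subscribers.get(dataset_type, [])
        for deliver in targets:
            deliver(metadata_xml, payload)
        return len(targets)


# A flash flood warning client that simply records what it receives.
received = []
source = ToyRealTimeSource()
source.subscribe("radar-rainfall", lambda meta, data: received.append((meta, data)))

source.publish("radar-rainfall", "<dataset type='radar-rainfall'/>", b"\x00\x01")
source.publish("lightning", "<dataset type='lightning'/>", b"\x02")  # no subscriber
print(len(received))  # 1
```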

 

metadata abstraction service

Ontology Map service

The Ontology Map service allows a requester to get the ontology map of the data and metadata served by the UNIDATA/UCAR service providers.

This is important for client applications which don’t know the semantics of the data and metadata accessed through the other services.

Such a service provides an abstraction of the metadata provided by the UNIDATA/UCAR services or, if you like, meta-meta-data.

DLESE VS NcML metadata abstraction service

DLESE client applications need to access metadata at a higher abstraction level than that of normal geo-physics datasets. Therefore this service (which is mainly used by other UNIDATA/UCAR services, but can also be accessed externally) is in charge of abstracting the NcML metadata content, generating the right abstraction level of dataset metadata.

This task is not easy; it should therefore start from an NcML encoding (which is supposed to contain a lot of metadata).

GIS VS NcML metadata abstraction service

GIS client applications need to access metadata characterised by given facets (e.g. in accordance with OpenGIS or ISO models). Therefore this service (which is mainly used by other UNIDATA/UCAR services, but can also be accessed externally) is in charge of encoding the right metadata facets starting from the NcML metadata content.

It can be seen as an extension of the core NcML specification.
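One facet that can be derived mechanically is the geographic extent. The sketch below assumes the latitude and longitude coordinate values have already been read from the dataset; the key names follow the ISO 19115 geographic bounding box element names, while the function itself is hypothetical. It does not handle grids that cross the 180° meridian.

```python
from typing import Dict, Sequence


def geographic_bounding_box(lat: Sequence[float], lon: Sequence[float]) -> Dict[str, float]:
    """Derive a bounding-box facet from latitude/longitude coordinate values."""
    if not lat or not lon:
        raise ValueError("both coordinate axes need at least one value")
    return {
        "westBoundLongitude": min(lon),
        "eastBoundLongitude": max(lon),
        "southBoundLatitude": min(lat),
        "northBoundLatitude": max(lat),
    }


facet = geographic_bounding_box(lat=[35.0, 40.0, 45.0], lon=[-110.0, -105.0, -100.0])
print(facet)
```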

 

Data processing service

NetCDF to NcML service

It is one of the most important services to be realised. Such a service must accomplish the following tasks:

a)       To encode NetCDF objects in XML;

b)      To encode implicit NetCDF semantics into explicit XML tags;

c)       To enrich NetCDF metadata with other metadata (i.e. to extend the present NetCDF metadata content).

Such a service is mainly used by data servers to process NetCDF datasets and generate more interoperable NcML datasets; naturally, it can also be accessed externally (like any other web service).

It is useful to underline that an NcML dataset is likely to consist of a set of metadata describing a NetCDF dataset, which is explicitly referenced.
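The three tasks can be illustrated with a small sketch that starts from a plain description of a NetCDF file's structure rather than from the file itself. The element and attribute names (`netcdf`, `dimension`, `variable`, `attribute`, `location`) are assumed from the NcML schema as later published by Unidata; the `role` attribute used to make coordinate variables explicit, the function name and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET


def to_ncml(location, dimensions, variables, extra_metadata=None):
    """Render a NetCDF structure description as an NcML-like XML string."""
    # The NetCDF dataset being described is referenced explicitly.
    root = ET.Element("netcdf", {"location": location})

    # Task (c): enrich with metadata that the NetCDF file does not carry.
    for name, value in (extra_metadata or {}).items():
        ET.SubElement(root, "attribute", {"name": name, "value": str(value)})

    # Task (a): encode the NetCDF objects (dimensions, variables) in XML.
    for name, length in dimensions.items():
        ET.SubElement(root, "dimension", {"name": name, "length": str(length)})

    for var in variables:
        elem = ET.SubElement(root, "variable", {
            "name": var["name"],
            "type": var["type"],
            "shape": " ".join(var["shape"]),
        })
        for name, value in var.get("attributes", {}).items():
            ET.SubElement(elem, "attribute", {"name": name, "value": str(value)})
        # Task (b): in NetCDF a 1-D variable named after its own dimension is
        # implicitly a coordinate variable; state it with an explicit tag.
        if var["shape"] == [var["name"]]:
            ET.SubElement(elem, "attribute", {"name": "role", "value": "coordinate"})

    return ET.tostring(root, encoding="unicode")


ncml = to_ncml(
    "temperature.nc",
    dimensions={"time": 4, "lat": 2},
    variables=[
        {"name": "time", "type": "double", "shape": ["time"],
         "attributes": {"units": "hours since 2002-08-01"}},
        {"name": "temp", "type": "float", "shape": ["time", "lat"]},
    ],
    extra_metadata={"title": "Surface temperature"},
)
print(ncml)
```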

WCS/DODS DAP Gateway

It implements a gateway from the WCS protocol to the DODS DAP protocol. It is useful for accessing the data content of the DODS system from the Web.

Such a service seems to be already funded in the framework of the DODS activities, and SOAP seems to be the technology chosen for implementing it.

WCS/LDM-IDD Gateway

It implements a gateway from the WCS protocol to the LDM-IDD system. It is useful for allowing Web-based applications to subscribe to and receive near real-time datasets through the LDM-IDD system.

Such a gateway could dramatically facilitate the development of several important systems that help decision makers to face hazardous situations (e.g. fire, flash flood, chemical dispersion, etc.).


User scenarios

 

The following user scenarios show how the previous services could be utilised to address UNIDATA/UCAR Client Community needs.

 

Geo-physics, Educational and Research Client applications

This user scenario considers the case of a client application which is:

a)       Aware of the existence of a THREDDS catalogue service;

b)      Able to understand and process NetCDF and/or NcML data and metadata;

c)       A web-based application;

d)      Able to visualize accessed datasets.

 

Scenario:

a)       A Client application queries the THREDDS catalogue service to discover and access available UNIDATA/UCAR WCSs; the interaction occurs through HTML/HTTP;

b)      The Client application queries one or more UNIDATA/UCAR WCSs and receives their capabilities; the interaction occurs through SOAP/HTTP or POST/HTTP;

c)       The Client application selects useful sub-datasets (i.e. filtering datasets in the space, time and field dimensions); this task runs on the client station;

d)      The Client application requests the selected sub-datasets from the WCS; the interaction occurs through SOAP/HTTP or POST/HTTP;

e)       The WCS gets the requested sub-datasets from the available data servers (i.e. NetCDF or NcML data servers) or from the DODS system by means of a WCS/DODS-DAP gateway;

f)        The WCS returns the requested sub-dataset; the interaction occurs through SOAP/HTTP or POST/HTTP; the returned dataset format can be NetCDF, NcML or GeoTIFF.

 



 

real-time and decision maker support applications

This user scenario considers the case of a client application which is:

a)       Aware of the existence of a web service for accessing datasets in near real-time (i.e. the RT Source Connection service);

b)      Able to understand and process NetCDF and/or NcML and/or GeoTIFF data and metadata;

c)       A web-based application;

d)      Able to visualize accessed datasets.

 

Scenario:

a)       A Client application directly asks the “RT Source Connection” service to subscribe it to a particular type of dataset, to be dispatched in near real-time to the application itself (or to any other URI); the interaction occurs through SOAP/HTTP or POST/HTTP;

b)      The “RT Source Connection” service redirects the request to the UNIDATA/UCAR LDM-IDD system by means of a “WCS/LDM-IDD gateway”, which acts as a proxy of the LDM-IDD system; the interaction occurs through SOAP/HTTP or POST/HTTP;

c)       As soon as the “RT Source Connection” service receives the requested information from the LDM-IDD system, it routes it to the client application which subscribed for it; the interaction occurs through SOAP/HTTP or POST/HTTP; the returned dataset format can be NetCDF, NcML or GeoTIFF.

 

 

 



 

Digital library applications

This user scenario considers the case of a client application which is:

a)       Aware of the existence of a THREDDS catalogue service (or an ISO standard catalogue service);

b)      Able to understand and process GeoTIFF and/or NcML data and metadata;

c)       A web-based application;

d)      Able to visualize accessed datasets.

 

Scenario:

a)       A Client application queries the THREDDS catalogue service (or an ISO standard catalogue service, or a Gazetteers service) to discover and access available WCSs at UNIDATA/UCAR (the Gazetteers service actually acts in a smarter way and avoids this stage); the interaction occurs through HTML/HTTP;

b)      The Client application (or the Gazetteers service) queries one or more UNIDATA/UCAR WCSs and gets their capabilities; the interaction occurs through SOAP/HTTP or POST/HTTP;

c)       The Client application (or the Gazetteers service) selects the useful sub-datasets (i.e. filtering datasets in the space, time and field dimensions); this task runs on the client station (or is performed automatically by the Gazetteers service);

d)      The Client application (or the Gazetteers service) requests the sub-datasets from the WCS; the interaction occurs through SOAP/HTTP or POST/HTTP;

e)       The WCS gets the requested sub-datasets from the available data servers (i.e. NetCDF or NcML data servers) or from the DODS system by means of a WCS/DODS-DAP gateway;

f)        The WCS asks the “DLESE VS NcML metadata abstraction” service to abstract the NcML metadata; that service receives an NcML dataset and returns an extended NcML dataset;

g)       The WCS (or the Gazetteers service) returns the requested (or automatically identified) sub-dataset; the interaction occurs through SOAP/HTTP or POST/HTTP; the returned dataset format may be either NcML (with the right level of metadata abstraction) or simply GeoTIFF.

 

 

 



 

GIS-based applications

This user scenario considers the case of a client application which is:

a)       Not able to interact with the UNIDATA/UCAR THREDDS Catalogue service, but able to interact with OpenGIS WCS services (or WFS services);

b)      Not able to decode NcML or NetCDF formats, but able to understand and process GeoTIFF data and metadata;

c)       A web-based application;

d)      Able to visualize accessed datasets.

 

Scenario:

a)       A Client application queries one or more UNIDATA/UCAR WCSs and gets their capabilities; the interaction occurs through SOAP/HTTP or POST/HTTP;

b)      The Client application selects the useful sub-datasets (i.e. filtering datasets in the space, time and field dimensions); this task runs on the client station;

c)       The Client application requests the sub-datasets from the WCS; the interaction occurs through SOAP/HTTP or POST/HTTP;

d)      The WCS gets the requested sub-datasets from the available data servers (i.e. NetCDF or NcML data servers) or from the DODS system by means of a WCS/DODS-DAP gateway;

e)       The WCS asks the “GIS VS NcML metadata abstraction” service to generate the GIS facets from the obtained NcML metadata; that service receives an NcML dataset and returns an extended NcML dataset;

f)        The WCS returns the requested sub-dataset; the interaction occurs through SOAP/HTTP or POST/HTTP; the returned dataset format is GeoTIFF.

 

 



 

inexpert web-based applications

This user scenario considers the case of a client application which is:

a)       Not able to interact with the UNIDATA/UCAR THREDDS Catalogue service, but able to interact with a UDDI registry and an OpenGIS WCS service (or a WMS service);

b)      Not able to decode NcML, NetCDF, or GeoTIFF formats;

c)       Not able to visualize accessed datasets;

d)      A web-based application;

 

Scenario:

a)       A Client application queries the UNIDATA/UCAR UDDI registry to discover and access available services; it discovers three interesting services:
1 - the WCS service, to get coverages;
2 - the WMS service, to get maps;
3 - the Ontology Map service, to show ontologies of the data and metadata managed by UNIDATA/UCAR;

the interaction occurs through UDDI/HTTP;

b)      The Client queries the Ontology Map service to get data and metadata semantics; the interaction occurs through POST/HTTP or SOAP/HTTP;

c)       The Ontology Map service provider generates the ontology maps and asks the IDV service provider to return them in either a GIF or an IDV format. The first choice utilises an HTML/HTTP interaction; the second one uses the Sun WebStart technology;

d)      The Client application queries one or more UNIDATA/UCAR WCSs (or WMSs) and gets their capabilities; the interaction occurs through SOAP/HTTP or POST/HTTP;

e)       The Client application selects the useful sub-datasets (i.e. filtering datasets in the space, time and field dimensions); this task runs on the client station;

f)        The Client application requests the selected sub-datasets from the WCS (or WMS); the interaction occurs through SOAP/HTTP or POST/HTTP;

g)       The WCS gets the requested sub-datasets from the available data servers (i.e. NetCDF or NcML data servers) or from the DODS system by means of a WCS/DODS-DAP gateway;

h)       The WCS (not the WMS) asks the IDV service provider to return datasets in either a GIF or an IDV format. The first choice utilises an HTML/HTTP connection; the second one uses the Sun WebStart technology.
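The user scenarios above differ mainly in which formats the client can decode. One possible way to read them, sketched here as an assumption rather than as the implemented behaviour, is that the service returns the semantically richest format the client understands and falls back to a rendered image otherwise. The function name, the preference list and the client profiles are hypothetical.

```python
from typing import Iterable

# Ordered from semantically most complete to least complete.
DATA_FORMATS = ["NcML", "NetCDF", "GeoTIFF"]


def choose_return_format(client_decodes: Iterable[str]) -> str:
    """Pick the richest format the client can decode, else a rendered image."""
    understood = set(client_decodes)
    for fmt in DATA_FORMATS:
        if fmt in understood:
            return fmt
    # Clients that decode none of the data formats get a picture produced
    # by a visualisation service instead of the data.
    return "GIF"


profiles = {
    "educational and research": ["NetCDF", "NcML"],
    "digital library": ["GeoTIFF", "NcML"],
    "GIS-based": ["GeoTIFF"],
    "inexpert web-based": [],
}
for client, formats in profiles.items():
    print(client, "->", choose_return_format(formats))
```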

 



The THREDDS WCS

 

In the framework of THREDDS a WCS interface to UNIDATA/UCAR legacy systems has been implemented.

In particular, the WCS version 7.0 has been implemented.

 

Information View

Three main data models have been considered:

1)       The OGC Image and Gridded Coverage data model;

2)       The NcML/NetCDF data model;

3)       The Graphical Client Data Model (the THREDDS Data Viewer)

 

The OGC Image and Gridded Coverage data model

The following figure depicts the OGC/ISO conceptual schema of coverage

 

Coverage maps from a Spatio-temporal Domain to Feature attribute values.

Coverage is defined as:

“a feature that acts as a function to return one or more feature attribute values for any direct position within its spatiotemporal domain”.

Examples include a raster image, polygon overlay, or digital elevation matrix.

A Spatio-temporal Domain consists of a collection of direct positions in a coordinate space.

Continuous Coverage is a Coverage that returns different values for the same feature attribute at different direct positions within a single Geometric Object in its Spatio-temporal Domain.

Discrete Coverage is a Coverage that returns the same feature attribute values for every direct position within any single Geometric Object in its Spatio-temporal Domain.

Geometric Object is a spatial object representing a set of direct positions.

A grid coverage is a specific case of coverage in which a set of grid values covers the surface. Examples of a grid coverage are satellite images, digital elevation models, and digital orthophotos.
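The "coverage as a function" idea can be made concrete with a small sketch of a discrete grid coverage over a regular, rectangular grid: every direct position inside the same cell yields the same value. The class name, constructor parameters and values are hypothetical; a continuous coverage would interpolate between grid values instead.

```python
import math
from typing import Sequence


class GridCoverage:
    """Discrete coverage over a regular, rectangular grid of square cells."""

    def __init__(self, west: float, south: float, cell_size: float,
                 values: Sequence[Sequence[float]]) -> None:
        self.west = west
        self.south = south
        self.cell_size = cell_size
        self.values = values  # values[row][col], row 0 is the southernmost row

    def evaluate(self, x: float, y: float) -> float:
        """Return the attribute value for a direct position (x, y)."""
        col = math.floor((x - self.west) / self.cell_size)
        row = math.floor((y - self.south) / self.cell_size)
        inside = 0 <= row < len(self.values) and 0 <= col < len(self.values[row])
        if not inside:
            raise ValueError("direct position outside the coverage domain")
        return self.values[row][col]


elevation = GridCoverage(west=0.0, south=0.0, cell_size=1.0,
                         values=[[10.0, 11.0], [20.0, 21.0]])
print(elevation.evaluate(0.2, 0.7))  # 10.0
print(elevation.evaluate(0.9, 0.1))  # 10.0, same cell so same value
print(elevation.evaluate(1.5, 1.5))  # 21.0
```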

 

The present version of WCS supports “simple” coverages: regular, rectangular grids or tessellations of space.

 

 

The Graphical Client data model

TDV data model..

 

 

Data and Metadata Encoding

The coverage data and metadata, and the service metadata are encoded in the following formats:

a)       Service metadata: XML according to the WCS specifications;

b)      Coverage metadata: NetCDF/NcML;

c)       Coverage data: NetCDF/GeoTIFF.

 

Computational view

 

Architecture

 


 


Engineering view

Architecture

The implemented solution utilises the SOAP/XML technology for implementing the WCS interface gateway.

 

 

 

 

 
 

 

 

 

 

 

 

 

 


The Graphical Client is a SOAP-enabled version of TDV.

The implemented solution utilises SOAP over an HTTP POST. The SOAP message must contain the XML-encoded WCS request, which conforms to the schema corresponding to the chosen operation (i.e. GetCapabilities or GetCoverage).

Upon receiving a valid request, the service sends a response corresponding exactly to the request, as detailed in the WCS specification. Only in the case of Version Negotiation (described in the WCS specification) may the server offer a differing result.

Upon receiving an invalid request, the service issues a Service Exception as described in the WCS specification.
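A minimal sketch of this exchange follows. It wraps an XML-encoded request in a SOAP 1.1 envelope, prepares (but does not send) the HTTP POST, and recognises a service exception in a response. The endpoint URL, the attributes placed on the request element and the helper names are assumptions for illustration; the actual request schema is the one defined by the WCS specification.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 namespace


def build_soap_post(endpoint: str, operation: str, attributes: dict) -> urllib.request.Request:
    """Wrap an XML-encoded request in a SOAP envelope and prepare the POST."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # Assumed request shape: one element named after the operation.
    ET.SubElement(body, operation, attributes)
    payload = ET.tostring(envelope, encoding="utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        method="POST",
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )


def is_service_exception(response_xml: bytes) -> bool:
    """True if the response carries a ServiceExceptionReport element."""
    root = ET.fromstring(response_xml)
    # Compare local names so the check works with or without namespaces.
    return any(elem.tag.rsplit("}", 1)[-1] == "ServiceExceptionReport"
               for elem in root.iter())


# Hypothetical endpoint; the request is built but never sent here.
request = build_soap_post("http://example.invalid/wcs", "GetCapabilities",
                          {"service": "WCS"})
print(request.get_method())  # POST
print(request.data.decode("utf-8"))
```

A client would pass `request` to `urllib.request.urlopen` and check the returned bytes with `is_service_exception` before decoding the result.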

 

Technology View

The following technology baseline is going to be utilised:

Development Environment

 

Development Language

Java SDK 1.4

Toolkits or libraries utilised

Castor, ElectricGlue

Deployment Environment

MS Microsoft