UNIDATA/UCAR: the THREDDS project
THREDDS:
The Service model
Web services technology for data discovery, access, processing and visualisation
Stefano Nativi
Boulder, August 2002
Rationale
Why should we do that?
✓ If you are not interoperable you are out of the market (or you are Microsoft);
✓ In ICT everything changes very fast; you must be ready to change at the same pace without redoing everything;
✓ The most important data source is always the one that is not supported by your system...
✓ Users often don’t know what they want, but they certainly know how to extend your system...
✓ To decouple means to maintain better (“divide et impera”);
✓ People ask for standards (so they can decide not to follow them);
✓ The accessibility issue is important not only for disabled people but also for disabled software...
✓ Very soon users will ask you to access your software via wireless terminals... do not panic...
✓ The Web is ubiquitous, and so will web services be.
One of the main THREDDS objectives is to serve heterogeneous Communities, enabling them to discover and access UNIDATA/UCAR legacy datasets.
Among such Communities it is possible to distinguish:
1. the Digital Library Community;
2. the GIS-enabled Community;
3. the Research and Educational Community;
4. the Decision Makers Community.
Another important THREDDS objective is to provide these Communities’ applications with datasets along with the metadata they need.
Extending these objectives into a general framework, it is possible to introduce the following matrix of Services and Communities’ applications; it suggests which services the diverse applications are mainly interested in.
Community application \ Service category | Visualisation services | Metadata abstraction services | Catalogue services | Data/protocol processing services | Data access services
-----------------------------------------|------------------------|-------------------------------|--------------------|-----------------------------------|---------------------
Educational & Research (SINOTS)          | X                      |                               | X                  | X                                 | X
Digital Library (DLESE)                  |                        | X                             | X                  | X                                 | X
Decision Maker (Flash Flood)             |                        |                               |                    | X                                 | X
Any Inexpert Web-based application       | X                      |                               | X                  | X                                 | X
GIS-based application                    |                        | X                             | X                  | X                                 | X
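The matrix can also be read as a simple lookup structure. The following is only a sketch of that reading: the names `SERVICES`, `MATRIX`, `services_for` and `communities_using` are illustrative and do not come from THREDDS.

```python
# Columns of the matrix (service categories), in the order given above.
SERVICES = (
    "visualisation",
    "metadata abstraction",
    "catalogue",
    "data/protocol processing",
    "data access",
)

# Rows of the matrix: the services each community application is marked with.
MATRIX = {
    "Educational & Research (SINOTS)": {
        "visualisation", "catalogue", "data/protocol processing", "data access"},
    "Digital Library (DLESE)": {
        "metadata abstraction", "catalogue", "data/protocol processing", "data access"},
    "Decision Maker (Flash Flood)": {
        "data/protocol processing", "data access"},
    "Any Inexpert Web-based application": {
        "visualisation", "catalogue", "data/protocol processing", "data access"},
    "GIS-based application": {
        "metadata abstraction", "catalogue", "data/protocol processing", "data access"},
}


def services_for(application):
    """Return the services an application is interested in, in column order."""
    return [s for s in SERVICES if s in MATRIX[application]]


def communities_using(service):
    """Return the applications marked for a given service, sorted by name."""
    return sorted(app for app, used in MATRIX.items() if service in used)
```

Read this way, the data access and data/protocol processing services are the only ones every listed application needs.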
Actually, there is at least one other important category of services which is not reported: the Security services; they are useful to both requesters and providers of services.
One possible approach to delivering the services reported in the previous matrix is to utilise Web Services technology.
The web
service approach is very simple, as depicted in the following schema:

The service requesters are heterogeneous software applications (e.g. the software applications of the heterogeneous THREDDS User Community) which access the “Discovery & Access services” to discover and access the services provided by one or more service providers.
A service provider may access datasets stored in local servers or external systems, or may itself become a service requester, acting as a service broker.
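The requester, discovery and provider roles, including a provider acting as a broker, can be sketched in a few lines. This is an in-memory illustration only, with no network layer; every class and function name below is hypothetical.

```python
class Registry:
    """Hypothetical stand-in for the "Discovery & Access services"."""

    def __init__(self):
        self._providers = {}

    def publish(self, service_name, provider):
        self._providers.setdefault(service_name, []).append(provider)

    def discover(self, service_name):
        return list(self._providers.get(service_name, []))


class LocalProvider:
    """A service provider that serves datasets held on a local server."""

    def __init__(self, datasets):
        self._datasets = dict(datasets)

    def get(self, dataset_id):
        return self._datasets.get(dataset_id)


class BrokerProvider:
    """A provider that becomes a requester when it lacks a dataset locally."""

    def __init__(self, local, upstream):
        self._local = local
        self._upstream = list(upstream)

    def get(self, dataset_id):
        found = self._local.get(dataset_id)
        if found is not None:
            return found
        # Act as a service requester towards the external providers.
        for provider in self._upstream:
            found = provider.get(dataset_id)
            if found is not None:
                return found
        return None


def request(registry, service_name, dataset_id):
    """A service requester: discover providers first, then access the data."""
    for provider in registry.discover(service_name):
        result = provider.get(dataset_id)
        if result is not None:
            return result
    return None
```

The requester never needs to know whether the provider answered from its own store or brokered the request to another system.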
According
to the previous general architecture and the introduced matrix of services, the
THREDDS service architecture can be conceived as depicted in the following
schema.
The pale blue services are the existing services.

The THREDDS Catalogue service implements both “Service Discovery & Access” and “Data Catalogue” functionalities.
Client applications may utilise the THREDDS Catalogue service to discover the existence of UNIDATA/UCAR WCS (i.e. service discovery) and/or to get catalogues of accessible UNIDATA/UCAR datasets (i.e. catalogue services).
In order to allow any web-based application to discover and access UNIDATA/UCAR public web services (even clients which don’t know how to interact with the THREDDS Catalogue), a UDDI registry could be implemented: it implements the most popular protocol used to discover and access web services.
The THREDDS Catalogue service implements both “Service Discovery & Access” functionalities and “Data Catalogue” ones.
Clients may utilise the THREDDS Catalogue service to discover the existence of UNIDATA/UCAR WCS (i.e. service discovery) and to get catalogues of accessible UNIDATA/UCAR datasets (i.e. catalogue services).
In order to allow any client application that is compliant with the ISO and OpenGIS Catalogue interface specifications to access UNIDATA/UCAR datasets, an ISO Catalogue service interface could be realised and deployed.
The gazetteers service is in charge of mapping a location name (e.g. “Boulder”, “Italy”, “America”, etc.) into a coverage request to be submitted to the WCS. It is particularly useful for DL client applications.
Such a service is generally based on a dictionary service; the dictionary service can be implemented internally, or an external dictionary service can be utilised, accessed through the Web.
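A minimal sketch of the name-to-request mapping follows. The bounding boxes are rough, illustrative values rather than authoritative extents, and the request dictionary, `LOCAL_DICTIONARY` and `to_coverage_request` are hypothetical names chosen for this example only.

```python
# Internal dictionary: location name -> (min_lon, min_lat, max_lon, max_lat).
# The numbers are rough illustrations, not authoritative extents.
LOCAL_DICTIONARY = {
    "boulder": (-105.30, 39.95, -105.20, 40.10),
    "italy": (6.6, 36.6, 18.5, 47.1),
}


def to_coverage_request(place, coverage, external_lookup=None):
    """Map a location name to a coverage request for the WCS.

    external_lookup is an optional callable standing in for an external
    dictionary service reached through the Web; it is only consulted when
    the internal dictionary has no entry for the name.
    """
    key = place.strip().lower()
    bbox = LOCAL_DICTIONARY.get(key)
    if bbox is None and external_lookup is not None:
        bbox = external_lookup(key)
    if bbox is None:
        raise KeyError(f"unknown location name: {place!r}")
    return {"request": "GetCoverage", "coverage": coverage, "bbox": bbox}
```

A client that only knows the name “Boulder” could then obtain a request it can submit to the WCS without handling coordinates itself.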
It is possible to envisage that the IDV visualiser could be deployed as a set of high-level web services, which could therefore be discovered and accessed like any other web-based service.
Such a solution would be particularly useful for other UNIDATA/UCAR services, which could leverage IDV visualisation services and avoid rewriting the same functionality.
In the present approach to IDV utilisation, a client installs the IDV application and then uses it to visualise and navigate datasets. In the proposed web service approach, a client application discovers a dataset (or a set of datasets) through the UNIDATA/UCAR service providers, and then asks the same service providers to visualise them.
The visualisation can be as simple as receiving a GIF image or as complex as receiving a WebStart-enabled application.
Another useful service for visualising datasets can be implemented by realising the OpenGIS WMS (Web Map Service) specifications. Such a service returns a pictorial image (e.g. GIF, PNG, JPG) representing the combination of a set of previously selected datasets.
As far as web services are concerned, the WCS (Web Coverage Service) is the most useful way to access UNIDATA/UCAR datasets. Coverages are datasets with one to five dimensions.
A client application can query the WCS and obtain a description of all the Coverage datasets managed by the service; it can then select a dataset, filter it in the space, time and field dimensions, and finally get it. Naturally, the client can get as many datasets as it likes (but only one at a time).
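The describe, select, filter and get sequence can be illustrated with a toy in-memory service. This is not the WCS interface itself: the class, its method names and the sample layout `(lon, lat, time, field, value)` are assumptions made for the sketch.

```python
class ToyCoverageService:
    """In-memory stand-in for a coverage service (illustrative only)."""

    def __init__(self, coverages):
        # coverages: name -> list of samples (lon, lat, time, field, value)
        self._coverages = coverages

    def get_capabilities(self):
        """Describe the coverage datasets managed by the service."""
        return sorted(self._coverages)

    def get_coverage(self, name, bbox=None, time_range=None, fields=None):
        """Return one coverage, filtered in space, time and field."""
        result = []
        for lon, lat, t, field, value in self._coverages[name]:
            if bbox is not None:
                min_lon, min_lat, max_lon, max_lat = bbox
                if not (min_lon <= lon <= max_lon and min_lat <= lat <= max_lat):
                    continue
            if time_range is not None and not (time_range[0] <= t <= time_range[1]):
                continue
            if fields is not None and field not in fields:
                continue
            result.append((lon, lat, t, field, value))
        return result
```

Each call to `get_coverage` returns a single dataset, mirroring the one-at-a-time constraint; a client wanting several datasets simply repeats the call.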
The WCS
supports the following formats for returned datasets: NcML, NetCDF, and
GeoTIFF. Semantically speaking the first is the most complete and the last is
the least.
It is
possible to envisage that UNIDATA/UCAR will implement more than one WCS; they
should be all listed and advertised through the “UNIDATA/UCAR service Discovery
& Access” services (i.e. THREDDS Catalogue or UDDI registry service).
If UNIDATA/UCAR needs to allow access to feature-based datasets (i.e. datasets characterised by geometric elements, as is typical of GIS-based datasets), a good open solution could be to realise a WFS (Web Feature Service) interface. It acts like a WCS interface but naturally returns different metadata and data formats.
The Real Time Source Connection service allows a client application which needs near real-time dataset access to subscribe to and receive such data through web technologies (e.g. POST/HTTP, SOAP/HTTP, SOAP/FTP, etc.). The metadata accessed by the client are encoded in XML; the data can be encoded in a legacy format or in XML.
An example of a client application is a flash flood early warning system which accesses near real-time datasets to provide information useful to support decision makers.
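The subscribe-and-receive behaviour is essentially a publish/subscribe pattern. The sketch below shows it in memory, without any transport; the class name, method names and callback signature are hypothetical.

```python
from collections import defaultdict


class RealTimeSourceConnection:
    """Toy publish/subscribe hub (illustrative, no network transport)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, dataset_type, callback):
        """Register a callback to be invoked for each new dataset of a type."""
        self._subscribers[dataset_type].append(callback)

    def publish(self, dataset_type, metadata_xml, data):
        """Dispatch a newly arrived dataset; return how many clients got it."""
        delivered = 0
        for callback in self._subscribers.get(dataset_type, []):
            # Metadata travel as XML; the data payload may be legacy or XML.
            callback(metadata_xml, data)
            delivered += 1
        return delivered
```

A flash flood warning client would register a callback for, say, a precipitation dataset type and react whenever a new dataset is dispatched.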
The Ontology Map service allows requesters to get the ontology map of the data and metadata served by the UNIDATA/UCAR service providers.
That is important for client applications which don’t know the semantics of the data and metadata accessed through the other services.
Such a service provides an abstraction of the metadata provided by the UNIDATA/UCAR services or, if you like, meta-meta-data.
DLESE client applications need to access metadata at an abstraction level higher than that of normal geophysics datasets. Therefore, this service (which is mainly used by other UNIDATA/UCAR services, but can also be accessed externally) is in charge of abstracting the NcML metadata content, generating the right abstraction level of dataset metadata.
Such a task is not easy; therefore we must consider starting from an NcML encoding (which is supposed to contain lots of metadata).
GIS client applications need to access metadata characterized by given facets (e.g. in accordance with OpenGIS or ISO models); therefore this service (which is mainly used by other UNIDATA/UCAR services, but can also be accessed externally) is in charge of encoding the right metadata facets starting from the NcML metadata content.
It can be seen as an extension of the core NcML specification.
It is one of the most important services to be realized. Such a service must accomplish the following tasks:
a) To encode NetCDF objects in XML;
b) To encode implicit NetCDF semantics into explicit XML tags;
c) To enrich NetCDF metadata with other metadata (i.e. to extend the present NetCDF metadata content).
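Tasks (a) and (c) can be sketched with the Python standard library alone. The input is a plain dictionary describing a NetCDF structure, not a real NetCDF file, and the element names (`netcdf`, `dimension`, `variable`, `attribute`) are assumed here to follow NcML as later published; task (b) is not attempted. The function name `to_ncml` is hypothetical.

```python
import xml.etree.ElementTree as ET


def to_ncml(description, extra_attributes=None):
    """Encode a dictionary description of a NetCDF dataset as NcML-like XML.

    extra_attributes holds metadata that is not in the NetCDF description
    and is merged in, illustrating the "enrich" task.
    """
    root = ET.Element("netcdf")
    for name, length in description["dimensions"].items():
        ET.SubElement(root, "dimension", name=name, length=str(length))
    # Global attributes: the dataset's own, then the enrichment metadata.
    global_attrs = dict(description.get("attributes", {}))
    global_attrs.update(extra_attributes or {})
    for name, value in global_attrs.items():
        ET.SubElement(root, "attribute", name=name, value=str(value))
    for var in description["variables"]:
        node = ET.SubElement(
            root, "variable",
            name=var["name"], type=var["type"], shape=" ".join(var["shape"]),
        )
        for name, value in var.get("attributes", {}).items():
            ET.SubElement(node, "attribute", name=name, value=str(value))
    return ET.tostring(root, encoding="unicode")
```

The output contains only metadata; the array values stay in the NetCDF dataset that the description refers to.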
Such a service is mainly used by data servers to process NetCDF datasets and generate more interoperable NcML datasets; naturally, it is also possible to access such a service externally (like any other web service).
It is useful to underline that an NcML dataset is likely to consist of a set of metadata describing a NetCDF dataset, which is explicitly referenced.
It
implements a gateway from the WCS protocol to the DODS DAP protocol. It is useful
to access the DODS system data content from the Web.
Such a service seems to be already funded in the
framework of the DODS activities, and SOAP seems to be the technology chosen for implementing it.
It
implements a gateway from the WCS protocol to the LDM-IDD system. It is useful
to allow Web-based applications to subscribe and receive near real-time
datasets through the LDM-IDD system.
Such
gateway could dramatically facilitate the development of several important
systems for helping decision makers to face hazardous situations (e.g. fire,
flash flood, chemical dispersion, etc.)
User scenarios
The
following user scenarios show how the previous services could be utilised to
address UNIDATA/UCAR Client Community needs.
This user scenario considers the case of a client application which is:
a) Aware of the existence of a THREDDS catalogue service;
b) Able to understand and process NetCDF and/or NcML data and metadata;
c) A web-based application;
d) Able to visualize accessed datasets.
Scenario:
a) A Client application queries the THREDDS catalogue service to discover and access available UNIDATA/UCAR WCSs; the interaction occurs through HTML/HTTP;
b) The Client application queries one or more UNIDATA/UCAR WCSs, receiving their capabilities; the interaction occurs through SOAP/HTTP or POST/HTTP;
c) The Client application selects useful sub-datasets (i.e. filtering datasets in the space, time and field dimensions); this task runs on the client station;
d) The Client application requests the selected sub-datasets from the WCS; the interaction occurs through SOAP/HTTP or POST/HTTP;
e) The WCS gets the requested sub-datasets from the available data servers (i.e. NetCDF or NcML data servers) or from the DODS system by means of a WCS/DODS-DAP gateway;
f) The WCS returns the requested sub-dataset; the interaction occurs through SOAP/HTTP or POST/HTTP; the returned dataset format can be NetCDF, NcML or GeoTIFF.

This user scenario considers the case of a client application which is:
a) Aware of the existence of a web service for accessing datasets in near real time (i.e. the RT Source Connection service);
b) Able to understand and process NetCDF and/or NcML and/or GeoTIFF data and metadata;
c) A web-based application;
d) Able to visualize accessed datasets.
Scenario:
a) A Client application directly asks the “RT Source Connection” service to subscribe it to a particular type of dataset, to be dispatched in near real time to the application itself (or to any other URI); the interaction occurs through SOAP/HTTP or POST/HTTP;
b) The “RT Source Connection” service redirects the request to the UNIDATA/UCAR LDM-IDD system by means of a “WCS/LDM-IDD gateway”. It acts as a proxy for the LDM-IDD system; the interaction occurs through SOAP/HTTP or POST/HTTP;
c) As soon as the “RT Source Connection” service receives the requested information from the LDM-IDD system, it routes it to the client application (which subscribed to it); the interaction occurs through SOAP/HTTP or POST/HTTP; the returned dataset format can be NetCDF, NcML or GeoTIFF.

This user scenario considers the case of a client application which is:
a) Aware of the existence of a THREDDS catalogue service (or an ISO standard catalogue service);
b) Able to understand and process GeoTIFF and/or NcML data and metadata;
c) A web-based application;
d) Able to visualize accessed datasets.
Scenario:
a) A Client application queries the THREDDS catalogue service (or an ISO standard catalogue service, or a Gazetteers service) to discover and access available WCSs at UNIDATA/UCAR (actually the Gazetteers service acts in a smarter way and avoids this stage); the interaction occurs through HTML/HTTP;
b) The Client application (or the Gazetteers service) queries one or more UNIDATA/UCAR WCSs, getting their capabilities; the interaction occurs through SOAP/HTTP or POST/HTTP;
c) The Client application (or the Gazetteers service) selects the useful sub-datasets (i.e. filtering datasets in the space, time and field dimensions); this task runs on the client station (or is automatically performed by the Gazetteers service);
d) The Client application (or the Gazetteers service) requests the sub-datasets from the WCS; the interaction occurs through SOAP/HTTP or POST/HTTP;
e) The WCS gets the requested sub-datasets from the available data servers (i.e. NetCDF or NcML data servers) or from the DODS system by means of a WCS/DODS-DAP gateway;
f) The WCS asks the “DLESE Vs NCML Metadata abstraction” service to abstract the NcML metadata; that service receives an NcML dataset and returns an extended NcML dataset;
g) The WCS (or the Gazetteers service) returns the requested (or automatically identified) sub-dataset; the interaction occurs through SOAP/HTTP or POST/HTTP; the returned dataset format may be either NcML (with the right level of metadata abstraction) or simply GeoTIFF.

This user scenario considers the case of a client application which is:
a) Not able to interact with the UNIDATA/UCAR THREDDS Catalogue service, but able to interact with OpenGIS WCS services (or a WFS service);
b) Not able to decode NcML or NetCDF formats, but able to understand and process GeoTIFF data and metadata;
c) A web-based application;
d) Able to visualize accessed datasets.
Scenario:
a) A Client application queries one or more UNIDATA/UCAR WCSs, getting their capabilities; the interaction occurs through SOAP/HTTP or POST/HTTP;
b) The Client application selects the useful sub-datasets (i.e. filtering datasets in the space, time and field dimensions); this task runs on the client station;
c) The Client application requests the sub-datasets from the WCS; the interaction occurs through SOAP/HTTP or POST/HTTP;
d) The WCS gets the requested sub-datasets from the available data servers (i.e. NetCDF or NcML data servers) or from the DODS system by means of a WCS/DODS-DAP gateway;
e) The WCS asks the “GIS Vs NCML Metadata abstraction” service to generate the GIS facets from the obtained NcML metadata; that service receives an NcML dataset and returns an extended NcML dataset;
f) The WCS returns the requested sub-dataset; the interaction occurs through SOAP/HTTP or POST/HTTP; the returned dataset format is GeoTIFF.

This user scenario considers the case of a client application which is:
a) Not able to interact with the UNIDATA/UCAR THREDDS Catalogue service, but able to interact with a UDDI registry and an OpenGIS WCS service (or a WMS service);
b) Not able to decode NcML, NetCDF, or GeoTIFF formats;
c) Not able to visualize accessed datasets;
d) A web-based application.
Scenario:
a) A Client application queries the UNIDATA/UCAR UDDI registry to discover and access available services. It discovers three interesting services:
1 - the WCS service to get coverages;
2 - the WMS service to get maps;
3 - the Ontology Map service to show ontologies of the data and metadata managed by UNIDATA/UCAR;
the interaction occurs through UDDI/HTTP;
b) The Client queries the Ontology Map service to get data and metadata semantics; the interaction occurs through POST/HTTP or SOAP/HTTP;
c) The Ontology Map service provider generates the ontology maps and asks the IDV service provider to return them in either a GIF or an IDV format. The first choice utilises an HTML/HTTP interaction; the second one uses the Sun WebStart technology;
d) The Client application queries one or more UNIDATA/UCAR WCSs (or WMSs), getting their capabilities; the interaction occurs through SOAP/HTTP or POST/HTTP;
e) The Client application selects the useful sub-datasets (i.e. filtering datasets in the space, time and field dimensions); this task runs on the client station;
f) The Client application requests the selected sub-datasets from the WCS (or WMS); the interaction occurs through SOAP/HTTP or POST/HTTP;
g) The WCS gets the requested sub-datasets from the available data servers (i.e. NetCDF or NcML data servers) or from the DODS system by means of a WCS/DODS-DAP gateway;
h) The WCS (not the WMS) asks the IDV service provider to return datasets in either a GIF or an IDV format. The first choice utilises an HTML/HTTP connection; the second one uses the Sun WebStart technology.

The THREDDS WCS
In the framework of THREDDS, a WCS interface to UNIDATA/UCAR legacy systems has been implemented.
In particular, WCS version 0.7 has been implemented.
Three main data models have been considered:
1) The OGC Image and Gridded Coverage data model;
2) The NcML/NetCDF data model;
3) The Graphical Client Data Model (the THREDDS Data Viewer).
The following figure depicts the OGC/ISO conceptual schema of coverage.

A Coverage maps from a Spatio-temporal Domain to Feature attribute values.
Coverage is defined as: a feature that acts as a function to return one or more feature attribute values for any direct position within its spatiotemporal domain. Examples include a raster image, polygon overlay, or digital elevation matrix.
A Spatio-temporal Domain consists of a collection of direct positions in a coordinate space.
A Continuous Coverage is a Coverage that returns different values for the same feature attribute at different direct positions within a single Geometric Object in its Spatio-temporal Domain.
A Discrete Coverage is a Coverage that returns the same feature attribute values for every direct position within any single Geometric Object in its Spatio-temporal Domain.
A Geometric Object is a spatial object representing a set of direct positions.
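The distinction between discrete and continuous coverages becomes concrete when a coverage is written as a function of position. The two classes below are a simplified sketch: the geometric objects are reduced to axis-aligned rectangles and a single line segment, and all names are hypothetical.

```python
class DiscreteCoverage:
    """Same attribute value for every position inside one geometric object."""

    def __init__(self, cells):
        # cells: list of ((min_x, min_y, max_x, max_y), value)
        self._cells = list(cells)

    def evaluate(self, x, y):
        for (min_x, min_y, max_x, max_y), value in self._cells:
            if min_x <= x < max_x and min_y <= y < max_y:
                return value
        raise ValueError("position outside the spatio-temporal domain")


class ContinuousCoverage:
    """Value varies within one geometric object (here a segment [x0, x1])."""

    def __init__(self, x0, v0, x1, v1):
        self._x0, self._v0, self._x1, self._v1 = x0, v0, x1, v1

    def evaluate(self, x):
        if not (self._x0 <= x <= self._x1):
            raise ValueError("position outside the spatio-temporal domain")
        # Linear interpolation between the values at the two end points.
        t = (x - self._x0) / (self._x1 - self._x0)
        return self._v0 + t * (self._v1 - self._v0)
```

Linear interpolation is only one possible choice for the continuous case; the definition above requires only that values may differ between positions inside the same geometric object.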
A grid coverage
is a specific case of coverage in which a set of grid values covers the
surface. Examples of a grid coverage are satellite images, digital elevation
models, and digital orthophotos.
The present version of WCS supports “simple” coverages: a regular, rectangular grid or tessellation of space.
TDV data model..
The coverage data and metadata, and the service metadata, are encoded in the following formats:
a) service Metadata: XML according to the WCS specifications;
b) coverage Metadata: NetCDF/NcML;
c) coverage Data: NetCDF/GeoTIFF.

The
implemented solution utilises the SOAP/XML technology for implementing the WCS
interface gateway.

The Graphical Client is a SOAP-enabled version of TDV.
The implemented solution utilises SOAP over an HTTP POST. The SOAP message must contain the XML-encoded WCS request. The XML-encoded request conforms to the schema corresponding to the chosen operation (i.e. GetCapabilities or GetCoverage).
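As an illustration of a SOAP message carrying an XML-encoded request, the sketch below wraps a request element in a SOAP 1.1 envelope using only the Python standard library. The payload layout (a bare `GetCapabilities` element with `service` and `version` attributes) is an assumption for the example, not the schema used by the implemented service, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def get_capabilities_request(version):
    """Build a hypothetical XML-encoded GetCapabilities request."""
    return ET.Element("GetCapabilities", service="WCS", version=version)


def wrap_in_soap(request_element):
    """Place an XML-encoded request inside a SOAP envelope body."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(request_element)
    return ET.tostring(envelope, encoding="unicode")
```

The resulting string is what a client would send as the body of the HTTP POST; the transport itself is left out of the sketch.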
Upon receiving a valid request, the service sends a response corresponding exactly to the request, as detailed in the WCS specification. Only in the case of Version Negotiation (described in the WCS specification) may the server offer a differing result.
Upon
receiving an invalid request, the service issues a Service Exception as
described in the WCS specification.
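The valid/invalid split can be sketched as a small dispatcher. The element names `ServiceExceptionReport` and `ServiceException` follow the OGC service exception convention, which is assumed here to apply to this WCS; the response placeholder and the `handle` function are hypothetical.

```python
import xml.etree.ElementTree as ET

# Operations named in the text above.
SUPPORTED_OPERATIONS = {"GetCapabilities", "GetCoverage"}


def service_exception(message):
    """Build a service exception report carrying a human-readable message."""
    report = ET.Element("ServiceExceptionReport")
    ET.SubElement(report, "ServiceException").text = message
    return ET.tostring(report, encoding="unicode")


def handle(request_xml):
    """Answer a valid request, or issue a Service Exception otherwise."""
    try:
        request = ET.fromstring(request_xml)
    except ET.ParseError as err:
        return service_exception(f"malformed XML request: {err}")
    if request.tag not in SUPPORTED_OPERATIONS:
        return service_exception(f"unsupported operation: {request.tag}")
    # Placeholder for the real response corresponding to the request.
    return f"<{request.tag}Response/>"
```

A real service would also validate the request against the operation's schema before answering; that step is omitted here.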
The following technology baseline is going to be utilised:
Java SDK 1.4
Castor, ElectricGlue
MS
Microsoft