Re: [wcsplus] Design of asynchronous request in DEWS WCS

Dear all,

I really appreciate this discussion, which touches on several of the issues we have been discussing and facing in our research and development activity.

We have been developing OWS on SOAP; recently, we decided to experiment with some REST implementations (especially for asynchronous interactions). Therefore, I'd like to add some comments stemming from our understanding of REST and our experience with it. Please forgive the length of this email; I have actually put together Paolo's comments and mine :-) .


Let me distinguish between the REST approach (the architectural style) and the RESTful implementation (the current technological solutions for implementing REST).

The REST approach has proved to be highly scalable and sufficiently flexible in many contexts, primarily the WEB infrastructure but also DB and filesystem access. In all these cases we have individually addressed resources behind a uniform interface.

Indeed, the possible REST actions are limited by the uniform interface, which typically maps to the simple CRUD (create, retrieve, update and delete) paradigm. Often simplicity means generality and flexibility (see the netCDF data model case); in fact, this simplicity was one of the reasons for the pervasive success of the WEB and for its scalability. On the other hand, semantically richer actions (e.g. resource processing actions) must be mapped onto the basic CRUD vocabulary.
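
To make the uniform-interface idea concrete, here is a minimal sketch in Python. The CRUD-to-HTTP mapping is the conventional one and is our assumption here (the text above names only CRUD); the ResourceStore class and its handle() method are hypothetical names used for illustration only.

```python
# Conventional CRUD-to-HTTP mapping (an assumption, not taken from any OGC spec).
CRUD_TO_HTTP = {"create": "POST", "retrieve": "GET", "update": "PUT", "delete": "DELETE"}


class ResourceStore:
    """Hypothetical in-memory store: every resource has its own ID and all
    resources are handled through the same small set of methods."""

    def __init__(self):
        self._resources = {}

    def handle(self, method, resource_id, body=None):
        """Return (status_code, representation) for a uniform-interface call."""
        exists = resource_id in self._resources
        if method == "GET":
            return (200, self._resources[resource_id]) if exists else (404, None)
        if method == "POST":
            if exists:
                return 409, None  # already created
            self._resources[resource_id] = body
            return 201, body
        if method == "PUT":
            if not exists:
                return 404, None
            self._resources[resource_id] = body
            return 200, body
        if method == "DELETE":
            if not exists:
                return 404, None
            del self._resources[resource_id]
            return 204, None
        return 405, None  # anything outside the uniform interface is refused


store = ResourceStore()
store.handle(CRUD_TO_HTTP["create"], "/coverages/foo", "coverage data")
status, representation = store.handle(CRUD_TO_HTTP["retrieve"], "/coverages/foo")
```

Note that an action such as INTERPOLATE has no method of its own in this sketch: it would have to be expressed through the resource ID or the body, which is exactly the mapping effort mentioned above.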

For example in the DB domain we can use SQL: a DB is the resource domain; the uniform interface is made of SELECT/INSERT/CREATE/UPDATE/DELETE methods; resource-IDs are all the possible SQL "WHERE" clauses. For the WEB (which may be seen as a globally distributed DB), resource-IDs are the WEB URIs (i.e. the "WEB clauses"). In both cases the resource-ID may become really complex (i.e. very long KVP strings; or complex SQL JOIN SELECTS) and, hence, it may be difficult to efficiently manage these IDs. For a REST WCS implementation (at the abstract level; no implementation details), resource-IDs are the GetCoverage clauses (analogous to the "SELECT" request content).

In our opinion, this is the real asset/limitation of REST: the application business logic must be faced and partially addressed at the interaction level (the protocol level), leaving the rest of the business logic to the server which, consequently, may turn out to be simpler (almost any institution can manage a WEB server today). With the Service-oriented approach, the entire application business logic is left to the server (i.e. the service provider), which implements an even simpler interaction: Exchange/Send an Electronic Document. Thus, SOA guarantees high flexibility, but the server (the service provider) has to face all the resource-related issues (e.g. resource caching, ID, creation, encoding, etc.) anyway.

Thus the focus of REST is on the uniform interface and on resource addressing, not on the nature of the resources (discrete, existing, etc.). If we can provide a uniform interface and a complete addressing scheme for the resources, we can adopt a REST architecture. In our opinion WCS seems to be implicitly based on a uniform interface (since we GET coverages, GET coverage descriptions and GET server capabilities, and we do not explicitly define other actions like INTERPOLATE, SUBSET, etc.), which makes it possible to address each resource. Hence, a REST architecture seems an effective choice for this domain.


As to RESTful implementation for Geospatial resources, several issues must be considered.

First of all, we should define what "resource" and "resource representation" are in this domain. We could decide that a dataset is the resource and that all the features extracted from the dataset through interpolation, subsetting and resampling are simply different representations of it. In that case we should only address the dataset with a known URI and possibly create new resources if required. On the other hand, we could consider each feature extracted from a dataset as a different resource. In that case we should address each feature with a different URI.

Presently, we are working on this second approach for two reasons: theoretical consistency (according to the Web architecture, a representation should only affect formats) and implementation convenience (different URIs could support server-side caching).

Concerning the addressing problem, we do not need to explicitly define a URI for each possible feature. We can simply provide a functional mapping between a URI-space and resource representations. In the OWS, the URL-encoding of the KVP string in a GET request IS the resource address. The fact that the feature is dynamically created is not an architectural problem but an implementation issue, which might require smart caching servers.

For example:

http://someserver.net/wcs?name=foo&bbox=-180,-90,180,90&...

is the URI for the feature extracted from the coverage named "foo" with the interpolation, subsetting and resampling defined by the bbox (and other) parameters. (A better URI could be defined by leaving only non-hierarchical parameters in the query part of the URI. Something like:

http://someserver.net/coverages/foo?bbox=-180,-90,180,90&...

)
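
As a rough illustration of the "functional mapping" between the URI-space and resource representations, and of why one URI per feature helps server-side caching, consider the Python sketch below. The function names (resource_key, get_representation), the "format" parameter and the extract callback are hypothetical; we also assume that KVP parameter names are case-insensitive and that parameter order is irrelevant, so equivalent requests collapse onto one cache key.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode


def resource_key(uri):
    """Map a KVP-style URI to a canonical resource address.

    Parameter names are lower-cased and sorted so that requests differing
    only in parameter order or name case address the same resource.
    """
    parts = urlsplit(uri)
    params = sorted((name.lower(), value) for name, value in parse_qsl(parts.query))
    # Keep commas readable in values such as bbox=-180,-90,180,90.
    return parts.path + "?" + urlencode(params, safe=",")


def get_representation(uri, cache, extract):
    """Return the feature addressed by `uri`, creating it only on first use.

    `extract` stands for whatever subsetting/interpolation the server
    performs; it is a placeholder supplied by the caller.
    """
    key = resource_key(uri)
    if key not in cache:
        cache[key] = extract(key)  # the feature is created dynamically
    return cache[key]


cache = {}
first = get_representation(
    "http://someserver.net/coverages/foo?bbox=-180,-90,180,90&format=netcdf",
    cache,
    lambda key: "feature for " + key,
)
second = get_representation(
    "http://someserver.net/coverages/foo?FORMAT=netcdf&BBOX=-180,-90,180,90",
    cache,
    lambda key: "feature for " + key,
)
```

Both calls resolve to the same cache entry, so the extraction runs only once; a real deployment would more likely delegate this to an HTTP cache in front of the server.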

When the request is encoded in a POST, it should be considered as a query to the root resource, which responds with the representation of the target resource. This could also be viewed as an extraction-from-dataset service; however, this may introduce useless complexity since the request is still a GET action. In fact, there is an implicit hierarchy among our features, and the root feature (the "foo" coverage in our example) supports not only its own GET operation, but also the selection of its children via a POST operation.

These considerations seem to be valid not only for WCS but for all the data access services (e.g. WCS, WFS and WMS). They conform to a resource-oriented approach and can be implemented in a RESTful architecture with "minimal" modifications of existing specifications. Besides, the RESTful implementation might be easily adopted by data providers, since it should be based on well-known technologies.

The case of WPS and WCTS seems to be different. In fact, they don't define a uniform interface for the many operations they should support; on the contrary, they introduce a uniform interface to receive a message which contains specific operation requests. In this case we should use the POST method as the extension point for interaction with HTTP-based services which create new addressable resources (a sort of endpoint in the SOA view). In this way we should have the advantages of pervasive and scalable data provision (through the RESTful implementation) and of modular and composable processing (through the service-oriented architecture).
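
A minimal sketch of this "POST creates a new addressable resource" pattern follows, again in Python. Everything in it is hypothetical (the ProcessingEndpoint class, the /results path, the status codes chosen, and the request fields); it only illustrates how a processing request sent as a document could yield a URI that is afterwards read through the plain uniform interface, which is also one way to handle asynchronous requests.

```python
import itertools


class ProcessingEndpoint:
    """Hypothetical processing service: receives a request document via POST
    and exposes the (eventual) result as a new, separately addressed resource."""

    def __init__(self, base="/results"):
        self._base = base
        self._ids = itertools.count(1)
        self._resources = {}

    def post(self, request):
        """Accept a request document; return (status, URI of the new resource)."""
        uri = "{}/{}".format(self._base, next(self._ids))
        self._resources[uri] = {"status": "pending", "request": request, "result": None}
        return 201, uri

    def complete(self, uri, result):
        """Called by the back-end once processing has finished."""
        self._resources[uri].update(status="done", result=result)

    def get(self, uri):
        """Plain retrieval of the result resource, as for any data resource."""
        if uri not in self._resources:
            return 404, None
        return 200, self._resources[uri]


endpoint = ProcessingEndpoint()
status, result_uri = endpoint.post({"operation": "reproject", "coverage": "foo"})
_, pending = endpoint.get(result_uri)           # still being processed
endpoint.complete(result_uri, "reprojected foo")
_, finished = endpoint.get(result_uri)          # now carries the result
```

The client only needs POST once and GET afterwards; the operation-specific semantics stay inside the request document, as in the service-oriented view.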


Some possible conclusions:

A RESTful implementation is valuable for scalability and extensibility (derived from the REST architectural style) as well as for simplicity (the implementation is simple since it is based on well-known technology and only simple operations must be supported server-side).

The RESTful implementation seems feasible for data access services because they are typically resource-based.

The RESTful architecture must interact with a Service-oriented architecture for basic and advanced processing. XML and HTTP are the key technologies for bridging.




Thank you for your patience,

Stefano and Paolo