Re: [wcsplus] Design of asynchronous request in DEWS WCS

Hi Jon,

Thanks for these details on your DEWS WCS. I'd seen some of them before in a paper Jeremy gave us, but after all this asynchronous discussion they are making a bit more sense to me.

Comments below.

Jon Blower wrote:
> Hi all,
>
> As Adit said in an earlier post, we designed and built a WCS with
> asynchronous capability in DEWS.  Thought it might be useful to
> summarize on this list the key features of the design, which borrows
> from the WPS spec.
>
> The asynchronous behaviour is specified by two parameters, STORE and
> STATUS, which both default to false (meaning that we are
> backward-compatible with WCS1.0.0).
>
> STORE=true means "Give me a URL to the data instead of the data itself"
> STATUS=true means "Let me monitor the extraction process"
>
> There are three possible behaviours (and one that makes no sense and
> is disallowed):
> (1) STORE=false and STATUS=false.  "Fully-synchronous."  The server
> waits until the data have been extracted and then replies to the
> client with the data as a direct response to the request.
> (2) STORE=true and STATUS=false.  "Semi-synchronous."  The server
> waits until the data have been extracted and responds to the client
> with a URL to the data file.
> (3) STORE=true and STATUS=true.  "Asynchronous."  The server replies
> *immediately* with a document containing a unique job ID.  The client
> can then poll the server using this job ID to discover information
> about the progress of the extraction.  When the extraction is complete
> the polling results in the URL to the data file being returned to the
> client.
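
To check that I'm reading the STORE/STATUS combinations correctly, here is a minimal sketch in Python. The names (Mode, parse_flag, request_mode) are mine and not taken from the DEWS code, and I'm assuming the parameter values are compared case-insensitively:

```python
from enum import Enum


class Mode(Enum):
    FULLY_SYNCHRONOUS = 1  # data returned directly in the response
    SEMI_SYNCHRONOUS = 2   # server waits, then returns a URL to the data
    ASYNCHRONOUS = 3       # job ID returned immediately; client polls


def parse_flag(value):
    """Read a STORE/STATUS value; a missing parameter (None) means false."""
    if value is None:
        return False
    text = value.strip().lower()  # assumption: case-insensitive values
    if text not in ("true", "false"):
        raise ValueError("expected 'true' or 'false', got %r" % value)
    return text == "true"


def request_mode(store=None, status=None):
    """Map the two parameters onto the three allowed behaviours."""
    store_flag, status_flag = parse_flag(store), parse_flag(status)
    if status_flag and not store_flag:
        # STORE=false with STATUS=true is the disallowed combination.
        raise ValueError("STATUS=true requires STORE=true")
    if not store_flag:
        return Mode.FULLY_SYNCHRONOUS
    return Mode.ASYNCHRONOUS if status_flag else Mode.SEMI_SYNCHRONOUS
```

A request with neither parameter maps to Mode.FULLY_SYNCHRONOUS, which matches the backward-compatible default described above.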

One case that seems important to me but is not covered by the above is when the client would prefer the data returned synchronously but still wants the data even if the server can't return it until later. Or should that be handled by a combination of the above? That makes me think some kind of negotiation is needed. For instance, what about a client that wants the data now or within an hour but doesn't want it if it will take a day?

Perhaps a fully-synchronous request that returns an exception (maybe with a new "AsynchronousResponseRequired" code) followed by an asynchronous request. It would be nice if the client could get an estimate of how long the asynchronous response might take, but the exception reports don't really support structured information beyond the code and location.
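
Roughly, the client side of that two-step exchange might look like the sketch below. Everything here is hypothetical: the "AsynchronousResponseRequired" code is only my suggestion, and WcsServiceException and the send callable stand in for whatever the client uses to issue a GetCoverage request and parse an exception report.

```python
class WcsServiceException(Exception):
    """Hypothetical stand-in for a parsed service exception report."""

    def __init__(self, code, locator=None):
        super().__init__(code)
        self.code = code
        self.locator = locator


def get_coverage_with_fallback(send, params):
    """Try a fully-synchronous request first, then fall back to asynchronous.

    `send` is any callable that takes a dict of request parameters and
    either returns the server's reply or raises WcsServiceException.
    """
    try:
        return send({**params, "STORE": "false", "STATUS": "false"})
    except WcsServiceException as exc:
        # Only the (proposed) "come back asynchronously" code triggers a
        # retry; any other exception is passed on to the caller.
        if exc.code != "AsynchronousResponseRequired":
            raise
    return send({**params, "STORE": "true", "STATUS": "true"})
```

Note that the client cannot decide whether the wait is acceptable before retrying, because the exception carries no time estimate.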

Though this makes me think of "use exceptions only for exceptional situations" (from "Effective Java", Josh Bloch). So maybe a different kind of negotiation mechanism would be better.

By the way, what HTTP response codes do you guys use for the above store/status situations? And do you use 400 (Bad Request) when you return an exception? As I work on implementations, I keep wishing the WCS (and other OGC) specs were clearer about HTTP response codes. So, as we continue with this asynchronous response discussion, I'd like to make sure we have some detail on the HTTP response codes.

Jon, this leads me to an earlier comment of yours,

   "I'm not sure about the use of HTTP response codes and RESTful
   paradigms to manage the asynchronous download (I'm a fan of REST in
   general by the way). I would recommend thinking carefully about the
   complexities that this design would impose on the design of clients
   (the same goes for a "serverDecide" option in the asynchronous
   parameter of the WCS request)."

Can you explain this a bit more? Since we are working on top of HTTP, this seems like a natural way to go. Do you see the complexity arising from: 1) the lack of detail on the content of a 202 (Accepted) response, 2) the client having to check the response code before deciding how to deal with the response, or 3) something else?
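
To make option 2 concrete, this is about all the extra client logic I have in mind. It is only a sketch under my own assumptions: the mapping of 200, 202, and 400 to these actions is hypothetical and is not something the WCS spec or the DEWS implementation defines, and the action names are made up for illustration.

```python
from http import HTTPStatus


def next_client_action(status_code):
    """Choose what the client does next from the HTTP response code alone."""
    if status_code == HTTPStatus.OK:
        return "read_data"               # body is the data (or a URL to it)
    if status_code == HTTPStatus.ACCEPTED:
        return "poll_later"              # request accepted, result not ready
    if status_code == HTTPStatus.BAD_REQUEST:
        return "parse_exception_report"  # body is a service exception
    return "fail"                        # anything else is unexpected here
```

The branch itself is small; what the client needs to know in the "poll_later" case is exactly the detail that a bare 202 response does not provide.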


> STORE=false, STATUS=true makes no sense and is disallowed (server
> responds with an error).
>
> The server can respond with an error if it does not wish to satisfy a
> fully- or semi-synchronous request (because the data extraction will
> take too long, for example).  We chose this design instead of a
> "serverDecide" option because it simplifies the client (the client
> always knows what kind of response to expect).

For me, the three opposing forces, in order of importance, are: 1) make sure the spec is clear; 2) allow for simple, easy-to-implement clients; 3) allow for simple, easy-to-implement servers. I worry that striving for a simple client could make the spec less understandable by those that will be

> As Adit says, the format of the status documents was inspired by the
> WPS ExecuteResponse document.  I think we diverged from this for
> reasons of expediency, but I think the ER format is logical: a large
> data extraction is conceptually the same as a long-running processing
> job.

I haven't looked at the ExecuteResponse document yet but I'm all for using an existing spec.

> There will be many possible designs but at least we know this one
> works (at least for us!) and it is close to an existing OGC spec
> (WPS).  The DEWS WCS is a "reference implementation" for this design
> and we're happy to share the code (Java web app).

Thanks again for going into some details on the DEWS WCS implementation.

Thanks everyone for a great discussion.

Ethan

> Hope this helps,
> Jon


--
Ethan R. Davis                                Telephone: (303) 497-8155
Software Engineer                             Fax:       (303) 497-8690
UCAR Unidata Program Center                   E-mail:    edavis@xxxxxxxx
P.O. Box 3000
Boulder, CO  80307-3000                       http://www.unidata.ucar.edu/
---------------------------------------------------------------------------