Re: [wcsplus] more on asynchronous response

Fair points Jon; sounds like you went for something similar to what I was
implying with a 'canonicalized' request.

You're right though; implementation issues. I try to avoid these :-)


-----Original Message-----
From: jon.blower@xxxxxxxxx [mailto:jon.blower@xxxxxxxxx] On Behalf Of Jon Blower
Sent: 26 October 2007 15:52
To: Tandy, Jeremy
Cc: Ethan Davis; Paolo Mazzetti; wcsplus@xxxxxxxxxxxxxxxx
Subject: Re: [wcsplus] more on asynchronous response

Dear all,

Good discussion, sorry I'm joining late.  (I agree with Dom's comment
regarding use of the WPS ExecuteResponse document by the way and was
about to post essentially the same message!)

Regarding caching of requests and particularly Jeremy's comment:

> This is pretty easy to resolve; if you want to cache the response for
> multiple usages I recommend creating an MD5 hash of the
> request & using this as the key in a key-value-pair map; i.e. you use
> the request hash to look up a previous response (if any).

I've applied a limited portion of my limited brainpower to this in the
past for another project and unfortunately building an effective cache
is a little more complicated than this (although this is an
implementation issue and I guess doesn't affect the WCS+ specification
per se).  There are two complicating factors:

1) In OGC services, parameter names are case-insensitive but values are
not.  Also many parameters are optional, and still others have no
relevance to the data extraction itself (e.g. what if only the output
format is different?  Could you cache the raw data and convert on the
fly?  This is probably a good idea and works well for my WMS
implementation).  All these things mean that you can't simply hash the
query string map without a little extra logic.

2) The BBOX parameter causes issues because slightly different BBOX
parameters might lead to identical data extractions (if the difference
in the BBOX values is smaller than a grid cell for example).  If you
simply do a string comparison you'll end up missing these cases, which
are very common in practice.
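
To make point 1 concrete, here is a minimal Python sketch of the "little
extra logic" a hashed query string would need.  It is only an illustration:
the function name canonical_key and the NON_EXTRACTION_PARAMS set are
made up for this example, and it does not fill in defaults for optional
parameters, which a real service would also have to do.

```python
import hashlib
from urllib.parse import parse_qsl

# Hypothetical: parameters assumed to have no effect on the data extraction.
NON_EXTRACTION_PARAMS = {"format"}


def canonical_key(query_string):
    """Return an MD5 hash of a normalised query string (illustrative only)."""
    pairs = parse_qsl(query_string, keep_blank_values=True)
    # Parameter names are case-insensitive, so lower-case them.
    # Values are case-sensitive, so they are left untouched.
    normalised = sorted(
        (name.lower(), value)
        for name, value in pairs
        if name.lower() not in NON_EXTRACTION_PARAMS
    )
    canonical = "&".join(f"{name}={value}" for name, value in normalised)
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()
```

With this, "COVERAGE=sst&BBOX=0,0,10,10&FORMAT=NetCDF" and
"bbox=0,0,10,10&coverage=sst&format=GeoTIFF" give the same key.  Point 2
still applies though: a BBOX of 0,0,10,10.0001 hashes to a different key
even if it would select exactly the same grid cells.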

So you need a custom cache if you want it to be optimal (a naive cache
might do the job in some cases though).  My approach was to convert the
query string into a low-level set of data extraction parameters (i.e.
the parameters that are passed to NetCDF libraries for example, to
extract a block of data) and cache these low-level parameters instead.
These parameters typically consist of a file name, internal variable id
and a set of indices for each axis in the data file.  Your system will
then parse the query string into these low-level parameters and check
for identical parameters in the cache.  BTW I would recommend caching
the raw data array to allow people to download the same data in
different formats without doing the extraction twice.
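
A rough Python sketch of this idea follows.  Everything in it is an
assumption made for illustration: the regular 1-degree GRID description,
the file and variable names, and the names extraction_params and
ExtractionCache.  A real implementation would read the grid geometry from
the data file and handle time and elevation axes too.

```python
import math

# Hypothetical regular 1-degree grid; a real service would read this
# from the data file's metadata.
GRID = {"file": "sst.nc", "variable": "sst",
        "x0": -180.0, "y0": -90.0, "cell": 1.0, "nx": 360, "ny": 180}


def extraction_params(bbox, grid=GRID):
    """Turn a BBOX (minx, miny, maxx, maxy) into low-level parameters:
    file name, variable id and an index range for each axis."""
    minx, miny, maxx, maxy = bbox

    def index(value, origin, n):
        # Index of the grid cell containing `value`, clamped to the axis.
        i = math.floor((value - origin) / grid["cell"])
        return min(max(i, 0), n - 1)

    x_range = (index(minx, grid["x0"], grid["nx"]),
               index(maxx, grid["x0"], grid["nx"]))
    y_range = (index(miny, grid["y0"], grid["ny"]),
               index(maxy, grid["y0"], grid["ny"]))
    return (grid["file"], grid["variable"], x_range, y_range)


class ExtractionCache:
    """Caches raw data arrays keyed on low-level extraction parameters."""

    def __init__(self, extract):
        self._extract = extract  # callable(file, variable, x_range, y_range)
        self._store = {}
        self.misses = 0

    def get(self, bbox):
        key = extraction_params(bbox)
        if key not in self._store:
            # Only touch the data file when these indices are new.
            self.misses += 1
            self._store[key] = self._extract(*key)
        return self._store[key]
```

Because the key is built from grid indices rather than the request
string, BBOX values of 0.1,0.1,9.9,9.9 and 0.2,0.3,9.8,9.7 map to the
same entry on this grid, and the cached raw array can then be converted
to whatever output format each request asks for.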

Moving on, I'm not sure about the use of HTTP response codes and RESTful
paradigms to manage the asynchronous download (I'm a fan of REST in
general by the way).  I would recommend thinking carefully about the
complexities that this design would impose on the design of clients (the
same goes for a "serverDecide" option in the asynchronous parameter of
the WCS request).  Sorry I don't have time to elucidate but every little
bit of extra complexity required of a client would drastically reduce
the number of clients that get developed.  One server, many clients:
keep the client simple.