
Re: Problem with aggregation and stateless DAP





James Gallagher wrote:
I talked with Benno about this at AGU (I think) and forgot to relay that conversation to this thread. What I remember was that Benno argued for use of an ETag and/or Last-Modified along with Expires headers. This has the benefit that the DAP stays stateless and it would work.
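
To make the ETag idea concrete, here is a minimal Python sketch of the validator check a stateless server could perform. The function names (make_etag, conditional_get) and the choice of hashing the response body to build the tag are assumptions for illustration, not something the DAP specifies. A real server would also have to handle a list of tags and "*" in If-None-Match.

```python
import hashlib


def make_etag(payload: bytes) -> str:
    """Derive a quoted ETag from the response body (one common choice)."""
    return '"' + hashlib.sha256(payload).hexdigest()[:16] + '"'


def conditional_get(payload: bytes, request_headers: dict) -> tuple:
    """Return (status, headers, body) for a GET with an optional If-None-Match."""
    etag = make_etag(payload)
    headers = {"ETag": etag}
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""   # the client's copy is still current
    return 200, headers, payload   # first fetch, or the dataset has changed
```

The server keeps no per-client state here: the client carries the tag and the server only compares it against the current representation.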

However... It seems to me that state is ultimately an optimization. I'd like to reopen the discussion and make sure that we get a cogent statement about the situation written somewhere. The answer might be in my email somewhere, but I was going to write a quick summary on the wiki and after scanning the messages for a bit, I couldn't. So...

John: Can you send me a description of your idea about adding state?
Benno: Can you send me a description of your idea about using ETag, et cetera?

This would help me greatly. Thanks!

James

Here's a summary of where things stand.

The problem is to deal with datasets that are continually changing. In that 
case, simply detecting that they have changed is inadequate, since then you 
have to notify the application/user, invalidate the application state, etc. 
This is ok if it happens occasionally, but what if it happens all the time? 
You have an unusable application for those datasets.

Putting (optional) state on the server is a way to deal with it. It is an 
optimization.



-------- Original Message --------
Subject: Re: More thoughts about datasets that change
Date: Thu, 15 Dec 2005 18:30:34 -0700
From: John Caron <address@hidden>
Organization: UCAR/Unidata
To: Tech DODS <address@hidden>
References: <address@hidden> <address@hidden> <address@hidden> <address@hidden> 
<address@hidden>



John Caron wrote:


James Gallagher wrote:


Suppose we introduce the idea that a client MAY get a cookie from a server, indicating that a session has been created by its request. It MAY include that cookie with future requests, indicating to a server that it requests the server honor the previous session, making the current response relative to the data source's state at the time the cookie was issued (I know, I used the word ;-). In this case the server MUST either honor the request OR return an error. There's no requirement that a server support sessions and no requirement that a client support them either. The DAP still has fundamentally stateless behavior, but also has the capability to tell certain servers, 'Hey, I was here before and this is what things looked like then.' Most servers won't support this, and neither will most clients (my guess) but some will and it looks like an important capability.


Yes, this would solve my problem in the Agg Server, thanks for summarizing it concisely. You might add that if the session is established, the server should make a best effort not to let the dataset change during the duration of the session.

I will go ahead and try to implement this in the Agg Server, and report on any further problems encountered before we make a final decision on it. I'm pretty sure it will be easy enough to return cookies in the client libraries, so I will likely also modify the Java netcdf/opendap client library. That way I will have a complete test of the idea.


Here's a summary of where I am on this project:


If you remember, the problem is to keep the opendap dataset metadata from 
changing in a way that gives erroneous results silently, for the case that a 
client has a stateful view of the dataset, e.g. the netcdf-opendap clients.

In my server, I had already implemented caching of datasets, so that repeated 
requests to the same dataset would be efficient. Normally I would lock the 
dataset for the duration of the request, then immediately release it. What I do 
now is to reserve the dataset object for that particular client by not 
releasing it until the session expires. I then know what metadata the client 
assumes, and can satisfy that if possible, and give an error message if not.
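
The difference between per-request locking and per-session reservation can be sketched roughly as follows. This is an illustrative Python model, not the actual server code: DatasetCache, serve_request, and Session are hypothetical names, and the cache is assumed to hand out each open dataset object exclusively.

```python
import threading


class DatasetCache:
    """Pool of open dataset objects; an acquired object is held exclusively."""

    def __init__(self, opener):
        self._opener = opener          # callable: name -> newly opened dataset
        self._idle = {}                # name -> list of idle dataset objects
        self._lock = threading.Lock()

    def acquire(self, name):
        with self._lock:
            idle = self._idle.get(name)
            if idle:
                return idle.pop()      # reuse an already-open object
        return self._opener(name)      # nothing idle: open a fresh one

    def release(self, name, dataset):
        with self._lock:
            self._idle.setdefault(name, []).append(dataset)


def serve_request(cache, name, read):
    """Stateless path: hold the dataset only for the duration of one request."""
    dataset = cache.acquire(name)
    try:
        return read(dataset)
    finally:
        cache.release(name, dataset)


class Session:
    """Session path: keep the same dataset object until the session ends."""

    def __init__(self, cache, name):
        self._cache, self._name = cache, name
        self.dataset = cache.acquire(name)

    def expire(self):
        self._cache.release(self._name, self.dataset)
        self.dataset = None
```

While a Session holds its object, other requests for the same name get a different object from the pool, so the session's view of the metadata is not disturbed.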

It was quite easy to modify the Java DODS DConnect class to add support for session 
cookies. Just a few lines of code, and it only happens if the client enables it.

In the normal session negotiation, the server offers a cookie when it gets the 
first client request, and if the client returns the cookie, the server 
establishes the session. In the Tomcat framework, every request generates a 
session object, which times out if the cookie is not returned. I was worried 
somewhat about the overhead of this. More importantly, it meant that I couldn't 
immediately lock the dataset, but had to wait to see if the client returned the 
cookie. For those two reasons, I decided to have the client send a header to 
indicate that it would indeed accept cookies. So when the client sends a 
request, it adds the header:

X-Accept-Session: true

This allows me to immediately lock the dataset, and to not bother creating a 
session if the header is absent. I put the dataset object into the session 
object, and retrieve it every request. The session automatically times out 
after 30 minutes.

It seemed silly to have the dataset stay locked an extra 30 minutes when most 
of the time the client library knows exactly when it's done, namely when the 
client calls close() on the dataset. So I wanted to send a message when close() 
was called. I decided to just add a new suffix ".close" to the dataset name on 
a GET call. When the server sees this, it unlocks the dataset and terminates 
the session.

So in summary:

Java-DODS Client Library
1. Add "X-Accept-Session: true" header on each request.
2. Look for any cookies on the response. Return them on subsequent requests.
3. A new method close() was added. If it's called, send a message to the server 
with the dataset name and a ".close" suffix.
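
The three client steps above can be sketched in a few lines. This is a Python illustration of the bookkeeping only, not the Java DConnect code: SessionClient and its method names are hypothetical, the example URL and cookie value are made up, and the HTTP transport is left out.

```python
from http.cookies import SimpleCookie


class SessionClient:
    """Tracks the headers and cookies a session-aware client would exchange."""

    def __init__(self, dataset_url, use_sessions=False):
        self.dataset_url = dataset_url
        self.use_sessions = use_sessions   # sessions are opt-in
        self._cookies = {}                 # cookie name -> value

    def request_headers(self):
        """Step 1: announce session support; step 2: return stored cookies."""
        headers = {}
        if self.use_sessions:
            headers["X-Accept-Session"] = "true"
            if self._cookies:
                headers["Cookie"] = "; ".join(
                    f"{name}={value}" for name, value in self._cookies.items())
        return headers

    def handle_response(self, response_headers):
        """Step 2: remember any cookie the server set on the response."""
        set_cookie = response_headers.get("Set-Cookie")
        if self.use_sessions and set_cookie:
            jar = SimpleCookie()
            jar.load(set_cookie)
            for name, morsel in jar.items():
                self._cookies[name] = morsel.value

    def close_url(self):
        """Step 3: the URL to GET when close() is called on the dataset."""
        return self.dataset_url + ".close"
```

For example, after a response carrying "Set-Cookie: JSESSIONID=ABC123; Path=/dods", the next call to request_headers() includes "Cookie: JSESSIONID=ABC123" alongside the X-Accept-Session header.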

TDS Server
1. Look for X-Accept-Session: true header on each request.
2. If it exists, establish a session for the client, return a session id cookie.
3. Cache the dataset object for the duration of the session.
4. Detect any changes to the dataset and give an error back to the client if 
needed.
5. On a "close" message, close the session and release the dataset.
6. Time out any session that hasn't been used for 30 minutes.
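
As a rough model of the six server steps, here is an in-memory Python sketch. It is not the TDS/Tomcat implementation. SessionServer, the SESSIONID cookie name, and the 409 status code are all assumptions; the source only says "an error". The sketch also simplifies by allowing one dataset per session and by treating any change at the source as an error.

```python
import time
import uuid
from http.cookies import SimpleCookie

SESSION_TIMEOUT = 30 * 60    # seconds of inactivity (step 6)
COOKIE_NAME = "SESSIONID"    # hypothetical cookie name


class SessionServer:
    def __init__(self, open_dataset, source_version, clock=time.monotonic):
        self._open = open_dataset              # name -> {"version": ..., "data": ...}
        self._source_version = source_version  # name -> version now at the source
        self._clock = clock
        self._sessions = {}                    # session id -> session record

    def _session_id(self, headers):
        jar = SimpleCookie(headers.get("Cookie", ""))
        return jar[COOKIE_NAME].value if COOKIE_NAME in jar else None

    def _drop_idle(self):
        now = self._clock()
        idle = [sid for sid, s in self._sessions.items()
                if now - s["last_used"] > SESSION_TIMEOUT]
        for sid in idle:
            del self._sessions[sid]

    def handle(self, path, headers):
        """Return (status, headers, body) for a GET of `path`."""
        self._drop_idle()                                  # step 6
        sid = self._session_id(headers)

        if path.endswith(".close"):                        # step 5
            self._sessions.pop(sid, None)
            return 200, {}, "session closed"

        if headers.get("X-Accept-Session") != "true":      # step 1
            return 200, {}, self._open(path)["data"]       # stateless request

        reply_headers = {}
        session = self._sessions.get(sid)
        if session is None:
            if sid is not None:                            # cookie we cannot honor
                return 409, {}, "session expired or unknown"
            sid = uuid.uuid4().hex                         # step 2
            session = {"name": path, "dataset": self._open(path),  # step 3
                       "last_used": self._clock()}
            self._sessions[sid] = session
            reply_headers["Set-Cookie"] = f"{COOKIE_NAME}={sid}"

        if session["dataset"]["version"] != self._source_version(session["name"]):
            del self._sessions[sid]                        # step 4
            return 409, {}, "dataset changed since session began"

        session["last_used"] = self._clock()
        return 200, reply_headers, session["dataset"]["data"]
```

Passing the clock in as a parameter makes the 30-minute timeout easy to exercise in a test without waiting.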

I am willing to change any of this if needed, but this is what I found to give 
me the best results in my server and client.