Re: Problem with aggregation and stateless DAP

To: Tennessee Leeuwenburg <t.leeuwenburg@xxxxxxxxxx>
Subject: Re: Problem with aggregation and stateless DAP
From: John Caron <caron@xxxxxxxxxxxxxxxx>
Date: Mon, 05 Dec 2005 17:00:02 -0700



Tennessee Leeuwenburg wrote:

Hi John,
So long as you don't break simple stateless requests, I'm happy with whatever you choose in order to provide stateful behaviour also -- I can see the need for this now. The rest of this email contains my thoughts about other ways to look at the problem.
I wonder why this is the case. What is the new file being added? Could you please explain exactly what's being added to me? Why are new files being added?


It's a realtime data feed from the IDD. In this particular dataset, it's 
satellite files that come in every 15 minutes, constituting a time series of 
satellite images.

As for the deletion at 0Z, I would ask whether the request is for "Latest" or for a specific date. I don't see why files for a specific date would be removed, for example.


We only keep 7 days' worth, so each night we delete the oldest day's worth.

However, I'll just assume you do need to do what you're doing for the moment. I guess a session is the only way. However, not all clients are going to be aware, so you might still need some way to handle things on the server when people *do* ask for things without a session.


If the client doesn't assume any state, I can easily fulfill the request with 
the dataset's state at the moment the request comes in.

What about introducing an "unlimited" vector into the aggregation? If you joined along a new dimension, then requests for, say, the first 10 records would always come back with the same thing, even though more data might now be available "at the other end". If you know your data is going to be highly dynamic, then this doesn't seem unreasonable. You might even implement a new kind of request for new additions to an old aggregation.


Yes, that's exactly the situation. I have an "unlimited" time dimension, and usually the 
file is just growing, so the client with the "old" DDX doesn't get any bad data.

However, in principle the data might arrive out of order, and for sure we have 
the problem when the files get deleted. It's really these cases, which happen 
less frequently, that need special attention.

I may implement an operation that says "check to see if the dataset has grown along 
the unlimited dimension".
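
A rough sketch of what such a check might look like, assuming the client has cached the time coordinate values from its "old" DDX and can obtain the current ones. The names Growth and check_growth are hypothetical, not part of any existing server or client API. The point is that plain growth leaves the cached prefix intact, while deletion or out-of-order arrival changes it:

```python
from enum import Enum


class Growth(Enum):
    UNCHANGED = "unchanged"      # same values as the client already knows
    GROWN = "grown"              # old values intact, new ones appended
    INVALIDATED = "invalidated"  # old values deleted, reordered, or inserted into


def check_growth(known_times, current_times):
    """Compare a client's cached time coordinate with the current one.

    Both arguments are sequences of time values along the unlimited
    dimension (hypothetical inputs; how they are fetched is not shown).
    """
    n = len(known_times)
    # If the first n current values differ from the cached ones, indices
    # held by the client no longer point at the same data.
    if list(current_times[:n]) != list(known_times):
        return Growth.INVALIDATED
    if len(current_times) > n:
        return Growth.GROWN
    return Growth.UNCHANGED
```

With this, a client holding [0, 15, 30] that sees [0, 15, 30, 45] can simply extend its view, while one that sees [15, 30, 45] (oldest file deleted) would need to re-read the DDX.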

I've quoted a bit from your other email, which prompted me to go and re-examine this one.
>The problem is: every time you do a data request, will you examine the entire DDX and possibly some of the data like coordinate systems to make sure nothing has changed?
I wouldn't bother usually, but if I was setting something up to track the changes I *could* do it.



Yes, you could do it, but most clients won't.

I'm also thinking of the scenario where a user is asking for the file as NetCDF and saving to local disk. If the information only exists in the DDX, might not that be a problem?


Well, you could ask for the data in a single request, but it will likely fail 
because currently things get buffered in memory.
So if you break the data requests into separate requests, you have the 
possibility of things changing while you're bringing the data over.

Here's another option -- what if each DDX contained a unique identifier -- such as the date and time to high precision? Further requests would include this as a "currency" indicator. The server would then only aggregate files which themselves have a creation date before that indicator.


Hmmm, sounds like a variation of the checksum idea. I'll have to think about 
that.
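
For what it's worth, a minimal sketch of the "currency" idea as I read it, assuming the server keeps a creation time for each file in the aggregation. The function name files_for_request and the (name, created) tuples are made up for illustration and do not correspond to anything in the server today:

```python
from datetime import datetime, timezone


def files_for_request(files, currency):
    """Return the file names to aggregate for a request.

    files:    iterable of (name, created) tuples, created being a datetime
              (hypothetical bookkeeping, assumed to exist on the server).
    currency: the timestamp the client received in its DDX.
    """
    # Keep only files that already existed when the client's DDX was
    # issued ("at or before" is an assumption; the proposal says "before").
    visible = [(name, created) for name, created in files if created <= currency]
    # Order by creation time so indices along the time dimension are stable.
    visible.sort(key=lambda item: item[1])
    return [name for name, _ in visible]
```

This would hide files that arrived after the client's DDX, but it cannot bring back files removed by the nightly deletion, so that case would still need separate handling.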

thanks for your ideas!

References:
- Re: Problem with aggregation and stateless DAP
  - From: Tennessee Leeuwenburg
