Re: [thredds] nco as a web service

Hi All,
I guess one of the points that advocates of "processing near the data" are missing is that many interesting processes involve integrating multiple datasets, which are often not in one place to begin with. You would have to move data anyway: maybe not all the datasets, but at least some of them.


On 6/18/2012 5:22 PM, Ben Domenico wrote:
Hi Jeff,

I agree that, in many cases, the processing needs to be near the data, but that does not rule out using a brokering layer. The broker, in fact, can be set up to run on the same network or even the same machine as the data server. The idea is just that it communicates via web services, which makes it easier for parts of the development to take place in different languages, with different compilers, and even in different development teams. It doesn't all have to be part of the same server program. That's the beauty of using a third tier between client and server.

-- Ben

On Mon, Jun 18, 2012 at 3:06 PM, Jeff McWhirter <jeff.mcwhirter@xxxxxxxxx <mailto:jeff.mcwhirter@xxxxxxxxx>> wrote:

    Hi Ben,

        This is a terrific idea.  One suggestion I have is to build it
        so the processing services can be set up in a brokering layer
        -- that is, so the input datasets can be accessed via web
        services and the output can be served via web services.  I
        don't mean that this should be the only way to implement the
        nco processing, rather just keep it in mind so it's relatively
        easy to set up such a three tier architecture for the nco

    I just heard from Charlie Zender and have confirmed that the NCO
    routines can operate on opendap URLs. This opens up numerous
    possibilities. In the context of ramadda one can have explicit
    opendap links, e.g.:

    All of the ramadda data services (cataloging, metadata ingest,
    subset, nco (soon), grid visualizations, etc.) are available for
    that opendap link.
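
To make the "NCO on opendap URLs" point concrete, here is a minimal sketch of how a service might build an ncks call that subsets one variable from a remote dataset. It assumes an NCO build with DAP support; the URL, variable name, and dimension name are hypothetical placeholders, and the helper build_ncks_command is made up for illustration (it is not part of NCO, THREDDS, or ramadda). Only the standard ncks options -O (overwrite output), -v (variable) and -d (hyperslab) are used.

```python
import shlex


def build_ncks_command(opendap_url, variable, output_path, hyperslabs=None):
    """Build an ncks argument list that subsets one variable from a dataset.

    hyperslabs maps a dimension name to a (min, max) pair. ncks treats
    values with a decimal point as coordinate values and plain integers
    as indices, so pass floats for coordinate ranges.
    """
    cmd = ["ncks", "-O", "-v", variable]
    for dim, (lo, hi) in (hyperslabs or {}).items():
        cmd += ["-d", f"{dim},{lo},{hi}"]
    # ncks takes the input (here a remote URL) followed by the output file.
    cmd += [opendap_url, output_path]
    return cmd


if __name__ == "__main__":
    # Hypothetical URL and names, for illustration only.
    cmd = build_ncks_command(
        "http://example.org/thredds/dodsC/model/output.nc",
        "temperature",
        "subset.nc",
        {"lat": (30.0, 50.0)},
    )
    print(shlex.join(cmd))
    # To actually run it (requires ncks on the PATH):
    # import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when the URL contains characters such as "?" or "&".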

    However, we have to keep in mind performance ramifications. It
    still takes a long time to move gigabytes of data across a
    network. This brings up the importance of moving the computation
    to the data, instead of moving the data to the computation. For
    some data sets and many use cases remote access to data works very
    well so things like brokering are tractable. However, for *big*
    data sets (e.g., climate model output) we need to come up with
    richer mechanisms (like the NCO on local data) to bring
    computation to the data.
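
A back-of-the-envelope estimate shows why the size of the transfer matters. The sketch below computes an idealized transfer time from a data size and a link speed; the sizes and the 100 Mbit/s link are illustrative assumptions, not measurements, and real transfers are slower because of protocol overhead and server load.

```python
def transfer_seconds(size_bytes, link_bits_per_second, efficiency=1.0):
    """Idealized time to move size_bytes over a link of the given speed.

    efficiency is the assumed fraction of the nominal link speed that is
    actually achieved (1.0 means no overhead at all).
    """
    if link_bits_per_second <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed must be positive and efficiency in (0, 1]")
    return size_bytes * 8 / (link_bits_per_second * efficiency)


if __name__ == "__main__":
    link = 100 * 10**6  # assumed 100 Mbit/s link

    # Moving a full 10 GB dataset to the computation.
    print(transfer_seconds(10 * 10**9, link))

    # Moving only a 50 MB result after processing next to the data.
    print(transfer_seconds(50 * 10**6, link))
```

Under these assumptions the full dataset takes 800 seconds to move and the reduced result takes 4 seconds, which is the gap that processing near the data is meant to close.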

