Re: [thredds] THREDDS Data Server serving from Amazon S3

Hi James,


Out of interest, is there a NOAA Hyrax server you can point me to which is 
currently running the code from the OPULS project and serving data off S3?


Cheers,


David



________________________________
From: thredds-bounces@xxxxxxxxxxxxxxxx <thredds-bounces@xxxxxxxxxxxxxxxx> on 
behalf of John Caron <caron@xxxxxxxx>
Sent: Thursday, 16 July 2015 7:01 AM
To: James Gallagher
Cc: THREDDS THREDDS
Subject: Re: [thredds] THREDDS Data Server serving from Amazon S3

thanks for the paper, very useful.

On Tue, Jul 14, 2015 at 3:44 PM, James Gallagher <jgallagher@xxxxxxxxxxx> wrote:

> On Jul 14, 2015, at 12:37 PM, Signell, Richard <rsignell@xxxxxxxx> wrote:
>
> Folks,
> I discovered some info here on OPeNDAP and S3 from the hyrax boyz, in
> case of interest:
> http://docs.opendap.org/index.php/OPULS:_NOAA_S3_Data_Access

And here’s a short paper we wrote on it:


James

PS. after the work on S3 we also expanded the code to use Glacier - a whole 
different kettle of fish.

>
> On Tue, Jul 14, 2015 at 3:35 PM, Robert Casey <rob@xxxxxxxxxxxxxxxxxxx> wrote:
>>
>> Hi Jeff-
>>
>> Of note, Amazon Glacier is meant for infrequently needed data, so a call-up
>> for data from that source will require something on the order of a 5 hour
>> wait to retrieve to S3.  I think they are developing a near-line storage
>> solution that is a bit more expensive to compete with Google's new near-line
>> storage, which provides retrieval times on the order of seconds.
>>
>> -Rob
>>
>> On Jul 14, 2015, at 10:10 AM, Jeff McWhirter <jeff.mcwhirter@xxxxxxxxx> wrote:
>>
>> On this note -
>> What I really want is a file system that can transparently manage data
>> between primary (SSD), secondary (S3) and tertiary (Amazon Glacier) stores.
>> Actively used data would migrate into primary storage. Old archived data
>> moves off into cheaper tertiary storage. I've thought of implementing this
>> at the application level in RAMADDA but a file system based approach would
>> be much smarter.
>>
>> How do the archive folks on this list manage these kinds of storage
>> environments?
>>
>> -Jeff
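
The migration rule Jeff describes (recently used data in primary storage, older data pushed to cheaper tiers) could be sketched roughly as below. This is only an illustration under assumptions: the function name `choose_tier` and both age thresholds are hypothetical and come from neither RAMADDA nor any AWS service.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds; real values would depend on cost and access patterns.
HOT_WINDOW = timedelta(days=30)     # recently used data stays on primary (SSD)
WARM_WINDOW = timedelta(days=365)   # older data sits on secondary (S3)


def choose_tier(last_access: datetime, now: datetime) -> str:
    """Pick a storage tier from the time since the data was last accessed."""
    age = now - last_access
    if age <= HOT_WINDOW:
        return "primary"
    if age <= WARM_WINDOW:
        return "secondary"
    return "tertiary"               # e.g. Amazon Glacier


now = datetime(2015, 7, 14)
tiers = [choose_tier(now - timedelta(days=d), now) for d in (1, 90, 800)]
```

A file-system-level implementation would apply a rule like this transparently on access. An application-level one would run it as a periodic sweep over the catalogue.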
>>
>>
>>
>>
>> On Tue, Jul 14, 2015 at 10:44 AM, John Caron <caron@xxxxxxxx> wrote:
>>>
>>> Hi David:
>>>
>>> At the bottom of the TDM, we rely on RandomAccessFile. Do you know if S3
>>> supports that abstraction (essentially POSIX file semantics, e.g. seek(),
>>> read())? My guess is that S3 only allows complete file transfers (?)
>>>
>>> It would be worth investigating whether anyone has implemented a Java
>>> FileSystemProvider for S3.
>>>
>>> Will have a closer look when I get time.
>>>
>>> John
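
On the seek()/read() question: S3's GET Object request accepts an HTTP Range header, so partial reads are possible without transferring the whole object. A minimal sketch of how a seek()/read() view could sit on top of ranged fetches follows. It is not TDS code. The class `RangeReader` and the `fetch_range` callable are hypothetical, and an in-memory byte string stands in for the remote object so the example runs without network access.

```python
from typing import Callable


class RangeReader:
    """Minimal seek()/read() view over a store that can return byte ranges.

    fetch_range(start, end) is a hypothetical callable returning the bytes
    start..end inclusive; against S3 it would issue a GET with the header
    "Range: bytes=<start>-<end>".
    """

    def __init__(self, size: int, fetch_range: Callable[[int, int], bytes]):
        self.size = size
        self.fetch_range = fetch_range
        self.pos = 0

    def seek(self, offset: int) -> None:
        if not 0 <= offset <= self.size:
            raise ValueError("offset out of range")
        self.pos = offset

    def read(self, n: int) -> bytes:
        if n <= 0 or self.pos >= self.size:
            return b""
        end = min(self.pos + n, self.size) - 1  # inclusive end, as in HTTP Range
        data = self.fetch_range(self.pos, end)
        self.pos += len(data)
        return data


# Stand-in for a remote object (assumption: the object size is known up front).
blob = bytes(range(256)) * 4
reader = RangeReader(len(blob), lambda start, end: blob[start:end + 1])
reader.seek(300)
chunk = reader.read(8)
```

Each read() here costs one round trip, so a practical reader would probably fetch larger blocks and cache them rather than issue one request per small read.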
>>>
>>> On Mon, Jul 13, 2015 at 7:59 PM, David Nahodil <David.Nahodil@xxxxxxxxxxx> wrote:
>>>>
>>>> Hi all,
>>>>
>>>>
>>>> We are looking at moving our THREDDS Data Server to Amazon EC2 instances
>>>> with the data hosted on S3. I'm just wondering if anyone has tried using
>>>> TDS with data hosted on S3?
>>>>
>>>>
>>>> I had a quick back-and-forth with Sean at Unidata (see below) about this.
>>>>
>>>>
>>>> Regards,
>>>>
>>>>
>>>> David
>>>>
>>>>
>>>>>> Unfortunately, I do not know of anyone who has done this, although we
>>>>>> have had at least one other person ask. From what I understand, there is
>>>>>> a way to mount an S3 storage as a virtual file system, in which case I
>>>>>> would *think* that the TDS would work as it normally does (depending on
>>>>>> the kind of data you have).
>>>>
>>>>
>>>>> We have considered mounting the S3 storage as a filesystem and running
>>>>> it like that. However, our feeling was that the tools were not really
>>>>> production ready and that we're really misrepresenting S3 by pretending it
>>>>> is a file system. So this is why we're investigating if anyone has used
>>>>> TDS with the S3 API directly.
>>>>
>>>>
>>>>>> What kind of data do you have? Will your TDS also be in the cloud? Do
>>>>>> you plan on serving the data inside of Amazon to other EC2 instances, or
>>>>>> do you plan on crossing the cloud/commodity web boundary with the data, in
>>>>>> which case that could get very expensive quite quickly?
>>>>
>>>>
>>>>> We have about 2 terabytes of marine and climate data that we are
>>>>> currently serving from our existing infrastructure. The plan is to move
>>>>> the infrastructure to Amazon Web Services so TDS would be hosted on EC2
>>>>> machines and the data on S3. We're hoping this setup should work okay, but
>>>>> we might still have a hurdle or two to come. :)
>>>>
>>>>
>>>>> We have someone here who once wrote a plugin/adapter for TDS to work
>>>>> with an obscure filesystem that our data used to be stored on. So we have
>>>>> a little experience in what might be involved in doing the same with S3.
>>>>> We just wanted to make sure that if anyone had done some work already, we
>>>>> made use of that.
>>>>
>>>>
>>>>>> We very, very recently (as in a day ago) got some Amazon resources to
>>>>>> play around on, but we won't have a chance to kick those tires until
>>>>>> after our training workshops at the end of the month.
>>>>
>>>>
>>>>
>>>> University of Tasmania Electronic Communications Policy (December, 2014).
>>>> This email is confidential, and is for the intended recipient only.
>>>> Access, disclosure, copying, distribution, or reliance on any of it by
>>>> anyone outside the intended recipient organisation is prohibited and may be
>>>> a criminal offence. Please delete if obtained in error and email
>>>> confirmation to the sender. The views expressed in this email are not
>>>> necessarily the views of the University of Tasmania, unless clearly
>>>> intended otherwise.
>>>>
>>>>
>>>> _______________________________________________
>>>> thredds mailing list
>>>> thredds@xxxxxxxxxxxxxxxx
>>>> For list information or to unsubscribe,  visit:
>>>> http://www.unidata.ucar.edu/mailing_lists/
>
>
>
> --
> Dr. Richard P. Signell   (508) 457-2229
> USGS, 384 Woods Hole Rd.
> Woods Hole, MA 02543-1598

--
James Gallagher
jgallagher at opendap.org
406.723.8663


