Re: [thredds] THREDDS Data Server serving from Amazon S3

  • To: James Gallagher <jgallagher@xxxxxxxxxxx>
  • Subject: Re: [thredds] THREDDS Data Server serving from Amazon S3
  • From: John Caron <caron@xxxxxxxx>
  • Date: Wed, 15 Jul 2015 15:01:41 -0600
Thanks for the paper, very useful.

On Tue, Jul 14, 2015 at 3:44 PM, James Gallagher <jgallagher@xxxxxxxxxxx>
wrote:

>
> > On Jul 14, 2015, at 12:37 PM, Signell, Richard <rsignell@xxxxxxxx> wrote:
> >
> > Folks,
> > I discovered some info here on OPeNDAP and S3 from the hyrax boyz, in
> > case of interest:
> > http://docs.opendap.org/index.php/OPULS:_NOAA_S3_Data_Access
>
> And here’s a short paper we wrote on it:
>
>
> James
>
> PS. after the work on S3 we also expanded the code to use Glacier - a
> whole different kettle of fish.
>
> >
> > On Tue, Jul 14, 2015 at 3:35 PM, Robert Casey <rob@xxxxxxxxxxxxxxxxxxx> wrote:
> >>
> >> Hi Jeff-
> >>
> >> Of note, Amazon Glacier is meant for infrequently needed data, so a call-up
> >> for data from that source will require something on the order of a 5-hour
> >> wait to retrieve to S3.  I think they are developing a near-line storage
> >> solution that is a bit more expensive to compete with Google's new near-line
> >> storage, which provides retrieval times on the order of seconds.
> >>
> >> -Rob
> >>
> >> On Jul 14, 2015, at 10:10 AM, Jeff McWhirter <jeff.mcwhirter@xxxxxxxxx>
> >> wrote:
> >>
> >> On this note -
> >> What I really want is a file system that can transparently manage data
> >> between primary (SSD), secondary (S3) and tertiary (Amazon Glacier) stores.
> >> Actively used data would migrate into primary storage. Old archived data
> >> moves off into cheaper tertiary storage. I've thought of implementing this
> >> at the application level in RAMADDA but a file-system-based approach would
> >> be much smarter.
> >>
> >> How do the archive folks on this list manage these kinds of storage
> >> environments?
> >>
> >> -Jeff
> >>
> >>
> >>
> >>
> >> On Tue, Jul 14, 2015 at 10:44 AM, John Caron <caron@xxxxxxxx> wrote:
> >>>
> >>> Hi David:
> >>>
> >>> At the bottom of the TDM, we rely on RandomAccessFile. Do you know if S3
> >>> supports that abstraction (essentially POSIX file semantics, e.g. seek(),
> >>> read())? My guess is that S3 only allows complete file transfers (?)
> >>>
> >>> Would be worth investigating if anyone has implemented a java
> >>> FileSystemProvider for S3.
> >>>
> >>> Will have a closer look when I get time.
> >>>
> >>> John
> >>>
> >>> On Mon, Jul 13, 2015 at 7:59 PM, David Nahodil <David.Nahodil@xxxxxxxxxxx> wrote:
> >>>>
> >>>> Hi all,
> >>>>
> >>>>
> >>>> We are looking at moving our THREDDS Data Server to Amazon EC2 instances
> >>>> with the data hosted on S3. I'm just wondering if anyone has tried using TDS
> >>>> with data hosted on S3?
> >>>>
> >>>>
> >>>> I had a quick back-and-forth with Sean at Unidata (see below) about this.
> >>>>
> >>>>
> >>>> Regards,
> >>>>
> >>>>
> >>>> David
> >>>>
> >>>>
> >>>>>> Unfortunately, I do not know of anyone who has done this, although we
> >>>>>> have had at least one other person ask. From what I understand, there is a
> >>>>>> way to mount S3 storage as a virtual file system, in which case I would
> >>>>>> *think* that the TDS would work as it normally does (depending on the kind
> >>>>>> of data you have).
> >>>>
> >>>>
> >>>>> We have considered mounting the S3 storage as a filesystem and running
> >>>>> it like that. However, our feeling was that the tools were not really
> >>>>> production ready and that we're really misrepresenting S3 by pretending it
> >>>>> is a file system. So this is why we're investigating if anyone has used TDS
> >>>>> with the S3 API directly.
> >>>>
> >>>>
> >>>>>> What kind of data do you have? Will your TDS also be in the cloud? Do
> >>>>>> you plan on serving the data inside of Amazon to other EC2 instances, or do
> >>>>>> you plan on crossing the cloud/commodity web boundary with the data, in
> >>>>>> which case that could get very expensive quite quickly?
> >>>>
> >>>>
> >>>>> We have about 2 terabytes of marine and climate data that we are
> >>>>> currently serving from our existing infrastructure. The plan is to move the
> >>>>> infrastructure to Amazon Web Services so TDS would be hosted on EC2 machines
> >>>>> and the data on S3. We're hoping this setup should work okay, but we might
> >>>>> still have a hurdle or two to come. :)
> >>>>
> >>>>
> >>>>> We have someone here who once wrote a plugin/adapter for TDS to work
> >>>>> with an obscure filesystem that our data used to be stored on. So we have a
> >>>>> little experience in what might be involved in doing the same with S3.
> >>>>> We just wanted to make sure that if anyone had done some work already
> >>>>> that we made use of that.
> >>>>
> >>>>
> >>>>>> We very, very recently (as in a day ago) got some Amazon resources to
> >>>>>> play around on, but we won't have a chance to kick those tires until after
> >>>>>> our training workshops at the end of the month.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> thredds mailing list
> >>>> thredds@xxxxxxxxxxxxxxxx
> >>>> For list information or to unsubscribe,  visit:
> >>>> http://www.unidata.ucar.edu/mailing_lists/
> >>>
> >>>
> >>>
> >>
> >>
> >
> >
> >
> > --
> > Dr. Richard P. Signell   (508) 457-2229
> > USGS, 384 Woods Hole Rd.
> > Woods Hole, MA 02543-1598
> >
>
> --
> James Gallagher
> jgallagher at opendap.org
> 406.723.8663
>
>
>
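On the RandomAccessFile question quoted above: S3's GET Object operation accepts the standard HTTP Range header, so partial reads are possible without transferring a whole object. Below is a minimal sketch of how seek()/read() semantics could be layered on byte-range requests. It is written in Python for brevity (TDS itself is Java), and the names RangeReader, http_fetch and memory_fetch are hypothetical, not part of TDS or any AWS SDK. It assumes the object size is already known (for example from a prior HEAD request) and that the URL is readable without request signing.

```python
import urllib.request
from typing import Callable

# A fetcher takes (start, length) and returns the bytes in that range.
Fetcher = Callable[[int, int], bytes]


def range_header(start: int, length: int) -> str:
    """Build an HTTP Range header value for `length` bytes from `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive at both ends.
    return f"bytes={start}-{start + length - 1}"


def http_fetch(url: str) -> Fetcher:
    """Fetcher that issues one ranged GET per call (assumes an unsigned URL)."""
    def fetch(start: int, length: int) -> bytes:
        req = urllib.request.Request(
            url, headers={"Range": range_header(start, length)}
        )
        with urllib.request.urlopen(req) as resp:
            # A server that ignores Range would answer 200 with the full body;
            # this sketch does not guard against that case.
            return resp.read()
    return fetch


def memory_fetch(blob: bytes) -> Fetcher:
    """In-memory stand-in for a remote object, useful for offline testing."""
    def fetch(start: int, length: int) -> bytes:
        return blob[start:start + length]
    return fetch


class RangeReader:
    """Seekable, read-only view of a remote object via ranged fetches."""

    def __init__(self, fetch: Fetcher, size: int) -> None:
        self._fetch = fetch
        self._size = size  # assumed known up front
        self._pos = 0

    def seek(self, pos: int) -> int:
        if pos < 0:
            raise ValueError("negative seek position")
        self._pos = pos
        return pos

    def tell(self) -> int:
        return self._pos

    def read(self, n: int) -> bytes:
        # Clamp the request to the end of the object, like a file read at EOF.
        n = min(n, self._size - self._pos)
        if n <= 0:
            return b""
        data = self._fetch(self._pos, n)
        self._pos += len(data)
        return data
```

For example, RangeReader(memory_fetch(blob), len(blob)) exercises the logic offline, and swapping in http_fetch(url) would issue one ranged GET per read. Every read() here is a network round trip, so a real adapter would probably want block caching or read-ahead.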