Re: [python-users] Questions about THREDDS Data Server

  • To: "Keem, Munsung" <munsung-keem@xxxxxxxxx>
  • Subject: Re: [python-users] Questions about THREDDS Data Server
  • From: Ryan May <rmay@xxxxxxxx>
  • Date: Thu, 11 Feb 2016 16:44:22 -0700
Maybe someone else on the list has expertise on working from Python. My
responses are inline below:

On Thu, Feb 11, 2016 at 3:50 PM, Keem, Munsung <munsung-keem@xxxxxxxxx>
wrote:
>
> So, my question is,
>
> 1.       Is there any way to read and process Level II data using the RSL C
> library in a C program through TDS, without downloading the data?
>
> As far as I know, if we use the OPeNDAP C API, it is possible to access
> Level II data on an OPeNDAP server from a C program.
>
> However, in this case, as I said, the data structure looks different from
> the raw Level II data, so we would need to modify our algorithms
> substantially, since they were developed around the RSL C library.
>
> The ideal case for us is to access the raw Level II data through TDS, read
> it using the RSL C library, process it with our C programs, and download
> the final result file.
>
> I am wondering whether this method is possible. Could you share your
> expertise on this problem?
>

The only way to get the raw Level II data from the TDS is to use the
HTTPServer access method, which requires downloading the data. Running the
processing on an EC2 instance (using the raw data) would be pretty
straightforward. (I have no idea about the OPeNDAP C API.)
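A minimal sketch of that download-then-process workflow, assuming the
standard TDS fileServer URL layout; the hostname and the dataset path used
for illustration are hypothetical and should be checked against the actual
catalog:

```python
from urllib.request import urlretrieve

# Only the HTTPServer (fileServer) endpoint serves the raw, unmodified
# Level II archive file; OPeNDAP access goes through the CDM and
# restructures the data.  Root URL here is an assumption.
TDS_ROOT = "http://thredds.ucar.edu/thredds"

def httpserver_url(catalog_path):
    """Turn a dataset's catalog path into its HTTPServer download URL."""
    return "%s/fileServer/%s" % (TDS_ROOT, catalog_path)

def download_volume(catalog_path, local_name):
    """Fetch the raw Level II file so RSL-based C code can read it locally."""
    return urlretrieve(httpserver_url(catalog_path), local_name)

# Hypothetical dataset path, for illustration only:
url = httpserver_url("nexrad/level2/KFTG/Level2_KFTG_20160211_2200.ar2v")
print(url)
```

Once the file is on local disk, your existing C program can be run on it
unchanged (e.g. via subprocess), since RSL sees the same raw bytes it would
from any other download.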


> 2.       If this is impossible, another possible way may be to use Python
> to read Level II data through TDS, and pass it to our C programs.
>
> This is similar to using the OPeNDAP C API, but in this case all the
> necessary programs are already developed.
>
> Also, as I understand it, there is a way to pass Python objects to a C
> program. Am I right?
>
> However, I question the execution speed of this method.
>
> I think simply downloading through TDS and processing the Level II data
> with our C programs could be faster, but I cannot be sure. What do you
> think of this approach?
>
> (For this issue, I tested the reading speed using the Py-ART package. When
> I access Level II data through TDS and read it with
> pyart.io.read_nexrad_cdm, it takes more time than simply downloading.)
>

read_nexrad_cdm has to do a lot of data manipulation and many individual
requests (not to mention that the TDS has to read and manipulate the data),
so I'm not surprised it's slow.
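If you want to quantify that difference yourself, a simple timing wrapper
is enough; the two commented calls sketch the comparison (the URLs are
placeholders, and read_nexrad_cdm / read_nexrad_archive are the Py-ART
readers for remote CDM access and raw archive files, respectively):

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label):
    """Print the wall-clock time taken by the enclosed block."""
    start = time.perf_counter()
    yield
    print("%s: %.1f s" % (label, time.perf_counter() - start))

# Sketch of the comparison (needs pyart and real URLs to actually run):
# with timed("CDM remote read"):
#     radar = pyart.io.read_nexrad_cdm(opendap_url)      # many small requests
# with timed("HTTP download + local read"):
#     urllib.request.urlretrieve(http_url, "vol.ar2v")   # one big request
#     radar = pyart.io.read_nexrad_archive("vol.ar2v")
```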

> 3.       Finally, I would like to know: when I download Level II data
> through TDS to an AWS EC2 instance, is there any cost for downloading the
> data, regardless of Availability Zone and Region?
>
> I wonder about the difference between using TDS and AWS S3 for downloading
> the data. With S3, if my EC2 instance is located in a different Zone or
> Region, a charge applies.
>
> However, I am not sure about TDS.
>

Data transferred into your instance from S3 would be free (and at worst
would cost AWS, who pays for the S3 bucket). Data transferred from the TDS
would cost us between $0.01 and $0.02 / GB, depending on zone/region (see
"Data Transfer" here: https://aws.amazon.com/ec2/pricing/). Even in the same
region and zone it would cost money, unless we figure out how to access the
data over Unidata's EC2 instance's *private* IP address. You'd pay more for
data out of your EC2 instance, which is usually $0.09 / GB.
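To make those rates concrete, here is the arithmetic for a hypothetical
500 GB pull, using the per-GB figures quoted above (check current EC2
pricing before relying on them):

```python
# USD per GB, as quoted in this thread.
S3_TO_EC2 = 0.00    # S3 -> your EC2 instance: free
TDS_TO_EC2 = 0.02   # our TDS -> your instance: $0.01-$0.02/GB (worst case)
EC2_OUT = 0.09      # your instance -> the internet

def transfer_cost(gigabytes, rate_per_gb):
    """Total transfer cost in USD for a given volume and rate."""
    return gigabytes * rate_per_gb

# Pulling 500 GB of Level II data:
print("$%.2f" % transfer_cost(500, S3_TO_EC2))   # $0.00
print("$%.2f" % transfer_cost(500, TDS_TO_EC2))  # $10.00
print("$%.2f" % transfer_cost(500, EC2_OUT))     # $45.00
```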

My recommendation would be to spin up an EC2 machine, download the data
there, and run your original code there. If we wanted to avoid some of the
transfer costs (really only an issue if you need many TBs of data), it
would be trivial to take the results from the TDS radar query service (the
catalog) and turn them into links to the original S3 files.
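That catalog-to-S3 mapping could look something like the sketch below. The
"noaa-nexrad-level2" bucket and its YYYY/MM/DD/SITE/ key layout are as
documented for the NEXRAD data on AWS; the TDS dataset-name pattern is an
assumption, and since the TDS name carries only HHMM while S3 object keys
include seconds, the function returns a key *prefix* to list against rather
than a full key:

```python
import re

BUCKET = "noaa-nexrad-level2"

def s3_key_prefix(tds_name):
    """Map e.g. 'Level2_KFTG_20160211_2200.ar2v' to an S3 key prefix.

    Suitable for an S3 ListObjects call with prefix=..., which would
    then yield the full key (including seconds and version suffix).
    """
    m = re.match(r"Level2_(\w{4})_(\d{4})(\d{2})(\d{2})_(\d{4})", tds_name)
    if m is None:
        raise ValueError("unrecognized dataset name: %r" % tds_name)
    site, year, month, day, hhmm = m.groups()
    return "%s/%s/%s/%s/%s%s%s%s_%s" % (
        year, month, day, site, site, year, month, day, hhmm)

prefix = s3_key_prefix("Level2_KFTG_20160211_2200.ar2v")
print("s3://%s/%s" % (BUCKET, prefix))
```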

Ryan

-- 
Ryan May, Ph.D.
Software Engineer
UCAR/Unidata
Boulder, CO