
[Datastream #XAJ-729154]: Re: [ldm-users] 20140320: Impending changes to IDD FNEXRAD and UNIWISC datastreams



Hi Rodger,

Long time no hear!  I hope you are doing well...

re:
> I was interested to read your statement "Both datastreams are now being
> created in a 64-bit CentOS 6.5 VM in the Amazon EC2 cloud.". Have you
> written up anything to describe how this is being done?

No, but it is pretty simple:

- we contracted for a dual vcore instance with 7.5 GB of RAM and 400 GB
  of disk space

- we opted to use a CentOS 64-bit VM for this instance

  This is exactly the same environment I "live in" on my Windows 7 laptop
  except that I run VMware Player in Windows and then CentOS 6.5 64-bit
  in it.  This is my McIDAS development environment, and it goes with
  me everywhere I take my laptop.

- we built LDM, GEMPAK and McIDAS in the VM and then ported the various
  mostly cron-based scripts that generate the FNEXRAD Level III national
  composites and Unidata-Wisconsin satellite image sectors to the
  instance

- the last step in the generation of each product is compressing the
  product and then inserting it into the LDM queue under the 'ldm'
  account (a rough sketch of this step follows the list)

- three Unidata machines here at UCAR are REQUESTing all FNEXRAD and
  UNIWISC (aka MCIDAS) products from the Amazon instance (example
  ldmd.conf entries are also sketched after this list)

  The REQUESTing machines are all front ends for top level IDD relay
  nodes that we maintain.  The products then flow from our top level
  relay node (a multi-machine cluster) to machines that REQUEST the
  datastreams, and so on.
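
To make the last few steps more concrete, here is a rough sketch of
the pieces involved.  The hostnames, queue path, product names, cron
timing and use of gzip below are placeholders I made up for
illustration, not the exact values we run with:

  # crontab entry on the EC2 instance ('ldm' account): run a
  # composite-generation script periodically (timing is illustrative)
  */10 * * * * /home/ldm/util/make_n0r_composite.sh

  # at the end of that script: compress the finished product and
  # insert it into the LDM queue
  gzip -c n0r_composite.gini > n0r_composite.gini.gz
  pqinsert -q /home/ldm/var/queues/ldm.pq -f FNEXRAD n0r_composite.gini.gz

  # ldmd.conf on the EC2 instance: let the UCAR front ends REQUEST
  # the two feeds
  ALLOW   FNEXRAD|UNIWISC   ^.*\.unidata\.ucar\.edu$

  # ldmd.conf on each UCAR front end: REQUEST the feeds from the
  # EC2 instance
  REQUEST FNEXRAD   ".*"   ec2-instance.example.com
  REQUEST UNIWISC   ".*"   ec2-instance.example.com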

re:
> I assume that you are running the LDM on Amazon EC2.

Yes.

re:
> How is that economically possible
> given the large amount of network traffic that you must be passing
> to/from Amazon?

The traffic volume coming out of the instance in EC2 is fairly modest.
For reference, here is a snapshot of volumes of LDM traffic from
our EC2 instance:

Data Volume Summary for amazon.ecw2_1.unidata.ucar.edu

Maximum hourly volume    599.236 M bytes/hour
Average hourly volume    523.993 M bytes/hour

Average products per hour      30072 prods/hour

Feed                    Average  [% of total]      Maximum       Products
                  (M byte/hour)               (M byte/hour)   number/hour
NEXRAD3                 338.149    [ 64.533%]      386.060    29907.957
FNEXRAD                  91.746    [ 17.509%]      114.633      107.021
UNIWISC                  73.478    [ 14.023%]      121.949       47.511
EXP                      20.497    [  3.912%]       32.074        7.830
HDS                       0.122    [  0.023%]        0.124        1.957

The datastreams flowing into the Amazon EC2 instance are NEXRAD3 (but
only the products that would make sense as national composites), EXP
(we are being fed the ARCTIC/ANTARCTIC composites from UW/SSEC/AMRC)
and a very small number of HDS products.  Only FNEXRAD and UNIWISC are
flowing out of the cloud, so the volumes are very modest in comparison
to what flows in the IDD as a whole.
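
For rough scale, the two outbound feeds together average about:

  91.746 + 73.478              ~= 165 M bytes/hour
  165 M bytes/hour x 24 x 30   ~= 119 G bytes/month

per downstream connection, so even with the three UCAR front ends
REQUESTing the feeds, the instance sends out well under half a
terabyte per month.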

re:
> Just curious since almost everything seems to be moving
> to the cloud these days..

Yup.  Given that the pricing structure for EC2 is pretty hard to
figure out (!!), we decided to get our toes wet (as opposed to
getting our feet wet) by experimenting with generation of the
FNEXRAD and UNIWISC products first.  This also had the beneficial
effect of moving some processing off of our motherlode server
and off of a very old machine that we have been running in the
UW/SSEC Data Center (this machine will be retired in the not
too distant future).

Since you mentioned it, operating a motherlode-clone in the
Amazon cloud would be very expensive -- an estimate using volumes
being served off of motherlode about a year ago was in the
neighborhood of $250K (!) per year.  The instance we are currently
working with should end up costing us between $3K and $3.6K per year.
The problem is that you pretty much never know what something costs
until you get the bill.

We are investigating other clouds (MS Azure and Google) to see
if/where we can save money AND increase services.

One other thing: it appears that intra-cloud transfers of data
are very cheap.  This should mean that we could operate server
instances like motherlode in the cloud and provide data to other
machines running in the same cloud for little to no money (at
least for the time being).  This setup might work well if Unidata
community members moved their "stuff" to the cloud.

Cheers,

Tom
--
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: XAJ-729154
Department: Support Datastream
Priority: Normal
Status: Closed