Unidata Cloud Computing

Status Report: October 2014 - March 2015

Fisher, Arms, Caron, Ho, James, Schmidt, Weber, Yoksas, Chastang

Strategic Focus Areas

Unidata's Cloud Computing activities support the following Unidata funding proposal focus areas:

  1. Enable widespread, efficient access to geoscience data

    Making Unidata data streams available via various commercial and private cloud services will allow subscribers to those services to access data quickly and at low cost.

  2. Develop and provide open-source tools for effective use of geoscience data

    Running existing Unidata-developed and supported tools and processes (e.g. IDV, RAMADDA, generation of composite imagery) in a range of cloud environments makes these tools and data streams available to cloud service subscribers at low cost. It also gives us insight into how best to configure existing and new tools for most efficient use in these environments.

  3. Provide cyberinfrastructure leadership in data discovery, access, and use

    Unidata is uniquely positioned in our community to experiment with providing both data and services in the cloud environment. Our efforts to determine the most efficient ways to make use of cloud resources will allow community members to forgo at least some of the early, exploratory steps toward full use of cloud environments.

AWIPS II Cloud Servers

  • Unidata is testing small-footprint EDEX servers (no NEXRAD Level 2 or 3 or high-resolution CONDUIT models) in both the Microsoft Azure and Amazon EC2 cloud server environments.

  • An EC2 instance was created cooperatively by Unidata and Embry-Riddle Aeronautical University (ERAU) as part of ERAU's equipment grant award. This instance, which is configured to run AWIPS II EDEX, has the following characteristics:

    • AWIPS II Size on Disk: 220 GB
    • Grids: 97 GB/day raw, 51 GB processed
    • NGRID: GFS 201, 212, 213, GFS/LAMPTstorm, MOSGuide, NAM 12km, NamDNG5, RTMA 5km and 2.5km, SREF 40km, GEFS, HiResW-NMM and HiResW-ARW, RAP 13km
    • CONDUIT: GFS global 1.0/2.5, NDFD, NAM 40km and 90km, RAP 20km and 40km, GFS 0.5 turned off
    • FSL2: HRRR (72 GB raw)
    • CMC: Regional GEM Model breaks grib2 decoder (turned off)
    • UNIWISC: 5 GB/day
    • FNEXRAD: DHR, DVL, EET, HHC, N0R, N1P, NTP
    • NEXRAD3, FNMOC: turned off

    This instance is currently serving data to AWIPS II 14.2.1 beta testers.

IDV Application-Streaming Cloud Servers

This project is evaluating application streaming as a strategy for making the IDV available to a new generation of users and computing platforms. It uses the Microsoft Azure cloud platform to explore delivering cloud-based IDV-as-a-service instances to our user community on an as-needed basis. The result will be a better understanding of how the IDV performs in cloud environments and of which changes might improve that performance.

This project also serves as a pilot program; with it we will further develop expertise related to cloud computing and application streaming. This will allow us to extend cloud-based software offerings beyond the IDV to other Unidata projects.

Issues

  • How can technologies like Docker reduce the need for multiple VM instances?
  • How can mouse-driven interfaces best be adapted to touch-based devices while minimizing the need to re-engineer any part of the software package?
  • What bandwidth is required for acceptable IDV use?
  • How can this transition be made seamless and painless for our user community?
  • To what extent can we use "off-the-shelf" technology, and under what circumstances do we need to create our own protocols and packages?

Current Status

  • We are able to instantiate cloud-based IDV instances, which are then streamed via existing remote-desktop protocols to iOS devices. Nothing in the existing technology limits this to iOS devices, however; those are simply the devices on hand for testing.
  • Using the Azure Web API, we are able to dynamically allocate and provision VMs to host the IDV.
  • Current efforts are focused on creating a web dashboard which will allow users to register and manage IDV-streaming requests.
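
As an illustration of the bookkeeping such a dashboard would need, the following Python sketch registers streaming requests and caps the number of simultaneously active VMs. It is a minimal sketch under stated assumptions, not the project's implementation: the names StreamRegistry, StreamRequest, and the idv-vm-N naming scheme are hypothetical, and the actual cloud provisioning call is deliberately left out.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import count
from typing import Optional


class RequestState(Enum):
    PENDING = "pending"    # registered, no VM assigned yet
    ACTIVE = "active"      # a VM is assigned to this request
    RELEASED = "released"  # session ended, VM slot is free again


@dataclass
class StreamRequest:
    request_id: int
    user: str
    state: RequestState = RequestState.PENDING
    vm_name: Optional[str] = None


class StreamRegistry:
    """In-memory tracking of IDV-streaming requests (hypothetical design)."""

    def __init__(self, max_vms):
        self.max_vms = max_vms   # upper bound on simultaneously active VMs
        self._ids = count(1)     # simple sequential request ids
        self._requests = {}

    def register(self, user):
        """Record a new request in the PENDING state and return it."""
        req = StreamRequest(next(self._ids), user)
        self._requests[req.request_id] = req
        return req

    def active_count(self):
        return sum(r.state is RequestState.ACTIVE for r in self._requests.values())

    def activate(self, request_id):
        """Assign a VM slot if one is free; return True on success."""
        req = self._requests[request_id]
        if req.state is not RequestState.PENDING:
            return False
        if self.active_count() >= self.max_vms:
            return False
        # Assumption: a real dashboard would call the provider's API here.
        req.vm_name = f"idv-vm-{request_id}"
        req.state = RequestState.ACTIVE
        return True

    def release(self, request_id):
        """Mark the session finished and free its VM slot."""
        req = self._requests[request_id]
        req.state = RequestState.RELEASED
        req.vm_name = None
```

With max_vms set to 1, a second request stays pending until the first is released, which is the behavior a user would see when all streaming slots are in use.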

IDD Product Generation and Additional Experimentation

  • Unidata continues to operate mid-sized virtual machine instances in both the Amazon EC2 and Microsoft Azure west clouds to generate image products for the IDD FNEXRAD (NEXRAD Level III national composites) and UNIWISC (GOES-East/West image sectors) data streams. The EC2 instance is currently the primary source of the FNEXRAD and UNIWISC data streams to IDD participants. The plan is to transition to the Azure cloud instance to reduce the recurring costs of running an instance in EC2. (Ward Fisher spearheaded a Unidata effort that resulted in Microsoft awarding use of 32 small VM instances in Azure free of charge for approximately one year.)

  • Unidata implemented a TDS instance in the Azure west cloud for testing. This effort was put on hold pending renewal of Microsoft's grant of Azure resources.

  • A mid-sized VM instance in Azure is being used to investigate running the IDV in the cloud. RAMADDA has been installed and can generate non-interactive IDV displays, using Xvfb to provide the required X Window environment.
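
To make the Xvfb approach concrete, the Python sketch below builds a command line that runs a program under a temporary virtual X server using the standard xvfb-run wrapper. It is only an illustration: the function name headless_command is hypothetical, and the IDV launcher name and script file in the example are placeholders that would need to be checked against the actual installation.

```python
import shlex


def headless_command(app_argv, screen="1280x1024x24"):
    """Wrap a command so it runs under a throwaway virtual X server."""
    if not app_argv:
        raise ValueError("app_argv must contain at least the program name")
    return [
        "xvfb-run",
        "-a",                         # choose a free display number automatically
        "-s", f"-screen 0 {screen}",  # arguments handed through to Xvfb itself
        *app_argv,
    ]


# Placeholder invocation: launcher name and script file are assumptions.
cmd = headless_command(["runIDV", "render.isl"])
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run on a server without a physical display; the application sees an ordinary X display while rendering off-screen.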

Docker

  • Docker is a new cloud-centric technology that borrows the notion of containers from the shipping industry to facilitate installation and deployment of server-side applications in a cloud environment. We are exploring the possibility of creating Docker containers for cloud distribution and installation of a variety of Unidata technology offerings, including RAMADDA, THREDDS, IDV, AWIPS II (EDEX/CAVE), and LDM.

  • We have reserved the Unidata namespace at DockerHub, and we have a prototype Docker image for RAMADDA and Unidata Python.

  • We are also educating ourselves on Docker technology and will hold an internal Unidata talk on Docker shortly.
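
As a rough illustration of how containerized services could share a single VM, the Python sketch below assembles docker run command lines for two services on one host. The helper name docker_run_argv, the image names, the ports, and the host directory are all hypothetical; only the docker run flags themselves (-d, --name, -p, -v) are standard.

```python
def docker_run_argv(image, name, ports=None, volumes=None):
    """Build an argument list for starting a detached, named container."""
    argv = ["docker", "run", "-d", "--name", name]
    for host_port, container_port in (ports or {}).items():
        argv += ["-p", f"{host_port}:{container_port}"]  # publish a port
    for host_dir, container_dir in (volumes or {}).items():
        argv += ["-v", f"{host_dir}:{container_dir}"]    # mount persistent data
    argv.append(image)
    return argv


# Two hypothetical services on the same host, separated by port.
ramadda = docker_run_argv(
    "example/ramadda", "ramadda",
    ports={8080: 8080}, volumes={"/data/ramadda": "/data"},
)
thredds = docker_run_argv("example/thredds", "thredds", ports={8081: 8080})
```

Each list could be passed to subprocess.run. Because each service runs in its own container, several can coexist on one VM rather than each requiring a dedicated instance.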

Future Activities

The UPC is seeking User Committee input on two possible cloud experiments:
  • Investigate the feasibility of replicating the RAMADDA content on motherlode.ucar.edu in either the Amazon EC2 West or Microsoft Azure West clouds.

    Providing "cloud" access to a large portion of the RAMADDA content currently hosted on motherlode should help mitigate the impact of a catastrophic failure on motherlode. Serving IDV bundles from the cloud is also not expected to generate large amounts of outbound network traffic from "the cloud", so costs should be modest.

  • Investigate replicating the decoded GEMPAK content currently available on all motherlode-class machines (motherlode.ucar.edu, atm.ucar.edu, idd.ssec.wisc.edu, lead.unidata.ucar.edu and weather.rsmas.miami.edu) in "the cloud", and then encourage sites that are web-scraping these data to establish their own presence in the same "cloud" and retrieve the data from there. It would then be the end user's responsibility to pay for the outbound network traffic from their own cloud instances.
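
Both experiments hinge on outbound-traffic charges, so a back-of-the-envelope estimate may help frame the discussion. The Python sketch below is only an illustration: the function name, the request counts, the response sizes, and the price per GB are all made-up placeholders, not measurements or actual provider rates.

```python
def monthly_egress_cost(requests_per_day, avg_response_mb, price_per_gb, days=30):
    """Estimate the outbound-traffic charge for a period, in currency units."""
    gb_out = requests_per_day * avg_response_mb * days / 1024  # MB -> GB
    return gb_out * price_per_gb


# Placeholder scenarios: small IDV bundles versus larger decoded data files.
bundle_cost = monthly_egress_cost(
    requests_per_day=500, avg_response_mb=0.5, price_per_gb=0.10
)
decoded_cost = monthly_egress_cost(
    requests_per_day=500, avg_response_mb=200, price_per_gb=0.10
)
print(f"bundles: {bundle_cost:.2f}, decoded data: {decoded_cost:.2f}")
```

For the same number of requests the charge scales linearly with response size, which is why serving small bundles is expected to be inexpensive while bulk data retrieval is better paid for by the sites doing the retrieving.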