Python

Status Report: October 2014 - March 2015

Ryan May, Sean Arms, Julien Chastang, Ward Fisher, Russ Rew, Ben Domenico

Strategic Focus Areas

Python activity at Unidata supports the Unidata strategic goals in the following ways:

  1. Enable widespread, efficient access to geoscience data. Python can facilitate data-proximate computations and analyses through IPython (now Jupyter) Notebook technology. In particular, IPython Notebook web servers can be co-located with the data source, so that analysis and visualization happen through a web browser. This capability, in turn, reduces the amount of data that must travel across computing networks. External providers such as Wakari and coLaboratory also help promote the use of this technology as a cloud service.
  2. Develop and provide open-source tools for effective use of geoscience data. Our current and forthcoming Python efforts will facilitate analysis of geoscience data, chiefly by continuing to develop Python APIs tailored to Unidata technologies. For the summer 2013 Unidata training workshop, we developed an API to facilitate data access from a THREDDS Data Server (TDS). This effort was later folded into the new pyUDL project, a collection of Python utilities for interacting with Unidata technologies. Moreover, Python coupled with HTML5-based IPython Notebook technology has the potential to address "very large datasets" problems. In principle, an IPython Notebook can be co-located with the data source and accessed via a web browser, allowing geoscience professionals to analyze data where the data reside without moving large amounts of information across networks. This concept fits nicely with the "Unidata in the cloud" vision. Lastly, as a general-purpose programming language, Python can analyze and visualize diverse data in one environment through numerous well-maintained open-source APIs.
  3. Provide cyberinfrastructure leadership in data discovery, access, and use. The TDS catalog crawling capabilities found in pyUDL will facilitate access to data remotely served by the Unidata TDS, as well as by other TDS instances around the world. The goal of pyCDM is to construct a geoscience-focused data model in Python, based heavily on the netCDF-Java implementation of the Common Data Model (CDM). pyCDM is anticipated to provide a simple, Pythonic API to the higher-level functionality of the FeatureType layer of the CDM.
  4. Build, support, and advocate for the diverse geoscience community. Based on grassroots interest from the geoscience community, Unidata, as part of its annual training workshop, will host a three-day session to explore “Python with Unidata technology”. Also, to encourage the use of netCDF in Python, Unidata has promoted Jeff Whitaker’s netCDF4-python project, including hosting its repository under Unidata’s GitHub account.
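
The catalog crawling mentioned in item 3 can be illustrated with a minimal sketch. This is not the pyUDL implementation; it assumes only that a TDS catalog is an XML document in the THREDDS InvCatalog 1.0 namespace whose leaf dataset elements carry name and urlPath attributes. The sample catalog and the helper name list_datasets are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by THREDDS InvCatalog 1.0 documents.
THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"

# Made-up catalog for illustration only; a real one would be fetched
# from a TDS catalog.xml endpoint.
SAMPLE_CATALOG = """
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         name="Example catalog">
  <dataset name="Example collection">
    <dataset name="run_0000.nc" urlPath="example/run_0000.nc"/>
    <dataset name="run_1200.nc" urlPath="example/run_1200.nc"/>
  </dataset>
</catalog>
"""


def list_datasets(catalog_xml):
    """Return {dataset name: urlPath} for every dataset that has a urlPath.

    Hypothetical helper: container datasets without a urlPath are skipped.
    """
    root = ET.fromstring(catalog_xml.strip())
    found = {}
    # iter() walks the whole tree, so nested datasets are included.
    for ds in root.iter("{%s}dataset" % THREDDS_NS):
        url_path = ds.get("urlPath")
        if url_path is not None:
            found[ds.get("name")] = url_path
    return found


if __name__ == "__main__":
    for name, path in sorted(list_datasets(SAMPLE_CATALOG).items()):
        print(name, "->", path)
```

A real crawler would also follow catalogRef elements to nested catalogs and combine each urlPath with a service base URL; both steps are omitted here to keep the sketch short.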

Activities Since the Last Status Report

  • Users’ Workshop
    • 21 people attended the Python Users’ workshop in October 2014
    • Takeaways:
      • Cartopy worked out well
      • Anaconda worked out very well
  • CDMRemote
    • Prototyped Python implementation of THREDDS’ CDMRemote protocol to provide OPeNDAP-like access to data
    • Pure Python implementation facilitates remote data access from cloud-based services using Wakari and/or the IPython Notebook
    • Prototype sets the stage for further protocol development to expose more of the details and semantic information available from the CDM in netCDF-Java
  • WAVE
    • Tech demonstration of using client-side JavaScript and WebGL in the browser to talk to an IPython Notebook-based server
    • Explores cloud-based visualization of data in the web browser using open standards
    • Utilizes existing Python base for data access (from THREDDS server) and analysis
    • Presented at 2015 AMS Annual Meeting in Python symposium
  • Matplotlib support
    • Worked through a variety of support issues with Matplotlib’s animation support, which is popular with the community
  • Unidata Python on Docker
  • 2015 Unidata Summer Python Training Workshop
  • Discussion with the UK Met Office
    • John Caron met with the UK Met Office team responsible for development of the Iris package. The team seems keen to collaborate, and an invitation was extended for some of its developers to visit Unidata after the 2015 SciPy conference.
    • Risk: Any collaborative activities will be limited by the staffing resources available to commit to Python development
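
For the CDMRemote item above, the following sketch shows how a pure-Python client might form request URLs. It is not the prototype's code. It assumes the query form described in the netCDF-Java CDM Remote documentation, where req selects the request type (for example header or data) and var carries a variable name with an optional index section. The endpoint URL, the variable name, and the helper cdmremote_url are hypothetical.

```python
from urllib.parse import urlencode


def cdmremote_url(endpoint, req, var=None):
    """Build a CDMRemote-style request URL (hypothetical helper).

    endpoint: base URL of a dataset under a server's cdmremote service
    req:      request type, assumed here to be e.g. "header" or "data"
    var:      optional variable spec such as "Temperature(0:1,0:10)"
    """
    params = {"req": req}
    if var is not None:
        params["var"] = var
    # Keep the section syntax readable instead of percent-encoding it.
    return endpoint + "?" + urlencode(params, safe="(),:")


if __name__ == "__main__":
    # Placeholder endpoint; not a real server.
    base = "http://example.invalid/thredds/cdmremote/demo.nc"
    print(cdmremote_url(base, "header"))
    print(cdmremote_url(base, "data", "Temperature(0:1,0:10)"))
```

Decoding the binary responses is the substantial part of a client and is outside the scope of this sketch.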

Planned Activities

Ongoing Activities

We plan to continue the following activities:

  • Continue planning for the Unidata Summer Python Training Workshop
  • As time allows, continue to explore WebGL + IPython Notebook + THREDDS access within WAVE
  • IPython Notebook
    • Install IPython Notebook on a server to explore data-proximate analysis in Python
    • Evaluate IPython community solutions for multi-user management and authentication: tmpnb and Project Jupyter's JupyterHub
  • CDMRemote
    • Continue implementing the remaining parts of the CDMRemote protocol
    • Currently pursuing funding to implement the CDM in Python (pyCDM), including support for CDMRemoteFeature
  • Continue relevant Matplotlib support

New Activities

  • Actively working on a contribution to Matplotlib to facilitate its use for making station plots; this fills a large missing piece in using Matplotlib for day-to-day meteorological analysis
  • Open up MetPy as a useful landing point for some Python work: Skew-T, NEXRAD reader, full station plots

Areas for Committee Feedback

The Python group is requesting your feedback on the following topics:

  1. What are the biggest obstacles that you see to the use of Python with other Unidata technologies, or for use in meteorology in general?
  2. Given the limited staffing resources for Python activities, how would you prioritize the various ongoing and new activities listed above?

Relevant Metrics

  • Matplotlib Animation Support
    • 3 Pull Requests reviewed and committed
    • 4 Issues supported and closed
  • NetCDF4-Python
    • 36 Issues opened (28 closed)
    • 22 Pull Requests (all closed)
    • 99% of the work done by Jeff Whitaker (and Stephan Hoyer)