Unidata Outreach Accomplishments and Challenges

Ben Domenico, September 2013

Relationship to Current Unidata Strategic Plan

Below are a few excerpts from the current Unidata Strategic Plan that highlight the importance of the outreach activities summarized in this status update:

  • ... to build infrastructure that makes it easy to integrate and use data from disparate geoscience disciplines

  • Data formats like netCDF, together with community-based data standards like the Climate and Forecast metadata convention and the Common Data Model are enhancing the widespread usability and interoperability of scientific datasets.

  • Advance geoscience data and metadata standards and conventions

  • ... our experience shows us that robust solutions arise from community and collaborative efforts

  • ... close partnerships and collaboration with geoscience data providers, tool developers, and other stakeholders, and the informed guidance of our governing committees will all be important catalysts for Unidata’s success.

Relationship to Unidata 2013 Proposal

This work relates to several of the proposal goals: 1. Broadening participation and expanding community services; 2. Advancing data services; 3. Developing and deploying useful tools; 5. Providing leadership in cyberinfrastructure.

As noted in the following two sections, the work was called out specifically in an exchange with the review panel and in the review panel summary.

Review panel question and UPC response

1e. Is the UPC prepared to provide the same quality of support to the newly engaged communities as it provides to its current constituents?

While support for all users will remain at a very high level, that does not mean it will be exactly the same. For example, for the core community Unidata provides comprehensive support for a full suite of tools, from data services through decoders to complete analysis and display packages. In other cases, the tools specialized to a community may not be available from or supported by the UPC. One example is the community of users of GIS tools. In that case Unidata supports standards-based web services that make our datasets available in such a way that tools incorporating those standard interfaces can use Unidata datasets. Thus these new communities can continue to use the analysis and display tools they are familiar with while taking advantage of the data services of the traditional Unidata community.

Excerpt from the proposal review panel report

Advocacy for Community Standards:  "In particular, the UPC could play a significant leadership role within committees and consortiums like OGC seeking to address the need to develop standards and technologies for data discovery. Unidata leadership and advocacy in this area could facilitate expanded utilization of Unidata information resources for other research areas like climate and provide Unidata users with easier access to other data sources like NASA satellite information. However, the OGC letter of recommendation in the proposal and the Unidata responses to the review panel questions regarding cyberinfrastructure did demonstrate that the Unidata was actively involved in community discussion of interface and data standards."

Summary of Recent Progress

Cloud-based Collaborative Python Development

For the first time, Unidata included a workshop session on software development using Python. The workshop had identical development environments set up on each workstation in the lab so that participants could work with the same set of IPython notebooks illustrating how Python code can be used to access, analyze, and display Unidata data. One additional experiment used a cloud-based Python environment called Wakari that is available from Continuum Analytics. Most of the workshop notebooks worked immediately in the Wakari environment. A few notebooks needed minor editing and, in one case, an additional library had to be loaded. The advantage of this approach is that one does not have to configure the same working environment on each local computer used for development. The development environment is available in the cloud and can be accessed from a browser on any workstation.

Subsequent attempts to share the Wakari environments with a collaborator were not completely successful. Different users can clone a Wakari environment, but that results in two separate copies of the environment rather than two collaborators sharing the same one. The Wakari support team indicates that a new release of the system will include such a shared collaborative project environment, but it will not be available for several months. In the meantime, the plan is to see whether sharing a single account works as a way to get started.

Progress on OGC standardization of CF-netCDF

As the official UCAR representative to the OGC Technical Committee, Unidata participates in 3-4 technical committee meetings per year to ensure that Unidata and UCAR needs are met in the emerging international standards.

In 2011, the netCDF Classic data model was established as the OGC core netCDF standard. The binary encoding for the classic data model was established as the first extension to the netCDF core standard. Since the last Policy Committee report, the netCDF enhanced data model and the CF (Climate and Forecast) conventions have been formally adopted as extensions to the netCDF core standard. The OGC-adopted standards documents are available at

http://www.opengeospatial.org/standards/netcdf
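
As an illustration of what these standards cover, the sketch below reads a small NcML (netCDF Markup Language) description of a dataset and lists the classic-model building blocks (dimensions, variables, attributes) together with the CF attributes attached to a variable. The NcML text, the variable name air_temp, and the helper summarize_ncml are hypothetical and exist only for this example; it is a minimal sketch using the Python standard library, not a substitute for the netCDF libraries.

```python
import xml.etree.ElementTree as ET

# XML namespace used by NcML 2.2 documents.
NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"

# Hypothetical NcML text, made up for illustration (not a real dataset).
NCML_TEXT = """\
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <dimension name="time" length="4"/>
  <dimension name="lat" length="3"/>
  <dimension name="lon" length="5"/>
  <attribute name="Conventions" value="CF-1.6"/>
  <variable name="air_temp" shape="time lat lon" type="float">
    <attribute name="standard_name" value="air_temperature"/>
    <attribute name="units" value="K"/>
  </variable>
</netcdf>
"""


def summarize_ncml(text):
    """Return (dimensions, global attributes, variables) found in NcML text."""
    root = ET.fromstring(text)
    ns = {"nc": NCML_NS}

    # Dimensions: name -> length.
    dims = {d.get("name"): int(d.get("length"))
            for d in root.findall("nc:dimension", ns)}

    # Global attributes are the attribute elements directly under the root.
    global_attrs = {a.get("name"): a.get("value")
                    for a in root.findall("nc:attribute", ns)}

    # Variables carry a shape (a list of dimension names), a type,
    # and their own attributes, which is where CF metadata lives.
    variables = {}
    for v in root.findall("nc:variable", ns):
        attrs = {a.get("name"): a.get("value")
                 for a in v.findall("nc:attribute", ns)}
        variables[v.get("name")] = {
            "shape": v.get("shape", "").split(),
            "type": v.get("type"),
            "attrs": attrs,
        }
    return dims, global_attrs, variables


dims, global_attrs, variables = summarize_ncml(NCML_TEXT)
print(dims)
print(global_attrs)
print(variables)
```

The dimensions, variables, and attributes correspond to the classic data model; the Conventions, standard_name, and units attributes are the kind of metadata the CF conventions standardize.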

This completes the primary objectives we had laid out for the CF-netCDF standards initiative in the OGC. However, the CF-netCDF Standards Working Group (SWG) is also considering NcML (netCDF Markup Language) as an XML encoding format for netCDF. In addition, a new initiative for encoding uncertainty information has been formally adopted as an OGC Discussion Paper.

http://www.opengeospatial.org/node/1778

An OGC Technical Committee meeting is scheduled for the week of September 23. There will be a CF-netCDF SWG session on Wednesday. Currently the agenda includes:
  • Status update on specification to establish CF-netCDF as an encoding format in OWS Common (Stefano Nativi)
  • Update on netCDF Uncertainty Conventions discussion paper (Lorenzo Bigagli)
  • Report on Prod-Trees initiative (Paolo Mazzetti) 

Data Access Protocol Issues

At recent OGC Technical Committee meetings, the Coverages DWG (Domain Working Group), the WCS (Web Coverage Service) SWG (Standards Working Group), and the CF-netCDF SWG have taken up the question of how to incorporate coverage encodings (e.g., GeoTIFF, JPEG2000, netCDF) into OGC protocol specs (not just WCS but also possibly WFS, SOS, WPS, ...). There is general agreement that these coverage encoding specifications (e.g., the encoding data model mappings to GMLCOV and the special parameters for each binary encoding) should be decoupled from the data access protocols. This is a departure from our original proposal, which specified CF-netCDF specifically as an encoding for WCS 2.0.
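
From a client's point of view, the decoupling can be pictured as follows: in a key-value request, the encoding shows up only as a format parameter naming a media type, while the rest of the request is unchanged. The sketch below builds a WCS 2.0 style GetCoverage URL. The endpoint, the coverage identifier, and the media type application/x-netcdf are assumptions made for illustration, and the helper get_coverage_url is hypothetical; a real server advertises its supported formats in its capabilities document.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def get_coverage_url(endpoint, coverage_id, media_type):
    """Build a WCS 2.0 key-value GetCoverage request URL.

    Only the "format" value changes when a different coverage
    encoding is requested; the other parameters stay the same.
    """
    params = {
        "service": "WCS",
        "version": "2.0.1",
        "request": "GetCoverage",
        "coverageId": coverage_id,
        "format": media_type,
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and coverage id, for illustration only.
ENDPOINT = "https://example.org/wcs"
COVERAGE = "air_temperature_coverage"

netcdf_url = get_coverage_url(ENDPOINT, COVERAGE, "application/x-netcdf")
geotiff_url = get_coverage_url(ENDPOINT, COVERAGE, "image/tiff")

# Show that the two requests differ only in the requested encoding.
for url in (netcdf_url, geotiff_url):
    print(parse_qs(urlsplit(url).query))
```

Because the encoding is named by a single parameter, the same encoding specification could in principle be referenced by other service protocols without being rewritten for each one.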

The previous report in this series noted that the OGC Architecture Board is also considering ways to streamline and simplify some of the rather rigid requirements for how specification documents are written. Possible mechanisms include breaking specs into fewer modules, providing a cleaner and less distracting means of dealing with the HTTP URI requirements of the OGC Naming Authority, and perhaps placing less emphasis and dependence on UML diagrams. Having spent a large fraction of our time writing and rewriting those portions of the existing CF-netCDF documents, we see these as definite moves in the right direction.

OGC Standards Actions

  • Enhanced (netCDF4) data model adopted as OGC extension standard to netCDF core.
  • CF conventions adopted as OGC extension standard to netCDF core.
  • OGC discussion initiated on best mechanism for connecting CF-netCDF encoding to various OGC service protocols
  • CF-netCDF encoding for OGC Web Services Common (OWScommon) has been drafted
  • Discussion Paper published on Uncertainty Conventions for netCDF
  • OPeNDAP access protocol needs to be coordinated
  • HDF encoding needs to be coordinated
  • Australian Bureau of Meteorology is adopting CF-netCDF as standard for climate data

Additional Outreach Activities

Outstanding JGE Paper Award for Past Efforts

Earlier work by many in the Unidata community on the NSF-funded AccessData project was recognized with the award for the outstanding publication of the Journal of Geoscience Education (JGE). The paper is "A Model for Enabling an Effective Outcome-Oriented Communication Between the Scientific and Educational Communities" by Tamara Shapiro Ledley, Michael R. Taber, Susan Lynds, Ben Domenico, and LuAnn Dahlman.

http://nagt-jge.org/doi/pdf/10.5408/11-234.1

Jeff Weber, the NCAR GIS Project Team, and many members of the Unidata community participated in the workshops that provided the foundation for the paper. One of the review committee members commented that the paper "reports an innovative approach to curriculum development, which has been developed and evaluated over a protracted timescale, and which has yielded some excellent outcomes in terms of educational resources. It also addresses the critical issue of knowledge transfer between scientists and educators, and the increased need to incorporate communication and outreach into funding bids. The workshop approach described in the paper is relevant outside of the US. The paper contains sufficient detail for the model to be developed and implemented anywhere in the world, and use of the web for dissemination means that resources can be freely accessed. The potential for advancement of geoscience education is therefore significant. There is also clear potential for societal impact. Collaboration between educators and scientists is critical to designing resources which are accessible to a broad range of learners, while engagement with large scientific organizations to gain access to data is a powerful means of linking science and society."

Active and Ongoing Collaborations:

  • NCAR GIS Program
  • Collaboration with ESSI Labs to experiment with their brokering layer in conjunction with THREDDS Data Servers
  • UCAR-wide representative to OGC Technical Committee
  • Australian Navy THREDDS Use

Relatively New Emerging Collaborations

  • ESSI Labs collaboration on cloud-based client and server approaches
  • Collaborative European / US / Australian effort on the Ocean Data Interoperability Platform (ODIP)
  • Australian Bureau of Meteorology Climate Data
  • Google Earth Engine
  • Wakari Cloud-based Collaborative Python Development Environment

Areas to Consider Reduced Commitment or Transferring Responsibility

  • Marine Metadata Interoperability (MMI) Project Steering Team
  • NOAA Climate Prediction and Projection Pilot Platform (NCPP)
  • CUAHSI Standing Committee
  • AGU ESSI Focus Group Board
  • ESIN Journal Editorial Board
  • Liaison to OOI Cyberinfrastructure Project
  • Collaborations with EarthCube teams
  • Potential collaboration with SDSC team on annotating datasets with information gained from support archives
  • U of Texas EarthCube Building Blocks project

Planned Activities

ODIP

The ODIP (Ocean Data Interoperability Platform) was funded by the European Commission, and we continue to work with the San Diego Supercomputer Center and Woods Hole to get the US part of the project funded by NSF. Unidata's technologies (especially THREDDS and netCDF) are part of the project, and we also maintain a liaison role to make our community aware of the work and its possible applications. Unidata participated in the initial workshop in February.

http://www.odip.org/content/news_details.asp?menu=0100000_000001
http://seadatanet.maris2.nl/newsletter.asp#70

There will be a special ODIP session at the IMDIS conference on September 23, and the second ODIP Workshop is scheduled for December, the week before the AGU meeting.

Cloud-based Development

The new initiative to experiment with cloud-based development environments (initially Wakari) will be a main focus in the near future, working in collaboration with the NCAR GIS Project and with ESSI Labs. This will be a discussion topic during the visit to ESSI Labs later this month.

Ongoing OGC Standards Work

For CF-netCDF standardization, the main remaining objective is to establish CF-netCDF as a standard encoding for all OGC web service data access specifications (WMS, WFS, WCS, SOS) -- hopefully without having to generate a different spec for each service. It should be noted, however, that with CF-netCDF established as an international OGC encoding standard, the primary objectives have been accomplished. The discussion paper on netCDF conventions for encapsulating uncertainty information has been approved and is under active discussion; the outcome will determine whether it is eventually proposed as an additional extension to the netCDF core standard. Work is likely to accelerate on collaborations with OPeNDAP and HDF, both of which are now active in the OGC. An approach for dealing with the HDF5 encoding of the netCDF enhanced data model is still being sought.

The effort to establish CF-netCDF as an encoding format for WCS (as well as WFS, WPS, and SOS in the long term) has led to an OGC discussion about a more general mechanism for establishing encoding formats for multiple OGC data access protocols.   Unidata is participating actively in these developments.


Unidata will chair the CF-netCDF SWG at the OGC TC meeting later this month.

Relevant Metrics

  • Outstanding publication in the Journal of Geoscience Education (JGE): "A Model for Enabling an Effective Outcome-Oriented Communication Between the Scientific and Educational Communities" by Tamara Shapiro Ledley, Michael R. Taber, Susan Lynds, Ben Domenico, and LuAnn Dahlman.
  • Two netCDF-related OGC international standards (netCDF 4 data model and CF conventions)
  • The lists of collaborations above include a dozen organizations with which we have regular interactions. In most cases, our interactions are as representatives of our community on their steering or policy groups, so we have at least some voice in their direction.
  • One additional international collaboration (Australian Bureau of Meteorology work on CF-netCDF for their climate data)
  • One new potential collaboration with industry (Wakari).
  • Over the recent years of these standardization efforts, ESRI has incorporated netCDF among the input and output formats that their ArcGIS tools work with directly. This represents a user community that numbers in the millions, but it isn't possible for us to measure how many of those users now use it to access our data.
  • The standards efforts enable us to collaborate on an ongoing basis with dozens of international organizations -- especially those represented in the OGC MetOceans, Earth System Science, and Hydrology Domain Working Groups.