Unidata Outreach Accomplishments and Challenges

Ben Domenico, October 2011

Relationship to Unidata 2013 Proposal

This work relates to several of the proposal goals: 1. Broadening participation and expanding community services; 2. Advancing data services; 3. Developing and deploying useful tools; 5. Providing leadership in cyberinfrastructure.

As noted in the following two sections, the work was called out specifically in an exchange with the review panel and in the review panel summary.

Review panel question and UPC response

1e. Is the UPC prepared to provide the same quality of support to the newly engaged communities as it provides to its current constituents?

While the support for all users will remain at a very high level, that does not mean it will be exactly the same. For example, for the core community Unidata provides comprehensive support for a full suite of tools, from data services through decoders to complete analysis and display packages. For other communities, the specialized tools they rely on may not be available from or supported by the UPC. One example is the community of GIS users. In that case, Unidata supports standards-based web services that make our datasets available so that tools incorporating those standard interfaces can use Unidata datasets. Thus these new communities can continue to use the analysis and display tools they are familiar with while taking advantage of the data services of the traditional Unidata community.

Excerpt from the proposal review panel report

Advocacy for Community Standards:  "In particular, the UPC could play a significant leadership role within committees and consortiums like OGC seeking to address the need to develop standards and technologies for data discovery. Unidata leadership and advocacy in this area could facilitate expanded utilization of Unidata information resources for other research areas like climate and provide Unidata users with easier access to other data sources like NASA satellite information. However, the OGC letter of recommendation in the proposal and the Unidata responses to the review panel questions regarding cyberinfrastructure did demonstrate that the Unidata was actively involved in community discussion of interface and data standards."

Summary of Recent Progress

Background on netCDF and CF formal standards efforts

Following on the success of Russ Rew and the netCDF team in establishing netCDF and CF as NASA standards, efforts continue to have CF-netCDF recognized internationally by the Open Geospatial Consortium (OGC) as a standard for encoding georeferenced data in binary form.

As the official UCAR representative to the OGC Technical Committee, Unidata participates in 3-4 technical committee meetings per year to ensure that Unidata and UCAR needs are met in the emerging international standards.

The overall plan and status are maintained at http://sites.google.com/site/galeonteam/Home/plan-for-cf-netcdf-encoding-standard.  In keeping with the proposal and review panel recommendations, the goal of this effort is to encourage broader use of Unidata's data by fostering greater interoperability among clients and servers interchanging data in binary form.  Establishing CF-netCDF as an OGC standard for binary encoding will make it possible to incorporate standard delivery of data in binary form via several OGC protocols, e.g., Web Coverage Service (WCS), Web Feature Service (WFS), and Sensor Observation Service (SOS).  For over a year, the OGC WCS SWG has been developing an extension to the core WCS for delivery of data encoded in CF-netCDF.  This independent CF-netCDF standards effort complements the WCS work and, we hope, will facilitate similar extensions for other standard protocols.

Progress on OGC standardization

In January 2011, the OGC Technical Committee voted to adopt netCDF Classic as an official OGC binary encoding standard.  As of this writing, the final standard specifications are being formatted for publication, but the draft standards remain available in three documents: an overview primer, the core standard specification, and the binary encoding specification.  The standards documents are available at

http://www.opengeospatial.org/standards/netcdf

In addition, extension specifications for the netCDF core standard have been drafted for the netCDF enhanced (netCDF4) data model, for the CF conventions, and for the CF-netCDF extension to the OGC Web Coverage Service (WCS).  These draft documents are available at

https://portal.opengeospatial.org/index.php?m=projects&a=view&project_id=82&tab=2&artifact_id=45016

During the week of September 19, Unidata hosted the meetings of the OGC Technical Committee and the Global Earth Observation System of Systems (GEOSS).  Participation (~360) exceeded the previous record by nearly 50%.  Unidata's efforts continued with presentations on the enhanced data model and CF conventions extensions to the core netCDF data model standard.  In addition, our experimentation with web brokering services (namely, GI-cat from the U of Florence ESSI Labs) to establish data search capabilities for THREDDS servers was highlighted in the Met/Ocean/Hydro water cycle summit.

Ongoing Outreach Activities

AccessData (formerly DLESE Data Services) Workshops 

The overall AccessData program is described at:  http://serc.carleton.edu/usingdata/accessdata/ and the most recent workshop page is: http://serc.carleton.edu/usingdata/accessdata/impacts/index.html.   The AccessData team is now working on several publications to document the results of the project.

One of the resulting publications won an AAAS Science Prize for Online Resources in Education (SPORE).  The essay "Making Earth Science Data Accessible and Usable in Education" is available in the online version of Science at:

http://www.sciencemag.org/content/333/6051/1838.full.pdf

Data Discovery Initiatives

In keeping with the Unidata 2013 Proposal review panel recommendation to collaborate with others to enhance the available data discovery facilities, the UPC and the Unidata community are following up on earlier collaborations with George Mason University and NASA.  The most recent work is with the U of Florence ESSI Labs team, using their tools to harvest search metadata from THREDDS data servers, which can pose special challenges because of the size and volatility of their holdings.  A new release of the ESSI Labs GI-cat package has addressed limitations of earlier versions, which ran into difficulty with the Unidata Motherlode THREDDS server.  Members of our community are finding this tool useful enough that Rich Signell has created a YouTube video on "How to Configure GI-CAT for the first time": http://youtu.be/28biJHTQSrM.  This work was described in an invited paper, with David Maidment as the lead author, at the Fall 2010 AGU, and another presentation on the topic was given at the Water Cycle Summit held in conjunction with the OGC Technical Committee meetings at UCAR.  This work continued through the summer with a visiting graduate student, Avirup Gupta from Utah State University.  There is also a group coalescing to propose web brokering services such as GI-cat as a part of the NSF EarthCube.
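
To give a concrete sense of what harvesting search metadata from a THREDDS catalog involves, here is a minimal Python sketch. It is only an illustration and does not represent how GI-cat works internally. The catalog contents, dataset names, and the harvest_datasets helper are hypothetical, and a real harvester would also have to follow catalog references and cope with the size and volatility of actual holdings.

```python
import xml.etree.ElementTree as ET

# XML namespace used by THREDDS InvCatalog 1.0 documents.
THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"

# Hypothetical, hand-written catalog used only for illustration.
SAMPLE_CATALOG = """
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         name="Example catalog">
  <dataset name="Model output">
    <dataset name="Run A" urlPath="example/model/run_a.nc"/>
    <dataset name="Run B" urlPath="example/model/run_b.nc"/>
  </dataset>
</catalog>
""".strip()


def harvest_datasets(catalog_xml):
    """Return (name, urlPath) pairs for every dataset that has a urlPath.

    Container datasets without a urlPath are skipped because they do not
    point at directly accessible data.
    """
    root = ET.fromstring(catalog_xml)
    records = []
    # iter() walks the whole tree in document order, including nested datasets.
    for ds in root.iter("{%s}dataset" % THREDDS_NS):
        url_path = ds.get("urlPath")
        if url_path is not None:
            records.append((ds.get("name"), url_path))
    return records


if __name__ == "__main__":
    for name, url_path in harvest_datasets(SAMPLE_CATALOG):
        print(name, url_path)
```

A search broker would typically convert records like these into its own metadata model so that clients can query them through standard catalog interfaces.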

Other Collaborations:

  • NCAR GIS Program
  • Marine Metadata Interoperability (MMI) Project Steering Team
  • IOOS DMAC Steering Team
  • CUAHSI Standing Committee
  • UCAR-wide representative to OGC Technical Committee
  • AGU ESSI Focus Group Board
  • ESIN Journal Editorial Board
  • Host for OGC Technical Committee Meeting September 2011
  • Liaison to OOI Cyberinfrastructure Project
  • Several possible collaborations with EarthCube teams
  • Possible collaboration with European Commission team on proposal with NSF funding for Unidata

Planned Activities

The next steps in the CF-netCDF standardization effort were noted above.  This work will be coordinated with OPeNDAP, Inc., which recently joined the OGC as a voting member of the Technical Committee, and with the HDF Group, which sent two participants to the September OGC meetings but has not yet formally joined the OGC.

After the last policy committee meeting, I created a white paper based on my "Data Interactive Publications" presentation, which seemed to be well received.  It's available at

https://sites.google.com/site/datainteractivepublications/home/white-paper-on-data-interactive-publications

along with several simple examples.  This may form the basis of a white paper for EarthCube submitted in conjunction with the OGC.  So far, interest in the concept in the EarthCube community has come from:
  • Amy Apon, Chair of Computer Science Division at Clemson University
  • Ian Foster, Distinguished Fellow at Argonne, "father of the grid."
  • Siri Jodha Khalsa, Research Scientist at NSIDC
  • Erik Franklin, Hawaii Institute of Marine Biology

Relevant Metrics

  • One co-authored essay in Science
  • The list of "other collaborations" above includes a dozen organizations with which we interact regularly.  In most cases, our interactions are as representatives of our community on their steering or policy groups, so we have at least some voice in their direction.
  • Over the years of these standardization efforts, ESRI has incorporated netCDF among the input and output formats that their ArcGIS tools work with directly.  This represents a user community that numbers in the millions, but it isn't possible for us to measure how many of those users now use it to access our data.
  • The standards efforts enable us to collaborate on an ongoing basis with dozens of international organizations -- especially those represented in the OGC MetOceans, Earth System Science, and Hydrology Domain Working Groups.