Rosetta

Status Report: September 2014 - March 2015

Sean Arms, Jen Oxelson, Jeff Weber

Strategic Focus Areas

Community Services supports the following goals described in the Unidata Strategic Plan:

  1. Enable widespread, efficient access to geoscience data
    The initial goal of Rosetta is to transform unstructured ASCII data files into the netCDF format; once in this format, standard tools, such as the THREDDS Data Server, IDV, Python, and other analysis packages, can take advantage of these datasets with relative ease.
  2. Develop and provide open-source tools for effective use of geoscience data
    Although the primary goal of Rosetta is to get data into the netCDF format, the transformation process does not stop there. The Rosetta group realizes that not everyone knows how to work with netCDF files, and that some users may feel more comfortable working with other formats. Therefore, Rosetta includes the ability to transform from one format to another (e.g. netCDF to .xls), thereby reducing data friction.
  3. Provide cyberinfrastructure leadership in data discovery, access, and use
    Metadata contained in a netCDF file (no longer locked away in a separate README file) can be extracted automatically, facilitating the discovery of data in these files. Additionally, the Rosetta development plan includes the creation of standard ASCII and spreadsheet representations of the CF-1.6 discrete sampling geometries (DSGs).
  4. Build, support, and advocate for the diverse geoscience community
    Promote the use of standard formats in the dissemination of data, while allowing the flexibility to transform into other formats, as needed, to enable users to "do science". For commonly used formats, such as user-defined ASCII or unstructured spreadsheets, create and advocate for the use of standard representations based on the CF-1.6 DSGs.
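
As a rough illustration of the kind of transformation described in goals 1 through 3, the sketch below separates free-form header lines from a delimited data block, keeps the header information as global attributes alongside the data, and then writes the columns back out in another format (CSV). It is a minimal, standard-library-only sketch and not Rosetta's implementation; the sample text, the `parse_ascii` and `to_csv` helpers, and the attribute names are all hypothetical.

```python
import csv
import io

# Hypothetical sample: two free-form header lines followed by delimited data.
RAW_ASCII = """\
Station: Example Station
Instrument: met tower
time,air_temperature,wind_speed
2014-09-01T00:00:00Z,-12.5,4.1
2014-09-01T01:00:00Z,-13.0,3.8
"""


def parse_ascii(text, header_lines=2, delimiter=","):
    """Split header lines from the data block.

    Returns (attributes, variables): header "key: value" pairs as a dict,
    and the data as a column-oriented dict of lists, loosely mirroring the
    global-attribute / variable split of a netCDF file.
    """
    lines = text.splitlines()
    attributes = {}
    for line in lines[:header_lines]:
        key, _, value = line.partition(":")
        attributes[key.strip().lower()] = value.strip()

    reader = csv.reader(lines[header_lines:], delimiter=delimiter)
    names = next(reader)  # first data-block row holds the column names
    variables = {name: [] for name in names}
    for row in reader:
        for name, cell in zip(names, row):
            variables[name].append(cell)
    return attributes, variables


def to_csv(variables):
    """Write the column-oriented variables back out as CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    names = list(variables)
    writer.writerow(names)
    writer.writerows(zip(*(variables[name] for name in names)))
    return out.getvalue()


attributes, variables = parse_ascii(RAW_ASCII)
# Convert one column to numbers; the others stay as text in this sketch.
variables["air_temperature"] = [float(v) for v in variables["air_temperature"]]
```

Once the header information travels with the data as attributes, rather than sitting in a separate README, it can be read programmatically, which is the point made under goal 3.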

ACADIS Project

The ACADIS project is in its final year of funding. Current activities focus on addressing the NSF panel review recommendations, as well as extending the usefulness of Rosetta in the context of Arctic datasets. While the ACADIS project is winding down, development of Rosetta will continue.

Activities Since the Last Status Report

Basic Documentation

Transitioned to using Doxygen for user and developer documentation.

  • Dependencies, challenges, problems, and risks include:
    • Duplication of documentation effort is a risk, as ACADIS requests documentation specific to the project. However, most of these requests center on “branding” and can be handled easily through Doxygen’s use of CSS.

Accomplishments of Note

  • Added the ability to publish converted files directly to RAMADDA and the ACADIS Gateway
  • Live instance of Rosetta hosted at Unidata for testing
  • Released the Rosetta source code on GitHub
  • Started using Coverity static analysis on the Rosetta source code

Planned Activities

Ongoing Activities

While the ACADIS project is winding down, Unidata plans to continue along the following lines of development:

  • Increase the number of CF-1.6 discrete sampling geometries handled by Rosetta. For example, this will enable Rosetta to transform data from moving platforms and profilers. An Arctic-specific example would be data from an observation tower on a drifting iceberg.
  • Solicit examples from the community (hint, hint...that's you guys!)

New Activities

Over the next three months, we plan to organize or take part in the following:

  • Investigate CSV and xls(x) representations of the CF-1.6 discrete sampling geometries
  • Create infrastructure to collect use metrics for Rosetta
  • Transition from the Maven build system to Gradle
  • Continue documentation efforts, including the creation of screencasts for user documentation

Over the next twelve months, we plan to organize or take part in the following:

  • Enable Desktop (local) use of Rosetta
  • Incorporate TDS capabilities into Rosetta, allowing for TDS services (like point subsetting of grids) to easily be applied to local files

Beyond a one-year timeframe, we plan to organize or take part in the following:

  • We would like to extend Rosetta so that it can be used as a plugin for the THREDDS Data Server (TDS). One goal of this plugin would be to enable Rosetta to publish files to a TDS and to automatically generate the THREDDS configuration catalogs needed to serve the newly translated datasets.

Areas for Committee Feedback

Community Services is requesting your feedback on the following topics:

  1. We would love your input on where our priorities should lie among these New Activities. Let's chat! And, yes, please...send example ASCII data.

Relevant Metrics

We've received a handful of support questions regarding the availability of Rosetta, as well as requests for demonstrations.