"DRAFT"
Historical Data Access
Introduction
Access to, and application of, archived data is an important aspect of teaching and
research activities in academic institutions throughout the nation. With limited
funds to support access to historical data sets, institutions have been soliciting
assistance from the UCAR/Unidata Program Center (UPC) to seek methods to coordinate
the disparate activities taking place within the geoscience community and to explore
reduced-cost (preferably free) Internet access to these data on behalf of academic
institutions. Reviewers of the Unidata five-year proposal to the National
Science Foundation (NSF) raised this issue as an area that Unidata should
investigate on behalf of the university community.
Methodology
Because the price of disk storage has been decreasing, many institutions have
created their own archives, based on their users' needs. The task of coordinating
these distributed community archives and providing information on the feasibility
of use for the broader Unidata university community will be explored by the
UPC. An incremental and iterative approach will be used to provide opportunities
for feedback from interested community members. Status reports will be provided
to the UPC, Unidata Policy Committee, Unidata Users Committee, and other collaborators
as progress is made.
Currently Existing Resources
Following are some examples of the widespread data resources available through
various means today:
- universities and other groups, e.g., University of Washington Atmospheric
Science; University of Illinois Atmospheric Science; UC-Santa Barbara/Alexandria
Project; Cooperative Program for Operational Meteorology, Education and
Training (COMET); Program for Advancement of Geoscience Education (PAGE); and
Research Applications Program (RAP), are beginning to create their own
historical archives, based on their own users' needs
- government archive centers are customizing their methods of access in an
attempt to keep up with user needs, but are concerned about loss of funding
for their centers
- some sites are making select data sets available to members of their specialized
community, e.g., Distributed Oceanographic Data System
(DODS), Incorporated Research Institutions for Seismology (IRIS)
- Distributed Active Archive Centers (DAACs), an element of NASA's Earth Observing
System Data and Information System (EOSDIS), are providing Earth Observing
System data, e.g., the Goddard DAAC
- meteorological case studies are being provided through a WWW interface,
with browsing and downloading options, using the CODIAC Data Base Management
System
- University of Wisconsin, Space Science & Engineering Center (SSEC) is offering
satellite imagery at a reduced price to university users and collecting user
input toward creating a service using the Abstract Data Distribution Environment
(ADDE) approach for McIDAS data files
- Unidata and NCAR's Scientific Computing Division (SCD) have begun collaborative discussions
to make data sets available from the Data Support Section (DSS) of SCD. In
May 1999, a brief report of activities and progress was provided to the Unidata
Policy Committee.
Initial Steps
Preliminary steps can be taken to answer the needs of the Unidata community.
Some of these exist but need a more structured approach.
- Coordination of data archive center holdings and university data archive
activities by providing:
- an "archive" email list for interested community members to exchange
information
- a "needdata" email list, formed several years ago, which has been
a popular method of requesting data from other participating community
members
- Web-based information on the data centers, including:
- links to data archive centers
- single URL for all archived datasets
- types of data provided with format, frequency, size of data holding
- fee structure associated with the data (free access will be explored)
- restrictions on data access and sharing
- Web-based information on the university data archives
- type of data available
- mechanism for making the data available
- restrictions on data access and sharing
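The Web-based information listed above could be held as one simple record per archive. Following is a minimal sketch in Python, offered only as an illustration and not as a proposed design. The names ArchiveEntry and find_entries, every field name, and the example.org URLs are hypothetical placeholders, not part of any existing Unidata system.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ArchiveEntry:
    """One hypothetical catalog record for a data center or university archive."""
    name: str
    url: str                       # link to the archive (placeholder values below)
    data_type: str                 # type of data provided
    data_format: str               # format of the data, e.g. "netCDF"
    frequency: str                 # how often the data are produced
    size_gb: Optional[float]       # size of the data holding, None if unknown
    fee: Optional[str] = None      # fee structure; None is assumed to mean free
    restrictions: List[str] = field(default_factory=list)  # access/sharing limits


def find_entries(catalog, data_type, free_only=False):
    """Return entries matching a data type, optionally only those with no fee."""
    wanted = data_type.lower()
    return [
        entry for entry in catalog
        if entry.data_type.lower() == wanted
        and (not free_only or entry.fee is None)
    ]


# Invented sample records, for illustration only.
catalog = [
    ArchiveEntry("Example Data Center", "https://example.org/center",
                 "surface observations", "netCDF", "hourly", 120.0,
                 fee="per-request charge",
                 restrictions=["no redistribution"]),
    ArchiveEntry("Example University Archive", "https://example.org/univ",
                 "surface observations", "netCDF", "hourly", 15.5),
]

free = find_entries(catalog, "Surface Observations", free_only=True)
print([entry.name for entry in free])
```

Keeping the fee structure and restrictions in the same record as the data type and format is one way to let a single query answer both "what is available" and "what can be shared"; other layouts would serve equally well.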
Collaboration
Information is being gathered on other methods of data access and distribution.
These methods will be evaluated, with the intention of establishing coherent
procedures and guidelines.
- Work with UCAR/NCAR Scientific Computing Division (SCD) to provide access
to the retrospective data archives maintained by NCAR/SCD for use in the research
and education community.
- define a data set of broad interest to the community through coordination
with the Unidata Users Committee
- needs to be an evolutionary process
- surveys or similar methods may be considered
- coordinate with community members interested in seeking suitable methods
of access and distribution of archived data
- community suggestions are being explored
- create orderly steps to make the data sets easily available
- learn through initial steps
- move to larger and more robust data sets
- Refine methods of access and distribution for external projects, e.g. DODS,
Case Studies
- work with collaborators to enhance ease of use for case studies and
DODS
- user applications
- formats
- on-the-fly decoders
- data transfer methodology
- work on deployment of DODS servers (e.g. SCD/CGD Climate Simulation
Model), GRIB, etc.
- expand to meteorological community
- expand to other geoscience groups
- expand to other scientific groups using supported data formats
(e.g., netCDF, HDF)
Metadata
Following are some considerations to fold into the tasks, allowing for ease
of use for the diverse community.
- universal search engine, individual search engine, Web-based interface
for searching each archive
- coordinate easy searching for types of data
- metadata provided for each category
- information on how data are used when made available (with Unidata systems,
others?)
- how others can contribute to the collection
- based on the Unidata community model of making servers available to
other community members
- metadata provided for data sets
- cataloging/locator servers coordination
- decoders - translators (usable for standard formats)
- if a good community model already exists, adopt it!
Summary
Technology is driving our future. "According to a recent survey conducted
by UCLA, 67 percent of professors are regularly stressed by keeping up with
emerging technology, compared with 62 percent stressed by teaching loads, and
50 percent stressed by research or publishing pressures. Researchers say fear
may be preventing professors from using new technology. Only 35 percent of professors
use the Internet for research purposes, while 38 percent use technology to prepare
presentations for classes. The survey results indicate that colleges should
work to improve instructors' computer skills in order to meet the needs of students
who have grown up using computers and are comfortable using new technologies."
(Associated Press 08/30/99)
Unidata and the Unidata university community have an opportunity to play an
important role in defining requirements for historical data needs for classroom,
laboratory and research environments today. Many sites are ahead of others and
have implemented FTP sites and other processes to make data available. A coordinated
approach across the community should benefit a broader group of users
and lead to a higher comfort level for professors, researchers and students
throughout the community.
Prepared by Linda Miller
External Liaison
Unidata Program Center
Boulder, Colorado
Questions or comments can be sent to Linda Miller