Global Community Tools Project

Draft  01/22/2000
John Caron

Statement of Purpose

The purpose of this project is to build software tools that support global on-line communities of people, who form around specific interests or tasks and who require sophisticated tool support in order to accomplish their goals.

Background

Arguably the most important long-range consequence of the emerging Internet infrastructure is the formation of new kinds of on-line, or virtual, communities.  These virtual communities may differ from "real" communities in a number of important ways. Most obviously, virtual communities are freed from the limitations of geographic proximity and simultaneous presence. Because communication is computer-mediated, any number of software tools and affordances may be provided which allow forms of interaction and capabilities not possible in non-virtual groups.  The enabling technology for these tools is only a few years old, and so the possibilities of these virtual communities are historically new.

These are some of the characteristics of virtual groups that are important for this project:

Broadly, virtual groups need tools that support information organization and retrieval, discussion and argumentation, and negotiation and decision. Some of this technology is ready for wide-scale deployment, and some is in the research stage. This project aims to evaluate emerging technology in the context of the needs of on-line global communities, while simultaneously envisioning what kinds of communities are possible given the proposed tools. To accomplish this, we need equally the visions of the technologist, the sociologist, and the community activist.

The first step is to create a framework and a process where people can share ideas and produce real applications to be tested and iteratively improved. We need to attract a critical mass of users and software developers who are committed to the project's continuing success and evolution.

Direction

While the above vision may provide long-term motivation, this project will be defined at first by its technical direction and initial applications. We intend to produce working software that solves existing problems for real people, and this goal will pull our vision into concrete form.

Technical Basics

We will organize ourselves as an open source software project, and the software will be freely available under the GNU General Public License. Rather than a formal design methodology, we will use the process of user feedback and iterative development to move in practical steps from software that supports existing practices towards new features and methods.

In order to provide cross-platform support, we will use Java, which supports the Windows, Linux, and Solaris operating systems, as well as MacOS X and other UNIX platforms, with a single source codebase.  Java is a modern, object-oriented language with very broad, freely available libraries, which provide support for such features as email access, networking, database, XML, GUI, help systems, web access, etc.  This allows us to concentrate on building our applications.

Target Group

The initial target group will be existing online technical support communities, or tech-groups for short.  These groups have the following characteristics: Online tech-groups have become the primary way for software companies to answer questions about their products.  In these cases, the company's tech-support personnel answer questions in the public forum provided by the tech-group, instead of, or in addition to, volunteers. (See section 1 of [1] for more discussion of tech-groups.)

There are surprisingly few tools that take advantage of the valuable information contained in the message streams of tech-groups. The current practice is to provide a Web-based or Usenet interface to the message archives, sorted by thread or date. Sometimes a keyword search facility is provided. A significant improvement would be to add better searching and organizing capabilities for these groups.

The advantages of this approach are:

New Methods of Information Retrieval

Keyword-based retrieval requires a user to guess what words or phrases are likely to be in the documents they are searching for. Users who are unfamiliar with the domain vocabulary can struggle to find the information they need. A promising alternative is Latent Semantic Analysis (LSA), a type of vector-space information retrieval.  LSA uses a mathematical algorithm called singular value decomposition to compute statistical correlations between words in a document set. Queries can be matched to relevant documents even when the document and the query have no words in common. Preliminary tests have found this approach to be significantly better than keyword searches in matching natural language questions to previous questions and answers in tech-group archives [1].
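The matching idea described above can be illustrated with a small, self-contained Java sketch. It is not the FAQO implementation: the five-term vocabulary and four messages are made up, the class and method names are hypothetical, and a simple power iteration stands in for a real singular value decomposition routine. Queries and messages are compared by cosine similarity after projection onto the leading left singular vectors, which is one common convention among several.

```java
import java.util.Arrays;

/** Toy LSA sketch: rank-k term space from power iteration, then cosine matching. */
class LsaSketch {
    // Hypothetical term-by-message counts (rows = terms, columns = archived messages).
    // Rows: crash, freeze, hang, install, license.
    // Columns: "crash freeze", "freeze hang", "install license", "install".
    static final double[][] TERM_DOC = {
        {1, 0, 0, 0},
        {1, 1, 0, 0},
        {0, 1, 0, 0},
        {0, 0, 1, 1},
        {0, 0, 1, 0},
    };

    /** Approximates the k leading left singular vectors of a (eigenvectors of a * a^T). */
    static double[][] topSingularVectors(double[][] a, int k) {
        int n = a.length;
        double[][] m = new double[n][n];           // m = a * a^T (term-term co-occurrence)
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                for (int d = 0; d < a[i].length; d++)
                    m[i][j] += a[i][d] * a[j][d];

        double[][] basis = new double[k][];
        for (int c = 0; c < k; c++) {
            double[] v = new double[n];
            Arrays.fill(v, 1.0);                   // fixed start vector keeps the run deterministic
            double lambda = 0;
            for (int iter = 0; iter < 500; iter++) {
                double[] w = new double[n];
                for (int i = 0; i < n; i++)
                    for (int j = 0; j < n; j++)
                        w[i] += m[i][j] * v[j];
                lambda = norm(w);                  // converges to the current leading eigenvalue
                for (int i = 0; i < n; i++) v[i] = w[i] / lambda;
            }
            basis[c] = v;
            for (int i = 0; i < n; i++)            // deflate so the next pass finds the next vector
                for (int j = 0; j < n; j++)
                    m[i][j] -= lambda * v[i] * v[j];
        }
        return basis;
    }

    /** Maps a term-count vector into the k-dimensional latent space. */
    static double[] project(double[][] basis, double[] termVector) {
        double[] out = new double[basis.length];
        for (int c = 0; c < basis.length; c++)
            for (int i = 0; i < termVector.length; i++)
                out[c] += basis[c][i] * termVector[i];
        return out;
    }

    static double norm(double[] x) {
        double sum = 0;
        for (double value : x) sum += value * value;
        return Math.sqrt(sum);
    }

    static double cosine(double[] x, double[] y) {
        double dot = 0;
        for (int i = 0; i < x.length; i++) dot += x[i] * y[i];
        double denom = norm(x) * norm(y);
        return denom == 0 ? 0 : dot / denom;
    }

    /** Cosine similarity between a query and message number doc in the rank-k space. */
    static double similarity(double[] queryTerms, int doc, int k) {
        double[][] basis = topSingularVectors(TERM_DOC, k);
        double[] docTerms = new double[TERM_DOC.length];
        for (int i = 0; i < docTerms.length; i++) docTerms[i] = TERM_DOC[i][doc];
        return cosine(project(basis, queryTerms), project(basis, docTerms));
    }

    public static void main(String[] args) {
        double[] query = {1, 0, 0, 0, 0};          // the single word "crash"
        for (int doc = 0; doc < TERM_DOC[0].length; doc++)
            System.out.printf("message %d: %.3f%n", doc, similarity(query, doc, 2));
    }
}
```

In this toy archive the query "crash" shares no words with message 1 ("freeze hang"), yet the two should score close to 1 in the two-dimensional space, because "freeze" co-occurs with both "crash" and "hang". The install and license messages should score close to 0. A real archive would need term weighting, a much larger rank, and a proper SVD library.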

There is much current research being done in information retrieval, document categorization, user interface, information display, etc.  Our project may be able to provide an environment for testing new research.

While Artificial Intelligence (AI) techniques may eventually provide important capabilities, our emphasis in this project will be tools that assist humans, sometimes called Intelligence Augmentation (IA). This kind of software keeps humans "in the loop", with an emphasis on the efficient use of people's time.
 

Initial Application

The first application is provisionally called the Frequently Asked Question Organizer (FAQO) (other suggestions welcome). Alpha versions of this software exist and can be downloaded from here. The following is an overview of the initial beta release, scheduled for June 2001.

The most prominent feature is the searching of message archives using LSA. This feature should provide enough additional benefit over existing practices to offset the cost of installing and learning a new application. The other important feature will be the creation and maintenance of the message archives. The initial release will consist of a server and three client applications.  The clients are all pure Java implementations, while the server has some native code. Each application supports a specific role that a user might play.

FAQuery  This is a simplified version for people who want to just ask a question. Its GUI must be simple and intuitive; it probably needs an implementation that runs in a web browser.

FAQOTech  This is for "tech-support" personnel who answer questions. It connects to an IMAP server to get incoming messages, and allows the user to efficiently query message archives and respond by email. It can also suggest changes to the database.  Any number of people can play this role simultaneously.

FAQOwner  This is for the owner of the database, who decides what messages are saved in it, and approves any changes to it. Only one person per database has this role, but in the future we will add support for multiple owners in a relationship of trust.
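The division of responsibility among the three roles above can be sketched in a few lines of Java. This is only an illustration under assumptions: the class, enum, and method names are hypothetical and are not taken from the FAQO code, and the archive is reduced to an in-memory list of strings. The point is the workflow, in which tech-support users suggest changes and only the owner commits them.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the suggest-then-approve workflow between roles. */
class FaqArchiveSketch {
    // One value per client application: FAQuery, FAQOTech, FAQOwner.
    enum Role { ASKER, TECH, OWNER }

    private final List<String> archived = new ArrayList<>();   // messages saved in the database
    private final List<String> pending = new ArrayList<>();    // suggestions awaiting the owner

    /** Tech-support users (and the owner) may suggest a message; nothing is saved yet. */
    void suggest(Role role, String message) {
        if (role == Role.ASKER) {
            throw new IllegalArgumentException("askers may only query the archive");
        }
        pending.add(message);
    }

    /** Only the owner decides which suggested messages are saved. */
    void approveAll(Role role) {
        if (role != Role.OWNER) {
            throw new IllegalArgumentException("only the owner approves changes");
        }
        archived.addAll(pending);
        pending.clear();
    }

    /** Every role may read the archive; a copy is returned so callers cannot modify it. */
    List<String> archivedMessages() {
        return new ArrayList<>(archived);
    }

    public static void main(String[] args) {
        FaqArchiveSketch archive = new FaqArchiveSketch();
        archive.suggest(Role.TECH, "Q: install fails on Solaris / A: set JAVA_HOME");
        System.out.println("before approval: " + archive.archivedMessages().size());
        archive.approveAll(Role.OWNER);
        System.out.println("after approval: " + archive.archivedMessages().size());
    }
}
```

Keeping suggestions in a separate pending list is one simple way to let any number of tech-support users work at once while a single owner retains control of what is saved.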

Doubts

References and Further Reading

Journal of Computer-Mediated Communication

Peter Kollock and Marc Smith, Managing the Virtual Commons: Cooperation and Conflict in Computer Communities. Pp. 109-128 in Computer-Mediated Communication: Linguistic, Social, and Cross-Cultural Perspectives, Amsterdam: John Benjamins, 1996.

Previous papers

John Caron, Experiments with LSA Scoring: Optimal Rank and Basis, SIAM Computational IR Workshop. October 2000.

[1] Applying LSA to Online Customer Support: A Trial Study. John Caron, Unpublished Master's Thesis. May 2000.

John Caron, Design for the FAQ Organizer Application, Dec 1999.

John Caron, Wide Area Collaboration: A Proposed Application. April 1998.

Other Question/Answering systems

Ackerman, Mark S. Augmenting the Organizational Memory: A Field Study of Answer Garden. CSCW'94.

Ackerman, Mark S. and McDonald, David W., Answer Garden 2: Merging Organizational Memory with Collaborative Help. CSCW'96.

Budzik, J. and K. J. Hammond (1999). Q&A: A System for the Capture, Organization and Reuse of Expertise, Proceedings of the Sixty-second Annual Meeting of the American Society for Information Science, Information Today, Inc., Medford, NJ, 1999.
 
 
 
 
 
 