COMPONENT BASED SOFTWARE FOR SCIENTIFIC APPLICATION DEVELOPMENT

John Caron
Unidata/UCAR, PO Box 3000, Boulder, CO 80307-3000
caron@ucar.edu


15th International Conference on Interactive Information and Processing Systems for Meteorology, Oceanography, and Hydrology, Jan. 1999, American Meteorological Society

1. ABSTRACT

Component-based software is the latest buzzword in the quest for software reuse. Unlike object-oriented software engineering, which accomplishes reuse through inheritance of implementation code, component-based software engineering focuses on standardizing the interfaces between independently developed units of software. One goal of such an approach is application development using commercial off-the-shelf components, analogous to hardware assembly using integrated circuits as building blocks. Another is to use interface-based design to partition software development into abstract components.

This paper reviews terminology and concepts of component-based software engineering. It is argued that JavaBean components with open source provide a way of packaging software for reuse among cooperating scientific application developers. Some preliminary conclusions are given based on Unidata’s experience using components in a meteorological data visualization prototype.

2. SOFTWARE COMPONENTS

Software components are units of software designed to interact with other independently developed components, and to be assembled by third parties into applications (see [Szyperski 98] for a complete introduction). Component-based software engineering (CBSE) is an evolution of object-oriented software engineering (OOSE). While both share the goal of software reusability, OOSE is an implementation methodology, while CBSE is an interface methodology. In CBSE the emphasis is on standardizing the interfaces between components, with no restrictions on implementation. CBSE is therefore closely related to module design, which likewise focuses on separating interface from implementation.

In OOSE, code reuse is accomplished through inheritance of implementation code. While OO languages also generally allow separation of implementation and interface, in practice the desire for code reuse often complicates data typing, and "inheritance breaks data encapsulation" [Snyder 86]. Partly for these reasons and partly due to the extra level of effort to prevent unnecessary dependencies, class hierarchies tend to be reused only within an application. Of course, class libraries, like procedural libraries, that are designed to be incorporated into larger applications are good examples of code reuse, although that reuse may have little to do with object orientation.

2.1 Component Frameworks

Components require a context within which to operate and interact. While terminology is not yet standard, we can use the term component architecture to mean the conceptual model of how components are defined and how they interoperate, while the term component infrastructure means the implementation of that architecture [Ben-Shaul 98]. We will use the term component environment to cover both aspects (Szyperski uses the term component world). There are currently three major component environments: the Object Management Group’s (OMG) CORBA, Microsoft’s COM, and Sun’s JavaBeans.

The services that a component needs from its environment are called its context dependencies. The mechanisms by which components interact are called wiring standards (or often plumbing), and are one of the most important services provided by a component environment.

A component describes itself through its contractual interface. The contract has two parts. The contract syntax specifies the data types of the interface methods’ arguments, which can be enforced automatically at compile or run time. The contract semantics describe the behavior of the component. Since semantics are always context specific, standards for component behavior are specific to an application domain and are called domain standards. Within a specific domain where the range of possible meanings can be limited, domain semantic models can be defined that allow semantic contracts to be (at least partially) automatically enforced. An example is OMG’s business object component architecture (BOCA), which uses the Component Definition Language (CDL) to describe component contracts with domain-specific concepts [Digre 98].
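
As a rough illustration of the two parts of a contract, the following Java sketch separates what the compiler checks (argument and return types) from a behavioral rule checked at run time by a wrapper. The Interpolator interface, its "result lies between the two inputs" rule, and all class names are assumptions invented for this example; they do not come from BOCA, CDL, or any Unidata code.

```java
class ContractSketch {
    /**
     * Contract syntax: three doubles in, one double out (checked by the compiler).
     * Contract semantics (hypothetical rule): for 0 <= t <= 1 the result
     * lies between a and b.
     */
    interface Interpolator {
        double interpolate(double a, double b, double t);
    }

    /** Satisfies both the syntactic and the semantic contract. */
    static class Linear implements Interpolator {
        public double interpolate(double a, double b, double t) {
            return a + t * (b - a);
        }
    }

    /** Type-correct, so it compiles, but it ignores the semantic rule. */
    static class Sum implements Interpolator {
        public double interpolate(double a, double b, double t) {
            return a + b;
        }
    }

    /** Wrapper that enforces the semantic rule at run time. */
    static class Checked implements Interpolator {
        // Small allowance for floating-point rounding in the delegate.
        private static final double TOLERANCE = 1e-9;
        private final Interpolator delegate;

        Checked(Interpolator delegate) {
            this.delegate = delegate;
        }

        public double interpolate(double a, double b, double t) {
            if (t < 0.0 || t > 1.0) {
                throw new IllegalArgumentException("t must be in [0, 1]: " + t);
            }
            double result = delegate.interpolate(a, b, t);
            double low = Math.min(a, b) - TOLERANCE;
            double high = Math.max(a, b) + TOLERANCE;
            if (Double.isNaN(result) || result < low || result > high) {
                throw new IllegalStateException("semantic contract violated: " + result);
            }
            return result;
        }
    }

    public static void main(String[] args) {
        Interpolator good = new Checked(new Linear());
        System.out.println(good.interpolate(0.0, 10.0, 0.25));

        Interpolator bad = new Checked(new Sum());
        try {
            bad.interpolate(2.0, 10.0, 0.5);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The check is only partial: it can reject a result that falls outside the allowed range, but it cannot confirm that the interpolation is the one the caller intended.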

The combination of domain standards and supporting infrastructure that facilitate independent component development is called a component framework, the most familiar example being GUI application builders. A component framework must specify both wiring and domain standards for the components it provides, that is, both syntactic and semantic descriptions of its interfaces. In practice, a component framework is an implementation of a domain standard, and so the two evolve together.

2.2 JavaBeans

JavaBeans is a set of conventions and supporting classes in Java to allow component-based software development. JavaBeans is a Java-only component framework, although there are ways to bridge to the COM and CORBA component environments.

Components must be developed independently yet still interoperate, so components cannot know about each other at compile time, but must be able to discover and communicate with each other at run time. The core of JavaBeans is two simple conventions that allow this behavior. First, JavaBean components (called beans) "advertise" their public properties through get/set methods that follow certain syntactic conventions. Supporting JavaBean classes provide introspection services that allow other beans to discover those properties at run time. Second, beans communicate by sending typed events that encapsulate information that the outside world needs to know. Other beans use introspection to discover what events a bean can fire, and then can register as listeners for those events.
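
The two conventions can be sketched with the standard java.beans classes. In the example below, StationBean and its stationId property are hypothetical names chosen for illustration; Introspector, PropertyDescriptor, EventSetDescriptor, and PropertyChangeSupport are part of the Java platform. This is a minimal sketch rather than a complete bean (it omits, for instance, serialization).

```java
import java.beans.BeanInfo;
import java.beans.EventSetDescriptor;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

class BeanSketch {
    /** Hypothetical bean with a single "stationId" property. */
    public static class StationBean {
        private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
        private String stationId = "";

        // Convention 1: a get/set pair advertises the property.
        public String getStationId() {
            return stationId;
        }

        public void setStationId(String newId) {
            String oldId = stationId;
            stationId = newId;
            // Convention 2: state changes are announced as typed events.
            changes.firePropertyChange("stationId", oldId, newId);
        }

        // An add/remove listener pair advertises the event set.
        public void addPropertyChangeListener(PropertyChangeListener listener) {
            changes.addPropertyChangeListener(listener);
        }

        public void removePropertyChangeListener(PropertyChangeListener listener) {
            changes.removePropertyChangeListener(listener);
        }
    }

    /** Introspect a class, ignoring what it inherits from Object. */
    static BeanInfo describe(Class<?> beanClass) {
        try {
            return Introspector.getBeanInfo(beanClass, Object.class);
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Property names discovered at run time. */
    static List<String> propertyNames(Class<?> beanClass) {
        List<String> names = new ArrayList<>();
        for (PropertyDescriptor pd : describe(beanClass).getPropertyDescriptors()) {
            names.add(pd.getName());
        }
        return names;
    }

    /** Event set names discovered at run time. */
    static List<String> eventNames(Class<?> beanClass) {
        List<String> names = new ArrayList<>();
        for (EventSetDescriptor ed : describe(beanClass).getEventSetDescriptors()) {
            names.add(ed.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("properties: " + propertyNames(StationBean.class));
        System.out.println("events: " + eventNames(StationBean.class));

        StationBean bean = new StationBean();
        bean.addPropertyChangeListener(event ->
            System.out.println(event.getPropertyName() + " is now " + event.getNewValue()));
        bean.setStationId("KDEN");
    }
}
```

The listener in main never refers to StationBean's internals; it depends only on the event type, which is what lets independently developed beans be wired together.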

An application builder connects components during an assembly phase. Three distinct roles emerge with respect to components: the compile-time developer, the assembly-time assembler, and the run-time user. The distinction between developer and assembler is a key factor in reusability. Commercial JavaBean assembly and development tools use introspection to present bean properties and events to the assembler, enabling visual (codeless) programming.

2.3 COTS vs. Abstract Components

Recently, component software development has started to take two differing emphases, depending on whether components are seen as commercial off-the-shelf (COTS) building blocks that enable application assembly, or whether components are seen as a design method for system partitioning [Brown 98]. COTS components imply black-box application assembly in which the source code is not usually available. The downsides of this approach are the ambiguities in interface semantics, dependency on third parties, legal issues, and other "not invented here" concerns. The essential difficulty is precisely describing the semantics of the component, while allowing enough flexibility to customize the component for a particular need.

Components are also used as a design methodology for creating software systems. By concentrating on component-based, coarse-grained interfaces, a system’s architecture is partitioned into independent subsystems, with context dependencies explicitly documented. Specific implementations can then be modified or substituted at will. In this design role components are called abstract components. This decomposition early in the design phase facilitates parallel implementation and reuse.

3. SCIENTIFIC APPLICATION DEVELOPMENT

The state-of-the-art in software development is large, monolithic, platform-dependent applications. In commercial software, common code is factored out depending on application breadth and difficulty of reimplementing, in other words, financial incentive. Database code was the first to be so factored, followed more recently by GUI development. Domain specific code is often the main asset of a vertical software developer, and so stays proprietary; within such a company common code slowly migrates out to shared classes/libraries with continued product iteration.

Scientific applications in the past have typically been coded in Fortran by scientists to implement mathematical algorithms. There has been a strong tradition of sharing source code between scientific colleagues, and of making such programs freely available. However, the lack of (even disdain for) software engineering knowledge or practice has limited the reusability and flexibility of scientific applications.

Wide availability of workstations and PCs with robust graphics capabilities has led to an explosion of scientific analysis and display software systems. These programs are typically developed by one or a few "heroic" scientist/programmers, with a much improved grasp of software engineering principles. Written in a mixture of Fortran and C, these programs also tend to be large, monolithic, and platform-dependent. Much effort is spent porting them to Unix variants, and it is unusual for a complex GUI-based application to run across Unix, Windows, and Mac computers. Like commercial software, these scientific systems eventually die because they reach a level of size and complexity where no one can easily understand how to fix problems or add new features.

3.1 Java

The Java platform coupled with new design methodologies using software components provides a fresh opportunity for code sharing among scientific programmers. The Java language offers new hope for platform independence, and equally important it provides standard class libraries for GUI development, networking, image manipulation, etc. JavaBean component conventions provide an accessible, standardized way of partitioning and packaging software for use by other scientific programmers. The problems of "black-box" component assembly are largely mitigated when source is available, as is typical in scientific programming.

3.2 Open Source Software

Availability of source code distinguishes publicly funded scientific development from commercial software. Recently, the "Open Source Software" movement has grown out of the Free Software Foundation (FSF) GNU project and the Apache web server and Linux OS development projects, to be partially embraced by commercial software companies such as Netscape, Corel, and IBM [Raymond]. Open Source partisans argue that source availability allows parallel bug fixes and development by a global Internet-connected army of programmers.

The similarities between the science community and the free software communities like Linux and FSF are noteworthy. Both are motivated by peer recognition for individual contributions to community knowledge and resources, which Eric Raymond calls a "post-scarcity gift culture". He argues that

Raymond refers to scientific research itself reusing the work of other researchers. Also, scientific applications, and even reusable scientific/math libraries (source and all), are given away as gifts to the scientific community. But to my knowledge, there has not been any significant collaborative scientific software development effort. Linux and other open source efforts have shown that collaborative software development can create sophisticated software of the highest quality. The scientific community should learn from the open source software movement what factors are needed to create an environment for shared, parallel software development.

4. EXPERIENCE AT UNIDATA

Unidata support of meteorological data acquisition for a diverse community of universities has led us to adopt Java as our development and deployment platform for future applications [Fulker 97]. We are currently prototyping a new generation of Java based meteorological data visualization applications, in part using JavaBean components.

The design of our initial prototype has been based on the simple but powerful idea of allowing "data provider" components to be developed independently from "data consumer" components such as analysis and visualization programs. If successful, this would allow n data sources and m data consumers to be implemented using (on the order of) n + m objects, rather than n x m objects. This requires abstract data types (ADTs) that describe the data that needs to be shared between the provider and consumer.

Our current goal is to create such ADTs in the form of Java interfaces that capture the content of the data sources at a level of abstraction that allows a small collection of data types to represent as wide a range of data sources as possible. Further, those data types should closely match the semantics of the data source, so that new data providers can be added easily and independently.
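
The n + m argument can be illustrated with a small Java sketch. Every name below (SampleSeries, SeriesConsumer, and the concrete classes) is hypothetical and far simpler than a real meteorological data type; the sketch only shows that two providers and two consumers written against one shared interface give four working pairings from four classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ProviderConsumerSketch {
    /** Hypothetical shared ADT: a named sequence of samples. */
    interface SampleSeries {
        String name();
        int length();
        double valueAt(int index);
    }

    /** A consumer sees only the ADT, never a concrete provider. */
    interface SeriesConsumer {
        String label();
        double consume(SampleSeries series);
    }

    /** Provider 1: values held in memory. */
    static class ArraySeries implements SampleSeries {
        private final String name;
        private final double[] values;

        ArraySeries(String name, double... values) {
            this.name = name;
            this.values = values.clone();
        }

        public String name() { return name; }
        public int length() { return values.length; }
        public double valueAt(int index) { return values[index]; }
    }

    /** Provider 2: values computed on demand (0, 1, 2, ...). */
    static class RampSeries implements SampleSeries {
        private final int length;

        RampSeries(int length) { this.length = length; }

        public String name() { return "ramp"; }
        public int length() { return length; }
        public double valueAt(int index) { return index; }
    }

    /** Consumer 1: arithmetic mean of the samples. */
    static class Mean implements SeriesConsumer {
        public String label() { return "mean"; }

        public double consume(SampleSeries series) {
            double sum = 0.0;
            for (int i = 0; i < series.length(); i++) {
                sum += series.valueAt(i);
            }
            return sum / series.length();
        }
    }

    /** Consumer 2: largest sample. */
    static class Max implements SeriesConsumer {
        public String label() { return "max"; }

        public double consume(SampleSeries series) {
            double max = Double.NEGATIVE_INFINITY;
            for (int i = 0; i < series.length(); i++) {
                max = Math.max(max, series.valueAt(i));
            }
            return max;
        }
    }

    /** Every consumer works with every provider: n x m pairings from n + m classes. */
    static List<String> runAll(List<SampleSeries> providers, List<SeriesConsumer> consumers) {
        List<String> results = new ArrayList<>();
        for (SampleSeries provider : providers) {
            for (SeriesConsumer consumer : consumers) {
                results.add(consumer.label() + "(" + provider.name() + ") = "
                        + consumer.consume(provider));
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<SampleSeries> providers =
                Arrays.asList(new ArraySeries("temps", 1.0, 2.0, 6.0), new RampSeries(4));
        List<SeriesConsumer> consumers = Arrays.asList(new Mean(), new Max());
        for (String line : runAll(providers, consumers)) {
            System.out.println(line);
        }
    }
}
```

Adding a third provider requires one new class and no changes to either consumer, which is the property the shared ADT is meant to buy.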

We also envision adding helper and wrapper classes to the data provider ADTs. The helper classes will factor common code, and the wrapper classes will allow us to create interfaces for data consumers independent of the data provider interfaces. We expect that matching data source and data provider semantics independently will be important. Much iteration will be needed before these interfaces will be correct.
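
One way to picture the helper and wrapper roles is the adapter-style sketch below. The names StationObservation, LabeledValue, Units, and CelsiusView are assumptions made up for this example and are not part of the prototype described here. The wrapper presents a provider-side object through a separate consumer-side interface, and the helper holds code that several wrappers could share.

```java
class WrapperSketch {
    /** Hypothetical provider-side ADT, kept close to the data source's own terms. */
    interface StationObservation {
        String stationId();
        double temperatureKelvin();
    }

    /** Hypothetical consumer-side interface, shaped for a display component. */
    interface LabeledValue {
        String label();
        double value();
    }

    /** Helper class: common code factored out of individual wrappers. */
    static final class Units {
        private Units() { }

        static double kelvinToCelsius(double kelvin) {
            return kelvin - 273.15;
        }
    }

    /** Wrapper class: adapts the provider interface to the consumer interface. */
    static class CelsiusView implements LabeledValue {
        private final StationObservation observation;

        CelsiusView(StationObservation observation) {
            this.observation = observation;
        }

        public String label() {
            return observation.stationId() + " (degC)";
        }

        public double value() {
            return Units.kelvinToCelsius(observation.temperatureKelvin());
        }
    }

    /** Convenience factory for a fixed observation, used only by this sketch. */
    static StationObservation observation(final String id, final double kelvin) {
        return new StationObservation() {
            public String stationId() { return id; }
            public double temperatureKelvin() { return kelvin; }
        };
    }

    public static void main(String[] args) {
        LabeledValue view = new CelsiusView(observation("KDEN", 300.0));
        System.out.println(view.label() + ": " + view.value());
    }
}
```

Because the consumer depends only on LabeledValue, the provider interface can change during iteration without touching consumer code, at the cost of one more layer to maintain.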

Preliminary conclusions from the first iteration of prototyping are:

1) Component software development is harder, not easier, than traditional software development. Components require at least one more level of abstraction in their design.

2) Semantic and syntactic descriptions of the data objects to be shared by the components must be specified. These data types must closely match the data types and concepts that application programmers are already familiar with. The best way to do this is iterative development alternating between data interface specification and implementation.

3) A component application framework is needed, consisting of well-defined data types and helper classes that factor common code and provide commonly needed services.

4) The attention of the community of scientific application developers needs to be drawn to CBSE. Developers need to be convinced of the overall usefulness of component software by prototypes and demonstration projects.

References

[Brown 98] Alan W. Brown and Kurt C. Wallnau, "The Current State of CBSE", IEEE Software, Sep. 1998.

[Ben-Shaul 98] Israel Ben-Shaul, James W. Gish, William Robinson, "An Integrated Network Component Architecture", IEEE Software, Sep. 1998.

[Digre 98] Tom Digre, "Business Object Component Architecture", IEEE Software, Sep. 1998.

[Fulker 97] David Fulker, Tom Yoksas, Russell Rew, Glenn Davis, "Unidata’s Path to Platform Independence", 14th IIPS, 1997.

[Raymond] Eric Raymond, Open Source Software

[Raymond 98] Eric Raymond, "Homesteading the Noosphere"

[Snyder 86] Alan Snyder, "Encapsulation and Inheritance in Object-Oriented Programming Languages", OOPSLA'86 Conference Proceedings, pages 38-45, Sep. 1986.

[Szyperski 98] Clemens Szyperski, Component Software, Addison-Wesley Longman, 1998.

 
 