1. Technical Field
The present invention relates generally to computer processing systems and, in particular, to a method for generating a software module from multiple software modules based on extraction and composition.
2. Background Description
Separation of concerns is at the core of software engineering. In its most general form, separation of concerns refers to the ability to identify, encapsulate, and manipulate only those parts of software that are relevant to a particular concept, goal, or purpose. Concerns are the primary motivation for organizing and decomposing software into manageable and comprehensible parts. Separation of concerns is further described by David L. Parnas, in “On the Criteria To Be Used in Decomposing Systems into Modules”, Communications of the ACM, Vol. 15, No. 12, December 1972.
Many different kinds, or dimensions, of concerns may be relevant to different developers in different roles, or at different stages of the software lifecycle. For example, the prevalent kind of concern in object-oriented programming is data or class; each concern in this dimension is a data type defined and encapsulated by a class. Features, like printing, persistence, and display capabilities, are also common concerns, as are aspects, like concurrency control and distribution, roles, viewpoints, variants, and configurations. Separation of concerns involves decomposition of software according to one or more dimensions of concern. Features, aspects, roles, and viewpoints are respectively described by: Fuggetta et al., in “Feature Engineering”, Proceedings of the 9th International Workshop on Software Specification and Design, pp. 162-64, April 1998; Irwin et al., in “Aspect-Oriented Programming”, Proceedings of the European Conference on Object-Oriented Programming (ECOOP), Finland, Springer-Verlag, LNCS 1241, June 1997; Andersen et al., in “System Design by Composing Structures of Interacting Objects”, Proceedings of the European Conference on Object-Oriented Programming (ECOOP), June/July 1992; and Finkelstein et al., in “A Framework for Expressing the Relationships Between Multiple Views in Requirements Specifications”, Transactions on Software Engineering, Vol. 20, No. 10, pp. 760-773, Oct. 1994.
Separation of concerns has been hypothesized to reduce software complexity and improve comprehensibility; promote traceability within and across artifacts and throughout the lifecycle; limit the impact of change, facilitating evolution and non-invasive adaptation and customization; facilitate reuse; and simplify component integration.
These goals, while laudable and important, have not yet been achieved in practice. This is because the set of relevant concerns varies over time and is context-sensitive. Different development activities, stages of the software lifecycle, developers, and roles often involve concerns of dramatically different kinds. One concern may promote some goals and activities, while impeding others. Thus, any criterion for decomposition will be appropriate for some contexts, but not for all. Further, multiple kinds of concerns may be simultaneously relevant, and they may overlap and interact, as features and classes do. Therefore, different concerns and modularizations are needed for different purposes. The different purposes may differently implicate class, feature, viewpoint, aspect, role, variant, or other criterion.
These considerations imply the need for “multi-dimensional separation of concerns”. Developers must be able to identify, encapsulate, modularize, and manipulate multiple dimensions of concern simultaneously, and to introduce new concerns and dimensions at any point during the software lifecycle, without suffering the effects of invasive modification and rearchitecture. However, even modern languages and methodologies suffer from a problem that has been referred to as the “tyranny of the dominant decomposition”. That is, the languages and methodologies permit the separation and encapsulation of only one kind of concern at a time. A body of software can generally be decomposed in only one way, just as a typical document is divided into sections and subsections in only one way. This one decomposition is dominant, and often excludes any other form of decomposition. The “tyranny of the dominant decomposition” is further described by Harrison et al., in “N Degrees of Separation: Multi-Dimensional Separation of Concerns”, Proceedings of the 21st International Conference on Software Engineering, pp. 107-19, May 1999.
Examples of tyrant decompositions are classes (in object-oriented languages), functions (in functional languages), and rules (in rule-based systems). Therefore, it is impossible to encapsulate and manipulate, for example, features in the object-oriented paradigm, or objects in rule-based systems. Accordingly, it is impossible to obtain the benefits of different decomposition dimensions throughout the software lifecycle. Developers of an artifact are forced to commit to one (or at most a few) dominant dimension(s) early in the development of that artifact, and changing their choice can often have catastrophic consequences for the existing artifact. Further, since artifact languages often constrain the choice of dominant dimension (e.g., it must be class in object-oriented software), different artifacts (e.g., requirements and design documents) might be forced to use different decompositions, thus obscuring the relationships between them.
A particular decomposition of a body of software is a set of “modules” into which the software is divided. Modules can be nested within one another, and can be related in other ways. The intent is that each module encapsulate some particular concern. That is, all the software, and only the software, that pertains to that concern is contained within the module. Systems are built by selecting and composing modules. For example, modules in the JAVA programming language (henceforth referred to as “JAVA”) are packages, classes and interfaces. Classes and interfaces enforce “data abstraction”. Each is concerned with a particular data structure, and encapsulates all internal details of that data structure. All code is written within classes and interfaces, which in turn are grouped into packages. A system is built by selecting the packages, classes and interfaces to include.
Choice of decomposition, which implies choice of modules, is important because it determines which concerns are encapsulated within modules. These concerns can be more easily understood, because all the software pertaining to them is localized in the module. Moreover, these concerns can be modified with reduced impact, because changes are usually localized within the module. Further, these concerns can be used as the basis for system configuration. That is, these concerns can be selected for inclusion in, or exclusion from, systems. For example, in a standard JAVA system, packages, classes and interfaces can be included or excluded. In a system decomposed by feature, features can be included or excluded. However, since typical features involve portions of multiple classes, features cannot be used as the basis for configuration in standard JAVA, and classes could not be used as the basis for configuration in a feature-based decomposition. Lastly, these concerns can be used as a unit of reuse. However, concerns that are not encapsulated within modules typically cut across many modules. Such concerns are not localized and, therefore, do not enjoy the above described benefits.
The tyranny of the dominant decomposition forces a single decomposition on a body of software, thereby conferring benefits on a particular kind of concern at the expense of other concerns. Currently, it is believed that the tyranny of the dominant decomposition is the single most significant cause of the failure to achieve many of the expected benefits of separation of concerns.
3. Problems with the State of the Art
A brief description will now be given of the prior art that is concerned with function extraction. In some restructuring tools, the user is allowed to select some contiguous code and request that the code be extracted and made into a function. A call to this function is substituted at the original location. The function extraction method performs an analysis to find any variables that are referred to, but are not declared, within the selected code or globally (i.e., free variables), and creates parameter declarations for these variables. However, as noted above, this approach suffers the disadvantage of requiring that the selected code be a contiguous chunk coming from a single module. Thus, the approach does not address the problem of modularizing code that is scattered across modules. Restructuring tools are further described by Griswold et al., in “Automated Assistance for Program Restructuring”, Transactions on Software Engineering Methodology, ACM, July 1993; and W. G. Griswold, in “Program Restructuring as an Aid to Software Maintenance”, Ph.D. Thesis, Technical Report 91-08-04, Department of Computer Science and Engineering, University of Washington, July 1991.
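The free-variable analysis described above can be illustrated with a minimal sketch. It is not taken from any of the cited restructuring tools. The function names `free_variables` and `extract_function` are hypothetical, Python built-ins stand in for globally declared names, and the fragment is assumed to consist of simple top-level statements (no loops, augmented assignments, or nested scopes).

```python
import ast
import builtins


def free_variables(fragment: str) -> list[str]:
    """Return names the fragment reads before assigning them itself."""
    assigned: set[str] = set()
    free: list[str] = []
    for statement in ast.parse(fragment).body:
        names = [n for n in ast.walk(statement) if isinstance(n, ast.Name)]
        # Visit names in source order so parameters come out in reading order.
        names.sort(key=lambda n: (n.lineno, n.col_offset))
        for node in names:
            if (isinstance(node.ctx, ast.Load)
                    and node.id not in assigned
                    and not hasattr(builtins, node.id)  # stand-in for "declared globally"
                    and node.id not in free):
                free.append(node.id)
        # Names stored by this statement are local from the next statement on.
        assigned.update(n.id for n in names if isinstance(n.ctx, ast.Store))
    return free


def extract_function(fragment: str, name: str) -> tuple[str, str]:
    """Build a function definition from the fragment and the call that replaces it."""
    params = ", ".join(free_variables(fragment))
    body = "\n".join("    " + line for line in fragment.strip().splitlines())
    return f"def {name}({params}):\n{body}", f"{name}({params})"
```

Because the sketch takes a single contiguous fragment as input, it has the limitation noted above: it cannot gather code that is scattered across several modules.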
A brief description will now be given of the prior art that is concerned with program slicing. Tools have been built that allow the user to select some variables and some location within a program. The slicing method determines all statements that contribute to the value of those variables at that location. Slicing typically involves a single module as input. However, inter-procedural slicing involves multiple input modules. The result of slicing is typically identification of the relevant statements, but not extraction into a semantically correct module that can later be composed with other modules. Thus, composition of such slices is an unsolved problem. Program slicing is further described by M. Weiser, in “Program Slicing”, IEEE Transactions on Software Engineering, SE-10(4): 352-357, July 1984.
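As a rough illustration of the slicing idea, the following sketch computes a backward slice over a straight-line program. It is a simplification under stated assumptions, not Weiser's algorithm: there is no control flow, each statement is modeled as a hypothetical pair of an assigned variable and the variables it reads, and `backward_slice` is an illustrative name.

```python
def backward_slice(statements, criterion_vars, location):
    """Return indices of statements that contribute to criterion_vars at location.

    statements: list of (assigned_variable, variables_read) pairs, in program order.
    location:   index of the program point; statements[0:location] precede it.
    """
    relevant = set(criterion_vars)
    kept = []
    for index in range(location - 1, -1, -1):  # walk backwards from the location
        target, uses = statements[index]
        if target in relevant:
            # This statement defines a relevant variable: keep it, and make
            # the variables it reads relevant in its place.
            relevant.discard(target)
            relevant |= set(uses)
            kept.append(index)
    return sorted(kept)


program = [
    ("a", []),                   # 0: a = read()
    ("b", []),                   # 1: b = read()
    ("total", ["a"]),            # 2: total = a + 1
    ("prod", ["b"]),             # 3: prod = b * 2
    ("total", ["total", "a"]),   # 4: total = total + a
]
slice_for_total = backward_slice(program, {"total"}, location=len(program))
```

The result is only a list of statement indices, which reflects the limitation described above: the slice identifies relevant statements but does not package them as a module that can be composed with others.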
A brief description will now be given of the prior art that is concerned with compaction. The user selects some program elements (e.g., all code written by the user, but not library functions). The compaction method also extracts all other code, including library code, upon which the selected code depends, but not extraneous code (or, at the least, it reduces the amount of extraneous code that remains in the program). The result is a complete program whose executions are the same as those of the original program. Compaction cannot isolate concerns in the required manner, because typically the concerns to be separated are interrelated. Compaction is designed to bring in all related software to produce a complete, runnable program, so it will not accomplish separation to any extent. Compaction is further described by Laffra et al., in “Practical Experience with an Application Extractor for JAVA”, Proceedings of the Fourteenth Annual Conference on Object-Oriented Programming System, Languages, and Applications (OOPSLA '99), Denver, Colo., pp. 292-305, Nov. 1999.
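The dependency closure at the heart of compaction can be sketched as a reachability computation. This is an illustrative simplification rather than the cited extractor: the dependency graph, the element names, and the function `compact` are all hypothetical.

```python
def compact(dependencies, selected):
    """Return the selected elements plus everything they transitively depend on."""
    keep = set()
    worklist = list(selected)
    while worklist:
        element = worklist.pop()
        if element in keep:
            continue
        keep.add(element)
        worklist.extend(dependencies.get(element, ()))
    return keep


# Hypothetical dependency graph: element -> elements it depends on.
dependencies = {
    "App.main": ["App.render", "lib.format"],
    "App.render": ["lib.draw"],
    "lib.draw": ["lib.format"],
    "lib.format": [],
    "lib.unused": ["lib.draw"],
}
kept = compact(dependencies, ["App.main"])
```

In this sketch, selecting only `App.render` still pulls in `lib.draw` and `lib.format`. That is the behavior described above: compaction follows every dependency to keep the program runnable, so interrelated concerns are not separated.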
A brief description will now be given of the prior art that is concerned with linking. Separate functions, classes or similar modules can be “composed” by linking them together with standard program linkers. This is the oldest and most widespread composition technology. However, linking does not involve extracting new modules from existing software.
There are various modern approaches to separation of concerns, such as, for example, subject-oriented programming and aspect-oriented programming, that allow “cross-cutting modules” to be written that collect relevant portions of software that would normally be scattered across many programming language modules. These permit, for example, the various members of various classes that together implement a feature to be coded into a single feature module. Composition (sometimes referred to as “weaving”) is used to combine these cross-cutting modules into regular programming language modules, which can then be executed. However, these approaches require the software to be decomposed “when originally written” into the desired cross-cutting modules. The approaches do not support on-demand remodularization. That is, the approaches do not support the ability to extract new modules from existing software to support a new decomposition, without editing the existing software. Subject-oriented programming and aspect-oriented programming are respectively described by: Harrison et al., in “Subject-Oriented Programming (a critique of pure objects)”, Proceedings of the Conference on Object-Oriented Programming: Systems, Languages, and Applications, ACM, pp. 411-28, Sept. 1993; and Irwin et al., in “Aspect-Oriented Programming”, Proceedings of the European Conference on Object-Oriented Programming (ECOOP), Finland, Springer-Verlag, LNCS 1241, June 1997.
Thus, there is a need for a method for generating a software module from multiple software modules which overcomes the above problems of the prior art.
The present invention is directed to a method for generating a software module from multiple software modules based on extraction and composition. The invention allows for “on-demand remodularization”. That is, the invention allows a body of software that was written with one (or more) particular decompositions to be decomposed in different ways, without rewriting the software and, in some cases, without even recompiling the software. Modules making up these new decompositions can then be used for composing systems. This approach can be applied to any software development paradigms and languages.
According to a first aspect of the invention, there is provided a method for generating a software module based upon elements from multiple software modules. The method includes the step of extracting a set of elements from the multiple software modules based upon at least one extraction criterion. Any elements in the set that violate at least one correctness and completeness criterion are identified. The violating elements are automatically brought into compliance with the correctness and completeness criterion. A single software module is generated that contains the set of elements.
According to a second aspect of the invention, the step of bringing the violating elements into compliance includes at least one of the steps of: adding at least one element to the set of elements; modifying at least one of the violating elements; and modifying at least one of the non-violating elements.
According to a third aspect of the invention, the extraction criterion is one of predefined and specified by a user.
According to a fourth aspect of the invention, the correctness and completeness criterion is one of predefined and specified by a user.
According to a fifth aspect of the invention, the correctness and completeness criterion corresponds to a declarative correctness and completeness criterion.
According to a sixth aspect of the invention, the declarative correctness and completeness criterion includes a specification that a given element referenced in the set must also be declared in the set in a manner that is compatible with all uses of the given element.
According to a seventh aspect of the invention, the extraction criterion identifies first elements that are to be extracted, and second elements that are not to be extracted in the event that such second elements are part of said first elements.
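One way to picture such an extraction criterion is as a pair of lists: elements to extract, and contained elements to leave behind. The sketch below is an assumption made for illustration only. Elements are represented by dotted names, containment is approximated by a name prefix, and `is_part_of` and `extract` are hypothetical helpers.

```python
def is_part_of(element: str, container: str) -> bool:
    """True if element is the container itself or is nested inside it."""
    return element == container or element.startswith(container + ".")


def extract(elements, include, exclude):
    """Select elements covered by `include`, minus those covered by `exclude`."""
    return [
        element
        for element in elements
        if any(is_part_of(element, first) for first in include)
        and not any(is_part_of(element, second) for second in exclude)
    ]


elements = [
    "Employee",
    "Employee.name",
    "Employee.pay",
    "Employee.print",
    "Manager.pay",
]
# First elements: everything in Employee. Second elements: its print member.
selected = extract(elements, include=["Employee"], exclude=["Employee.print"])
```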
According to an eighth aspect of the invention, each of the software modules includes software code or software design artifacts.
According to a ninth aspect of the invention, the extracting step further includes the steps of: classifying the elements in the multiple software modules according to concerns the elements pertain to; representing the concerns by a multi-dimensional space, wherein each dimension represents a type of concern, each coordinate on a dimension represents a concern of that type, and each point in the space represents an element; and representing the extraction criterion in terms of the multi-dimensional space.
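The multi-dimensional space can be sketched as a mapping from each element to its coordinate on every dimension, with an extraction criterion expressed as accepted coordinates per dimension. The dimensions, concerns, element names, and the `select` function below are hypothetical and chosen only to illustrate the representation.

```python
# Each element is a point: it has one coordinate (concern) per dimension.
concern_space = {
    "Employee.name": {"class": "Employee", "feature": "core"},
    "Employee.pay": {"class": "Employee", "feature": "payroll"},
    "Manager.pay": {"class": "Manager", "feature": "payroll"},
    "Manager.print": {"class": "Manager", "feature": "printing"},
}


def select(space, criterion):
    """Return elements whose coordinates satisfy the criterion.

    criterion maps a dimension to the set of accepted concerns on it;
    dimensions not mentioned in the criterion are left unconstrained.
    """
    return sorted(
        name
        for name, point in space.items()
        if all(point.get(dimension) in accepted
               for dimension, accepted in criterion.items())
    )


payroll_feature = select(concern_space, {"feature": {"payroll"}})
manager_class = select(concern_space, {"class": {"Manager"}})
```

In this sketch the same elements can be sliced by class or by feature on demand, since neither dimension is privileged in the representation.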
According to a tenth aspect of the invention, there is provided a method for generating a software module based upon elements from multiple software modules. The method includes the step of extracting a plurality of sets of elements from the multiple software modules based upon at least one extraction criterion. Any elements in the sets that violate at least one correctness and completeness criterion are identified. The violating elements are automatically brought into compliance with the correctness and completeness criterion. A plurality of single software modules is generated, wherein each of the single software modules contains one of the sets of elements. The plurality of single software modules are composed to form a final, single software module.
According to an eleventh aspect of the invention, the composing step includes the step of composing the plurality of single software modules with one another, with other software modules, or any combination thereof.
According to a twelfth aspect of the invention, the composing step includes the step of determining correspondence between the elements in the plurality of single software modules.
According to a thirteenth aspect of the invention, the composing step further includes the step of combining corresponding elements into the final, single software module.
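The composing steps of the twelfth and thirteenth aspects can be sketched as follows. The sketch assumes, purely for illustration, that correspondence is established by matching class and member names, that a body of `None` marks a declaration without an implementation, and that two different implementations of the same member are reported as a conflict. The source does not prescribe these rules, and `compose` is a hypothetical name.

```python
def compose(modules):
    """Combine modules of the form {class_name: {member_name: body_or_None}}."""
    result = {}
    for module in modules:
        for class_name, members in module.items():
            merged = result.setdefault(class_name, {})  # classes correspond by name
            for member, body in members.items():
                existing = merged.get(member)
                if member not in merged or existing is None:
                    # New member, or an implementation replacing a bare declaration.
                    merged[member] = body
                elif body is not None and body != existing:
                    raise ValueError(f"conflicting bodies for {class_name}.{member}")
    return result


payroll = {"Employee": {"pay": "return self.rate * hours", "name": None}}
personnel = {"Employee": {"name": "return self._name"}}
combined = compose([payroll, personnel])
```

Here the bare declaration of `name` in the payroll module corresponds to, and is combined with, the implementation supplied by the personnel module.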
According to a fourteenth aspect of the invention, there is provided a method for generating a software module based upon elements from multiple software modules. The method includes the step of providing multiple software modules. A set of elements is extracted from the multiple software modules based upon at least one criterion. The set is analyzed to find any elements that are referenced within the set but are not declared within the set. Declarations of the undeclared elements are automatically added to the set, so that the set is declaratively complete. A single software module is generated that contains the set of elements.
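The analysis and completion steps of this aspect can be sketched as a set difference followed by the addition of declaration stubs. The data layout and the names `undeclared_references` and `make_declaratively_complete` are assumptions for illustration. In particular, the sketch records only that a declaration is needed, not its type or signature.

```python
def undeclared_references(extracted):
    """Names referenced by some element of the set but not declared in the set.

    extracted maps each declared element name to the names it references.
    """
    referenced = {ref for refs in extracted.values() for ref in refs}
    return sorted(referenced - set(extracted))


def make_declaratively_complete(extracted):
    """Return the set with a bare declaration added for each undeclared reference."""
    completed = {
        name: {"kind": "definition", "refs": list(refs)}
        for name, refs in extracted.items()
    }
    for name in undeclared_references(extracted):
        completed[name] = {"kind": "declaration", "refs": []}
    return completed


extracted = {
    "Employee.pay": ["Employee.rate", "Employee.hours"],
    "Employee.rate": [],
}
completed = make_declaratively_complete(extracted)
```

After completion, every name referenced in the set is also declared in the set, so the resulting module can be processed on its own and later composed with a module that supplies the missing implementations.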
According to a fifteenth aspect of the invention, there is provided a method for generating a software module based upon elements from multiple software modules. The method includes the step of extracting a plurality of sets of elements from the multiple software modules based upon at least one criterion. Any elements in the sets that are referenced, but not declared, in the sets are identified. Declarations of the identified elements are automatically added to the respective sets, so that the sets are declaratively complete. A plurality of single software modules are generated, wherein each of the single software modules contains one of the resulting sets of elements. The plurality of single software modules are composed to form a final, single software module, wherein the final, single software module is a semantically correct entity.