The development and maintenance of a large-scale software system generally consumes a substantial amount of time, effort and financial resources and requires the coordinated interaction of personnel with diverse talents and skills throughout the development process.
Typically, a large-scale software system will comprise a number of software subsystems and each subsystem, in turn, is composed of numerous modules. Each module is designed to perform a selected low-level function. Various modules are then integrated at a higher functional level to form a particular subsystem capable of effecting a specific task. All the subsystems are further integrated at an even higher functional level and thereby provide the full functionality required of the entire system.
An illustrative example of a large software system that finds widespread use in the telecommunications environment is the TIRKS system (TIRKS is a registered trademark of Bell Communications Research, Inc.). At its inception during the early 1970s, the TIRKS system began as a relatively small system. An early version of the TIRKS system was written in assembly language for deployment on a large mainframe computer system. At that time, the TIRKS system had only a few subsystems. Its function was twofold, namely, to track inter-office circuit orders and to store and inventory inter-office equipment and facilities. To exploit advances in computer and communications technology and to expand the capabilities of the TIRKS system, the system has been continuously updated and augmented by adding other subsystems and accompanying modules. Currently, the TIRKS system contains about 50 different subsystems embodied in approximately 10 million lines of source code and composed of 23,000 different modules. These subsystems now provide a number of diverse functions, such as: marketing (accepting customer orders for inter-office network services); provisioning (equipment and facility inventorying, order tracking as well as inter-office equipment and facility assignment); and operations (establishing actual inter-office connections as well as the monitoring and testing of the inter-office network).
With this general organizational overview of a large-scale software system in focus, it is instructive to consider the so-called "life cycle" of a software system. The "life cycle" generally may, for purposes of this discussion, be partitioned into a number of serially-occurring, basically mutually exclusive phases. The scope of activity that occurs at each phase varies depending upon whether the software system is entirely new or is on-going and evolving, such as the TIRKS system.
During an initial phase, which might be characterized as a "conceptualization" phase, generic requirements are produced wherein the high-level functionality of the overall system is specified. It is during this phase that the ultimate users of the system elucidate their requirements and objectives to the system analysts. Various alternative solutions aimed at achieving the objectives are proposed and the analysts select the most viable solution and then generate the requirements.
A second phase, the "implementation" phase, commences when the requirements are further analyzed and parsed to obtain a refined system definition. The system is now viewed as comprising an arrangement of stand-alone but interdependent modules. These modules are usually developed separately by different groups or individuals. The development effort at this juncture comprises coding of the modules in a specific computer language, such as the C language, and then testing the execution of the individual modules.
The third phase, called "integration", is initiated by combining appropriate modules into their respective subsystems, thereby forming the overall system. Subsystem testing may also be effected to ensure that the modules are properly linked and compatible.
A fourth phase, called "system test", begins when the overall system is handed off to the testers. Thus, instead of releasing the system directly to the end users, an in-house test group is interposed in the "life cycle", with the charter of "trouble-shooting" the intended release. It is well-established that the cost of correcting software defects after a software system has reached the end user is significantly greater than the cost of correcting defects found before the system reaches that stage. This has led to an increased emphasis on reduction of errors during system development, primarily by testing. The objective of the testers is to locate any problems. A "problem" is a discrepancy between what was intended to be implemented and what the system actually does as revealed through testing. In a large system, system testers are faced with the dilemma of how to choose test cases effectively given practical limitations on the number of test cases that can be selected.
A final phase begins when the system is embedded in the user's environment for purposes of acceptance testing. In this phase, particularly if the system is to replace or enhance a similar software system, the replacement system may actually augment the operation of the prior software system. However, the primary purpose of acceptance testing is to determine if the system accomplishes what the user requested and, secondarily, to detect problems for corrective action. After successful completion of this final phase, the system is released to become part of the user's production environment.
Despite all of the testing in the various phases, it is possible that the system may contain an unforeseen error, called a "bug", which will only be discovered during actual use. Each "bug" is eliminated by first detecting its source, and then "conceptualizing" and designing a solution followed by a suitable modification of the system. In effect, each major "bug" effects a new "life cycle". Other causes of a "life cycle" iteration include intended enhancements and modifications or technology changes to take advantage of intervening advances that have occurred since the time the entire system was first developed.
As suggested by the above discussion, a large-scale system development effort is organized around the various phases of the "life cycle". Each organization performs its specified activities and hands off its portions of the system, or "units", to the next organization. Thus, each organization tends to have a local, rather than a global, viewpoint of the overall system.
In order to hand off the units to the next organization, it is typically required that the units satisfy some quantifiable but limited objective criteria. For example, it might be required that all modules are present, that they are at the proper source code level and are compiled, and that the code is at least executable. Thus, when an organization receives its units of interest, it is expected that the units are operational, even if only at a rudimentary level, so that the receiving organization may quickly begin to perform its activity. However, even though objective criteria are established for hand-off, it is generally true that the ultimate decision to pass on a given unit is mainly the result of a subjective evaluation.
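The objective portion of such hand-off criteria can be pictured as a simple checklist. The following Python sketch is illustrative only and is not drawn from any actual hand-off procedure: the `ModuleStatus` record, the `handoff_problems` function, the module names and the source level labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModuleStatus:
    """Hypothetical status record for one delivered module."""
    name: str
    source_level: str   # e.g. a release label; the labels used here are invented
    compiled: bool
    executable: bool


def handoff_problems(required_names, delivered, required_level):
    """Return a list of objective hand-off failures (empty if none)."""
    by_name = {module.name: module for module in delivered}
    problems = []
    for name in required_names:
        module = by_name.get(name)
        if module is None:
            # Criterion: all modules are present.
            problems.append(f"{name}: module missing")
            continue
        if module.source_level != required_level:
            # Criterion: modules are at the proper source code level.
            problems.append(
                f"{name}: source level {module.source_level}, "
                f"expected {required_level}"
            )
        if not module.compiled:
            problems.append(f"{name}: not compiled")
        if not module.executable:
            problems.append(f"{name}: not executable")
    return problems


# Hypothetical unit consisting of three required modules.
REQUIRED = ["order_entry", "inventory", "assignment"]
DELIVERED = [
    ModuleStatus("order_entry", "R2", True, True),
    ModuleStatus("inventory", "R1", True, False),
]
PROBLEMS = handoff_problems(REQUIRED, DELIVERED, "R2")

for problem in PROBLEMS:
    print(problem)
```

A check of this kind covers only the limited, quantifiable criteria; it says nothing about whether the unit behaves as intended, which is why the final decision to pass on a unit remains largely subjective.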
Control of the process during conventional software development tends to be somewhat subjective in nature because, unlike the case of traditional hardware development, there are no sophisticated, objective control procedures that pervade or encompass all phases of the development. In effect, there are no universally applicable methods or techniques that can quickly and accurately perform detailed measurements and report on results within acceptable time limits and resource usage.
As also alluded to in the overview discussion of the "life cycle" phases, testing considerations, either implicitly or explicitly, occur during all development phases. However, controlled formal testing has been concentrated at the system level in the "system test" organization. Principally, this is due to the lack of tools that would efficiently allow for the development and sharing of tests by different organizations.
Historically, the initial method of testing could be characterized as manual and autonomous. With respect to the TIRKS system, which uses a video display terminal (VDT) as an input/output device, "manual" generally means a person positioned at the VDT makes terminal entries, transmits the entries to the unit under test, and then evaluates responses returned for display on the VDT. Based on the response, a new set of terminal entries is entered and a new response is evaluated. This request-response procedure continues until the person is satisfied with all responses or, if "bugs" are detected, until the "bugs" are fixed and the unit is retested. By "autonomous" is meant that only the person making the VDT entries is aware of exactly how and why the unit was exercised during each test session (unless the test exercise was documented, which is typically not the case). In effect, autonomous tests are not tests that can be repeated precisely nor are they in a form to be passed to the next organization.
In order to mitigate the autonomous nature of testing, at least at the overall system level, controlled formal testing was lodged in the "system test" organization. So-called test scripts are developed and maintained by the testers. A test script is a listing of the entries to be supplied to a VDT by the tester as well as a listing of the responses expected for evaluation. Even with this advancement, manual mode testing clearly was, and still is, time consuming and prone to error.
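A test script of this kind is, in essence, an ordered list of entry and expected-response pairs. The Python sketch below shows one possible representation and renders it as a checklist for a tester to follow by hand. The `ScriptStep` record, the `render_checklist` function and the sample terminal entries are hypothetical and do not reflect actual TIRKS transactions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScriptStep:
    """One step of a manual test script."""
    entry: str      # what the tester types at the VDT
    expected: str   # the response the tester should see


# Hypothetical script; the entries and responses are invented for illustration.
SCRIPT = [
    ScriptStep("FIND CIRCUIT 101", "CIRCUIT 101: ASSIGNED"),
    ScriptStep("FIND CIRCUIT 999", "NOT FOUND"),
]


def render_checklist(script):
    """Format the script as numbered instructions for a human tester."""
    lines = []
    for number, step in enumerate(script, start=1):
        lines.append(f"{number}. ENTER:  {step.entry}")
        lines.append(f"   EXPECT: {step.expected}")
    return "\n".join(lines)


print(render_checklist(SCRIPT))
```

Writing the script down makes the test repeatable and transferable, but in manual mode a person must still type each entry and judge each response, which is where the time and the errors arise.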
Because of the shortcomings of manual mode testing, so-called automated mode testing was introduced as a means for the tester to exercise the system. The test scripts are now supplied to the system under test in a rapid, repeatable fashion. Automated testing ensures that the script is followed precisely, and responses on the VDT can be compared just as precisely to stored, anticipated responses. Automated tests can also be written to be deterministic. By deterministic is meant that during test execution, the test itself can determine if system responses are as expected; if not, the test reports an error. With deterministic tests, the time a tester spends on post-test analysis can be significantly reduced, and virtually eliminated if no unexpected responses are reported. Moreover, system testers are freed to do more productive duties, such as new script development, rather than performing the essentially clerical function of following a script with entries to a VDT. In addition, automated tests can be executed during any time period and do not require that the tester be present to run tests. Also, and especially important, automated scripts will be maintained to correspond to system updates and enhancements so as to benefit from the deterministic aspects of automated tests. In this way, "regression" testing, that is, comparison of results from a new version of a system to prior, "benchmark" results from an earlier version or previously tested versions, is readily accomplished. And finally, automated tests can produce, as a side-effect of the testing itself, a precise machine record of what has been tested and the results of that testing. Therefore, there exists a provable and objective measurement as to the extent that the system under test operates as expected, which is precisely what needs to be measured as a software system moves through its "life cycle".
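The notions of a deterministic test, a machine record and a regression comparison can be made concrete with a short Python sketch. Everything in it is an assumption made for illustration: the two `version_` functions are stand-ins for a system under test, and the script entries are invented rather than taken from any real system.

```python
def run_script(system, script):
    """Send each entry to `system` and compare the reply with the expected one.

    Returns a machine record: one dictionary per step, including a pass flag,
    so that the test itself decides whether each response was as expected.
    """
    record = []
    for entry, expected in script:
        actual = system(entry)
        record.append({
            "entry": entry,
            "expected": expected,
            "actual": actual,
            "passed": actual == expected,
        })
    return record


def failures(record):
    """Return only the steps whose response differed from the expected one."""
    return [step for step in record if not step["passed"]]


def version_1(entry):
    """Hypothetical earlier version of the system under test."""
    inventory = {"FIND CIRCUIT 101": "CIRCUIT 101: ASSIGNED"}
    return inventory.get(entry, "NOT FOUND")


def version_2(entry):
    """Hypothetical newer version with a deliberately introduced defect."""
    if entry == "FIND CIRCUIT 101":
        return "CIRCUIT 101: UNKNOWN"
    return version_1(entry)


# Hypothetical script: (terminal entry, anticipated response) pairs.
SCRIPT = [
    ("FIND CIRCUIT 101", "CIRCUIT 101: ASSIGNED"),
    ("FIND CIRCUIT 999", "NOT FOUND"),
]

baseline = run_script(version_1, SCRIPT)    # "benchmark" results
candidate = run_script(version_2, SCRIPT)   # results from the new version

for step in failures(candidate):
    print(f"REGRESSION on {step['entry']!r}: "
          f"expected {step['expected']!r}, got {step['actual']!r}")
```

Because the same script runs unchanged against both versions, the only step reported is the one whose response changed, and the returned record doubles as the precise log of what was tested.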
Automated testing as it is practiced today, however, does have its own deficiencies and limitations. Most troublesome is the need to learn a new computer language. A manual test script, which is carried out by terminal entries, is transmitted to the system under test via a block of data propagating over a channel. An automated test script version of a manual script must provide the equivalent of the data block without the necessity for human interactive terminal entries. In conventional automated testing, the new computer language performs the interfacing function of the off-line building of data blocks equivalent to the desired terminal entries as well as the off-line storing of anticipated responses from the system for comparison to actual returned responses.
Even with automated testing, system testing is still lodged in the system testing organization because of the specialized expertise required of testers. Oftentimes, in fact, it is required that two individuals be teamed to create the equivalent of a single tester. One of the individuals on the team is familiar with the actual application environment and knows the techniques to exercise the system properly but not the special computer language; the second team member is knowledgeable about the special testing language but does not know the application environment in requisite detail. This pairing of individuals is expensive and leads to inefficiencies.
In concluding this Background Section, it is instructive to envision: (1) a first depiction of the conventional five phases in the life-cycle of the system development process; and (2) a second depiction of a model for an improved process. The second depiction serves as a point of departure for the principles of the present invention. With respect to the first depiction, developers have viewed the five life-cycle phases of one iteration as five activities placed side-by-side with a modicum of interaction between adjacent phases and basically no interaction between non-adjacent phases. A second iteration causes a second set of five phases to be placed adjacent to the first set so that "acceptance testing" of the first set serves as input to "conceptualization" of the second set. This straight line depiction is replicated throughout the life of the software system as new sets are juxtaposed to existing set groupings.
The second depiction views the life-cycle phases as mapping into a circle partitioned into five "pie-shaped" segments representing the life-cycle events. Now, "acceptance testing" is adjacent to "conceptualization" and the cyclic nature of the development process is self-evident. Moreover, the center of the circle is common to all segments and represents the knowledge base that is common to all developers during all the phases. It is apparent that non-adjacent events may access, share and utilize the same information as adjacent events. This implies that the information must be in a format that is usable by all parties and that no party should be burdened with learning, for example, a new computer language to test the system. With this second depiction, it is possible to consider those tools conventionally construed narrowly as testing tools and treat these tools on a broader basis as tools influencing the very creation of the system ultimately to be tested.