1. Field of the Invention
The present invention generally relates to a system for testing computer programs, and, in particular, to an automated test generator.
2. Description of the Related Art
All software must be tested before it is used. Software quality assurance (QA) is distinct from conventional QA in that efficient, effective and inexpensive methods of testing software are generally not obvious. Recent catastrophic failures in long-distance telephone, airplane control and rocket guidance programs, as well as the poor reliability of many commercially available applications, bear witness that this is an art that is not nearly perfected.
Most currently accepted software testing strategies are centered on the concept of the test design. A test design is a comprehensive compilation of all the tests that will be performed on the program. The test design generally includes the actions to be performed in each test and the expected (passing) results. It may be executed by a human operator, custom programmed in a standard programming language, or, more commonly, automated using a testing tool such as Segue""s QA Partner/QualityWorks(trademark) or Microsoft""s Visual Test(trademark). In theory, the components of a test design cover every possible way in which the program could ever be used. Thus, when every test in the test design passes, the program is known to be completely free of defects.
In practice, this is never the case. No matter how carefully the test design is written, it will always have errors and omissions. Moreover, even if the authors of a test design were perfect, the idea that the test design can be truly comprehensive is flawed. Many defects are due to the interaction of different components of the program, and therefore a particular action may behave defectively only when it is preceded by a particular series of other actions. When one considers that there may be hundreds to thousands of possible actions in a moderately complex program, that a user may initiate hundreds to tens of thousands of actions in a typical session, and that the number of tests needed to xe2x80x9ccomprehensivelyxe2x80x9d cover all these scenarios is the former number raised to the power of the latter, it is apparent that truly comprehensive test designs are not feasible. Thus, most test designs attempt to cover all possible actions individually and occasionally to cover some of the most obvious combinations of features. The success with which this coverage is achieved is largely dependent on the skill and experience of the authors of the test design.
Well-written test designs are generally fairly effective at revealing functional defects, such as defects where a component of the program does not work properly, but the program as a whole continues to function. However, fatal defects or xe2x80x9ccrashesxe2x80x9d, i.e., defects where the entire program ceases to function, are often missed by test designs. This may be because such defects tend to be revealed either by the user initiating an unexpected action or by the interplay of a large number of actions, both of which are situations that tend not to be covered by test designs.
Among the strategies used to supplement test design based testing are data-driven and ad hoc testing. Data-driven testing attempts to address the practical limits on comprehensiveness of test designs by separating actions (e.g., a command to draw a circle) from the data passed with them (e.g., the location and size of the circle). The test can then be repeatedly re-executed, each time using a different data set, which may be randomly generated or drawn from a large collection of stored data sets. This approach may be extremely effective for largely non-interactive data-processing programs. However, since the data driven approach applies the same actions in every execution, when applied to the interactive graphical user interface (GUI) based programs that comprise most of the commercial software market, little is gained over the traditional test design approach.
Although it may be disparaged in testing manuals, most testers do some amount of ad hoc testing. Ad hoc testing refers to a human operator testing the program without a specific written plan. In practice, a significant percentage of defects may be discovered by ad hoc testing. Nevertheless, there are serious problems with ad hoc testing. Its success is highly dependent on the skill of the tester. Since, by nature, it is conducted without a specific plan, it is almost impossible to quantify or ensure the uniformity of coverage achieved by a test design. Further, since ad hoc testing cannot be automated using the current state of the art, it is expensive: one hour of testing costs one hour of human operator time.
Thus, there is a need in the art for new techniques that increase testing efficiency by solving these problems. More specifically, there is a need in the art for techniques that are more effective at detecting fatal defects than traditional test designs and data-driven testing, yet are more quantifiable and less expensive than ad hoc testing.
The present invention solves these problems by providing a mechanism wherein the user interface of a program can be completely described as a graph or network of choices in a programmatically readable form. The testing program presented here is able to generate tests consisting of both random data and random series of actions by randomly traversing this graph of choices. Traditionally, one of the most difficult parts of automated testing is verification: determining whether the test has passed or failed. Complete verification of proper function may be particularly difficult in cases where tests are randomly generated, such as the present invention, since predicting the proper behavior of the application program may require duplication of a substantial portion of the application program""s logic. However, fatal or illegal state defects such as application crashes, memory leaks or inconsistency in internal data structures are easily detected through means such as operating system error messages or assert messages. Since detection of illegal application states is independent of the path by which the application reached the illegal state, verifying that the application has not entered an illegal state is a trivial task even for randomly generated tests. The present invention is focused on discovering fatal defects, so detection of illegal states is generally sufficient verification. Comprehensively describing the interface of even a relatively complex program is a feasible task, so coverage of combinations of actions that is superior to that of a test design is achieved and many fatal defects that would be missed by a test design can be discovered. Further, since the frequency with which any option of any choice is selected can be manipulated using a Monte Carlo statistical technique, testing using this technique is more easily quantified and directed than ad hoc testing. This technique also lends itself to a high degree of automation, making it less expensive than ad hoc testing.
To address the requirements described above, the present invention discloses a method, apparatus, and article of manufacture for automatically generating tests to randomly traverse a description of the user interface of a program being tested, thereby generating tests that contain randomly selected actions and randomly generated data. When executed, these tests randomly manipulate the program being tested. In the preferred embodiment detailed here, the generated tests take the form of test scripts written in the language of a commercially available testing tool, and include programming statements that invoke one or more functions of the application program being tested.