Newly developed software programs must be thoroughly tested in order to eliminate as many “bugs” or errors as possible before the software is released for widespread public use. Accordingly, development of software is largely a trial and error process. Several different methods for testing software programs have been developed. One conventional approach, generally referred to as beta testing, involves distributing the program, typically under a non-disclosure agreement, to a group of users who use the program for a period of time and report any errors which are encountered to the software developer. Although this type of testing is commonly used in the software industry, it is often found to be very time consuming, adversely affecting the scheduled release of products incorporating the software program. In addition, beta testing can be extremely difficult to control, particularly when a large number of users are provided the beta version of the software. Furthermore, due to the non-systematic use of the program, there is no guarantee that every error, or even most errors, will be identified with this approach, even under circumstances where a large number of users are using the software.
As software is developed on and runs on computers, it is not surprising to find that many of the techniques for automating the testing of software have been implemented in digital computers. A common approach for testing software is the use of test suites. Test suites compare the “known good” output of a program (for a given set of inputs) against the current output. Tests that check program file output are easy to implement and can be automated with shell scripts or scripting tools (e.g., Expect, available on the Internet). For programs with user interfaces that communicate through standard input/output devices (stdin/stdout), a similar method may be employed. Capture/playback tools are available for recording keyboard input and program output as a person tests a program.
Much of the code written today is for software products with a graphical user interface (GUI), such as Microsoft® Windows™. In fact, much of software development itself is done within a graphical user interface, with software tool vendors providing products which allow software developers to develop GUI software using visual programming techniques. The Quality Assurance (QA) engineer faces more complex problems when testing GUI software. In particular, GUI programs must behave correctly regardless of which video mode or operating environment is being employed.
Intuitively, testing user interfaces should not be as difficult as testing a complex internal engine, such as a compiler or a real-time, multi-user operating system. In practice, however, user interface (UI) testing is the most challenging part of the QA process. This problem stems largely from the difficulty in automating UI tests. Tests for complex engines, in contrast, are often command-line programs whose testing can easily be automated using simple batch execution. Thus, despite the plethora of present-day tools for automating program testing, developing, maintaining, and analyzing the results of UI tests remains an arduous task.
The basic steps traditionally employed to test user interfaces may be summarized as follows. First, the application being tested is placed into a specific state, either by replaying pre-recorded keyboard or mouse device actions or by entering input through a test script. Next, the then-current state of the application is recorded by taking a screenshot (e.g., capturing a screen bitmap). Finally, the captured screenshot is compared with a baseline screenshot that is known to be valid.
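The three steps above can be sketched in a few lines. This is an illustration under stated assumptions, not an actual capture tool: `render_checkbox` is a hypothetical stand-in for driving the application into a state and capturing its screen, and a bitmap is modeled simply as a grid of 0/1 pixels.

```python
def render_checkbox(checked, size=5):
    """Hypothetical capture step: return the check box as a grid of 0/1 pixels."""
    last = size - 1
    # Draw the square border of the check box.
    grid = [
        [1 if r in (0, last) or c in (0, last) else 0 for c in range(size)]
        for r in range(size)
    ]
    if checked:
        # Draw a diagonal stroke inside the border as the check mark.
        for i in range(1, last):
            grid[i][i] = 1
    return grid


def differing_pixels(captured, baseline):
    """Count the pixels at which the captured bitmap differs from the baseline."""
    same_shape = len(captured) == len(baseline) and all(
        len(a) == len(b) for a, b in zip(captured, baseline)
    )
    if not same_shape:
        raise ValueError("bitmaps differ in size")
    return sum(
        pa != pb
        for row_a, row_b in zip(captured, baseline)
        for pa, pb in zip(row_a, row_b)
    )


def screenshot_matches(captured, baseline):
    """Comparison step: the test passes only on an exact pixel match."""
    return differing_pixels(captured, baseline) == 0


baseline = render_checkbox(checked=True)   # stored screenshot known to be valid
captured = render_checkbox(checked=False)  # state reached by the test script
print(screenshot_matches(captured, baseline))
```

Here the captured state is unchecked while the baseline is checked, so the comparison reports a mismatch.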
The approach is far from ideal, however. Consider, for instance, the determination of whether the state of a check box is valid within a specific dialog box. Here, the QA engineer must take a screenshot of that check box and compare it with the expected image. Thus, testing of even the simplest component is laborious. Moreover, the approach itself is prone to error. A change of just a few pixels across all windows—a common occurrence in GUI software development—causes all tests to fail. Consequently, as software becomes more and more complex, it becomes less and less feasible to test user interface tasks with present-day screen comparison methodology.
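The fragility described here can be illustrated with a small sketch. The bitmaps, widget names, and the `shift_right` helper are all hypothetical; the point is only that a uniform one-pixel layout change invalidates every stored baseline under exact comparison.

```python
def shift_right(bitmap, pixels=1, fill=0):
    """Simulate a layout change that moves a widget right by a few pixels."""
    return [[fill] * pixels + row[:-pixels] for row in bitmap]


def exact_match(captured, baseline):
    """Screen comparison as traditionally done: every pixel must be identical."""
    return captured == baseline


# Hypothetical baseline screenshots, one per widget under test.
baselines = {
    "ok_button": [[0, 1, 1, 0], [0, 1, 1, 0]],
    "check_box": [[1, 1, 1, 0], [1, 0, 1, 0], [1, 1, 1, 0]],
}

# Every window shifts by one pixel; the widgets themselves still work.
captured = {name: shift_right(bitmap) for name, bitmap in baselines.items()}

failures = [
    name for name in baselines if not exact_match(captured[name], baselines[name])
]
print(failures)
```

Both widgets are reported as failures even though neither has changed in behavior.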
Software testing is a critical phase in the software development process. It occurs after the software has been designed, implemented in a programming language, and tested to a limited degree. During the testing phase, software testers test the software extensively to ensure that it meets all of the requirements it is intended to meet. In order to accommodate simultaneous testing of several different software packages by several testers, multiple test machines are often used. Different types of software packages may need to be tested on different types of test machines, such as, for example, test machines with different hardware configurations and/or different operating systems. When a large number of software testers are required to share common resources for software testing, provisions must be made for scheduling the tests in order to manage these shared resources efficiently. Efficient management of these shared resources may also require that tests and their results be recorded, so that the tests can be repeated if needed and the results can be analyzed and later compared with the results of subsequent tests.
In an effort to maximize efficiency in the handling of test scheduling and test execution, attempts have been made to automate software testing by using a server to manage test machines and to allocate test packages among the test machines in accordance with a schedule. Generally, these types of systems pre-allocate tasks to test machines by calculating the current and scheduled loads on the test machines and scheduling the tasks so that they are performed in a time-efficient manner. For example, Sun Microsystems, Inc. has proposed an automated task-based scheduler for use with UNIX platform systems which allows users operating “client” machines to schedule tests to be executed on “target” machines. A central server receives a request from a client machine to perform a task. The server maintains information relating to all currently scheduled tasks on all target machines in a “status” database. The server maintains information relating to the expected duration of each test package and other test package attributes in a “packages” database.
When the server receives a request to perform a task from a client machine, the server determines the load on each target machine that is suitable for performing the task. The loads are determined based on the expected duration of each test package. The server then schedules the task on the target machine with the least current load. A task file created at the client machine and copied to the server includes priority information relating to the task requested by the client machine. Once the server has selected a target machine for the task, the task file is copied to the selected target machine. The target machine selects a task to be performed based on the priority information contained in the task files copied to it. Once a task is completed, the results are copied back to the server, which compares them to a set of “golden results” and creates a comparison report which is mailed back to the user who requested the test.
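The scheduling flow described above can be sketched as follows. This is a simplified model under stated assumptions, not the actual scheduler: the “status” and “packages” databases are represented as plain dictionaries, all names and durations are hypothetical, and a lower priority number is assumed to mean a more urgent task, which the description does not specify.

```python
import heapq
import itertools


def current_load(target, status, packages):
    """Sum the expected durations of every package already scheduled on a target."""
    return sum(packages[name] for name in status[target])


def schedule_task(package, suitable_targets, status, packages):
    """Server side: assign the package to the suitable target with the least load."""
    chosen = min(suitable_targets, key=lambda t: current_load(t, status, packages))
    status[chosen].append(package)
    return chosen


class TargetQueue:
    """Target side: task files waiting on one machine, ordered by priority."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker: earlier arrival first

    def add(self, priority, package):
        # Assumption: a lower number means a higher priority.
        heapq.heappush(self._heap, (priority, next(self._arrival), package))

    def next_task(self):
        return heapq.heappop(self._heap)[2]


def comparison_report(results, golden):
    """List the tests whose results differ from the golden results."""
    return sorted(name for name in golden if results.get(name) != golden[name])


# Hypothetical "packages" database: expected duration in minutes.
packages = {"ui_smoke": 10, "compiler_regress": 45, "io_stress": 30}
# Hypothetical "status" database: packages currently scheduled per target.
status = {
    "target_a": ["compiler_regress"],
    "target_b": ["ui_smoke", "io_stress"],
    "target_c": ["ui_smoke"],
}

# Only target_a (load 45) and target_b (load 40) are suitable for this task.
chosen = schedule_task("io_stress", ["target_a", "target_b"], status, packages)
print(chosen)
```

With these sample values the task lands on `target_b`, because its load of 40 minutes is below the 45 minutes already scheduled on `target_a`; `target_c` is less loaded still but is not among the suitable machines.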