1. Technical Field
This disclosure relates to black-box testing of Graphical User Interface (GUI)-based Applications (GAPs).
2. Related Art
Manual black-box testing of GAPs is tedious and laborious, since nontrivial GAPs contain hundreds of GUI screens and thousands of GUI objects. Test automation plays a key role in reducing the high cost of testing GAPs. In order to automate this process, test engineers write programs using scripting languages (e.g., JavaScript and VBScript), and these programs (test scripts) mimic users by performing actions on GUI objects of these GAPs using an underlying testing framework. The extra effort put into writing test scripts pays off when these scripts are run repeatedly to determine if GAPs behave as desired.
Unfortunately, releasing new versions of GAPs with modified GUIs breaks their corresponding test scripts, thereby obliterating the benefits of test automation. Consider a list box that is replaced with a text box in a successive release of some GAP. Test script statements that select different values in this list box will result in exceptions when executed on the text box. This simple modification may invalidate many statements in test scripts that reference this GUI object. Maintaining test scripts involves changing their code to keep up with changes to their corresponding GAPs.
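The list box example can be illustrated with a minimal sketch. The classes ListBox and TextBox, the object name "state", and the function run_script are hypothetical stand-ins chosen for illustration; they do not correspond to any particular testing framework.

```python
class ListBox:
    """Hypothetical list-box widget: holds a fixed set of selectable items."""

    def __init__(self, items):
        self.items = list(items)
        self.selected = None

    def select(self, value):
        if value not in self.items:
            raise ValueError(f"{value!r} is not in the list")
        self.selected = value


class TextBox:
    """Hypothetical text-box widget: accepts free text and has no select()."""

    def __init__(self):
        self.text = ""

    def set_text(self, value):
        self.text = value


def run_script(gui):
    # Statement written against release 1, where "state" is a list box.
    gui["state"].select("Illinois")


release_1 = {"state": ListBox(["Illinois", "Ohio"])}
release_2 = {"state": TextBox()}  # the list box was replaced by a text box

run_script(release_1)  # behaves as intended on the old release

try:
    run_script(release_2)
    outcome = "passed"
except AttributeError as exc:
    # The same statement now raises, because TextBox has no select().
    outcome = f"broken: {exc}"

print(outcome)
```

Under these assumptions, the script statement itself is unchanged between the two runs; only the type of the GUI object it references has changed.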
This and many other similar modifications are typical between successive releases of different GAPs, including such well-known GAPs as Adobe Acrobat Reader and Microsoft Word. As many as 74% of the test cases become unusable during GUI regression testing, and some evaluations of automated testing have shown that even simple modifications to GUIs result in 30% to 70% changes to test scripts. To reuse these scripts, test engineers must fix them. For example, scores of test engineers need to be employed to fix test scripts both manually and using different testing tools. The annual cost of manual maintenance and evolution of test scripts is enormous, and may run into the tens or hundreds of millions of dollars in large organizations.
Currently, there are two main modes of maintaining test scripts: tool-based and manual. Existing testing tools detect exceptions in test scripts at runtime, i.e., test engineers run these scripts in order to execute statements that reference modified GUI objects. Exceptions interrupt continuous testing and require human intervention to fix.
Unlike unit tests, which compilers check against the program code, test scripts are based on different type systems than the GAPs that they test. As it turns out, multiple disparate type systems make GUI testing very difficult. Existing regression testing approaches work in settings where test harnesses are written in the same language and use the same type system as the programs that these harnesses test (e.g., JUnit test harnesses are applied to Java programs). In contrast, when testing GAPs two type systems are involved: the type system of the language in which the source code of the GAP is written and the type system of the language in which test scripts are written. When the type of the GUI object is modified, the type system of the test script “does not know” that this modification occurred, thereby aggravating the process of maintaining and evolving test scripts.
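The gap between the two type systems can be sketched as follows. In this illustration, which assumes a script that refers to a GUI object only by a string name, the script compiles successfully no matter which release of the GAP it will later run against, and the mismatch surfaces only at runtime. The names ListBox, TextBox, "state", and run_against are hypothetical.

```python
# The script references the GUI object by name only; its own language
# has no view of the GAP's type for that object.
script_source = 'gui["state"].select("Illinois")\n'

# Compiling checks the script language's syntax and nothing else, so it
# succeeds without knowing which GAP release will be tested.
code = compile(script_source, "<test script>", "exec")


class ListBox:
    """Hypothetical GUI type in release 1."""

    def select(self, value):
        self.selected = value


class TextBox:
    """Hypothetical GUI type in release 2; it offers no select()."""

    def set_text(self, value):
        self.text = value


def run_against(gui):
    """Execute the already-compiled script against one release."""
    try:
        exec(code, {"gui": gui})
        return "ok"
    except AttributeError:
        return "runtime failure"


results = {
    "release 1": run_against({"state": ListBox()}),
    "release 2": run_against({"state": TextBox()}),
}
print(results)
```

The point of the sketch is only that the check performed on the script and the types defined by the GAP are disconnected; it is not a model of any specific testing tool.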
As a result, tool-based approaches provide maintenance modes that allow testers to find broken statements in test scripts by executing these statements line-by-line against GAPs. The presence of loops in test scripts makes them run for a long time in order to reach statements that should be checked. Test engineers comment out loops, but their modifications may change testing logic and mask broken statements. Finally, commercial testing tools are expensive (e.g., a license for one of the flagship industry tools costs more than $10,000).
On the other hand, manual maintenance of test scripts is popular among test professionals. During manual maintenance, testers determine differences between successive releases of GAPs, and they locate and fix statements in test scripts that are affected by these changes. Since test scripts are much smaller than the GAPs that they act upon (e.g., many scripts are smaller than 1KLOC), it is feasible for testers to understand and fix them. In addition, testers are perceived to do a more thorough job of understanding and fixing scripts if they do not rely heavily on tool support. However, some test engineers lack the time and necessary skills to understand and fix old scripts, especially if these scripts were created by other engineers.
Currently, testers run test scripts that are written for the previous releases of a GAP on the successive releases of this GAP to determine if these scripts can be reused. The testers may use existing tools that include a script debugger (e.g., QTP from the Hewlett Packard™ company). Once a statement that accesses a modified GUI object is reached, the testing platform generates an exception and terminates the execution of the script. The engineer analyzes the exception, fixes the statement, and reruns the script. This process is repeated until the script runs without throwing any exceptions.
Often it takes a long time until statements that reference changed GUI objects are executed. Test scripts contain loops, branches, and fragments of code that implement complicated testing logic in addition to statements that access GUI objects. Consider a test script that contains a loop with code that reads in and analyzes data from files, computes some result from this data, and inserts it in a GUI object. Computing this result may take hours depending on the sizes of the files. Test scripts often contain multiple computationally intensive loops that are interspersed with statements that access GUI objects. Each time an exception is thrown because of a failure, the results of the execution are discarded, and the script should be rerun after engineers fix this failure. Commenting out loops (when possible) speeds up execution, but it changes the logic of test scripts, and consequently affects the quality of repairs.
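The cost of this run, fix, and rerun cycle can be sketched with a small simulation. The script below is hypothetical: it assumes three computation phases, each followed by one GUI statement, and assumes that all three GUI statements are broken by the new release. Each log entry stands in for a computation that could take hours in practice.

```python
def run_script(fixed, log):
    """Run the hypothetical script once.

    Returns the number of the step whose GUI statement failed, or None
    if the script ran to completion.
    """
    for step in (1, 2, 3):
        # Stand-in for a long computation (reading and analyzing files).
        log.append(f"compute {step}")
        # Stand-in for the GUI statement that follows the computation;
        # it throws unless the engineer has already repaired it.
        if step not in fixed:
            return step
    return None


fixed = set()  # GUI statements repaired so far
log = []       # every computation performed across all runs
runs = 0

while True:
    runs += 1
    failed = run_script(fixed, log)
    if failed is None:
        break
    # The engineer repairs only the statement the exception exposed,
    # then must start the script over from the beginning.
    fixed.add(failed)

print(runs, len(log))  # prints "4 9"
```

Under these assumptions, repairing three statements takes four runs and nine computation phases, although a single clean run needs only three, because every exception discards the work done before it.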
In addition, existing testing tools provide little information about how to fix failures in test scripts. When a test script is executed against a new version of the GAP, existing tools have no information about changes between GUI objects that lead to exceptions. As a result, test engineers must analyze GUIs manually to obtain this information and relate it to the exceptions, and this is a laborious and intellectually intensive process.
When fixing failures in test scripts manually, testers examine GUIs of two consecutive releases of some GAP to determine what GUI objects are modified. In addition, testers with advanced skills as well as programmers study the source code of GAPs (if it is available) to understand these changes in depth. Learning how GAPs are modified between released versions and relating these changes to statements and operations in test scripts may have a beneficial learning effect on testers. Without relying on tool support, testers are thought to do a more thorough job of finding and fixing failures in test scripts.
It is not clear if the manual approach has definite benefits over the tool-based approach. On one hand, testing tools are expensive and may take a long time to execute scripts to determine what statements are broken because of changes made to GUI objects between successive releases of GAPs. On the other hand, the manual approach requires testers to go over each statement and operation in test scripts to understand what GUI objects they refer to, and it is laborious and expensive.
What is needed is a sound and complete approach. A sound approach ensures the absence of failures in test scripts if it reports that no failures exist, or if all reported failures do in fact exist, and a complete approach reports all failures, or no failures for correct scripts. Both manual and tool-based approaches allow testers to detect some failures that result from modifications of GUI objects; however, it is unclear with what degree of precision.
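One way to make these two properties concrete is to compare the set of failures an approach reports with the set of failures that actually exist. The sketch below adopts one reading of the definitions above, namely that soundness means every reported failure is real and completeness means every real failure is reported. The failure labels and the two example reports are invented for illustration and are not measurements.

```python
def is_sound(reported, actual):
    """Every reported failure is a real failure (no false alarms)."""
    return reported <= actual


def is_complete(reported, actual):
    """Every real failure is reported (nothing is missed)."""
    return actual <= reported


# Hypothetical ground truth: statements actually broken by GUI changes.
actual = {"line 12", "line 40", "line 57"}

# Hypothetical reports from two maintenance approaches.
tool_based = {"line 12"}                    # stops at the first exception
manual = {"line 12", "line 40", "line 88"}  # misses one, misflags one

for name, reported in (("tool-based", tool_based), ("manual", manual)):
    print(name, is_sound(reported, actual), is_complete(reported, actual))
```

Under this reading, an approach is both sound and complete exactly when the reported set equals the actual set.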
Therefore, a need exists to address the problems noted above and others previously experienced.