Given a large enough market, software consumers will find numerous bugs in software programs, which can reflect poorly on the developer. Thus, an important part of software development is testing to eliminate bugs in software, and to have the software designed to otherwise handle unusual circumstances. That is one reason why Beta testing is used, so that by sheer numbers, many users (who understand that bugs are likely early in the development process) can help to debug a product before it is sold to consumers.
Beta testing is only one way software is tested, and can only be done when the product under development is reasonably stable and safe enough to give to those who will use it in the real world. To get to this point, and also to find bugs that even large numbers of Beta testers may not find, software producers also run their programs through pre-arranged tests. Such pre-arranged testing is often referred to as black-box testing, in which parameters of a domain under test are varied to evaluate how the program behaves.
Black-box testing can find many bugs; however, there is no way for developers to realistically anticipate each of the possible combinations of parameters that can cause a bug. Testing all combinations is not practical. By way of an example that points out the difficulties of exhaustive testing, consider the “Font” dialog in the Microsoft® Word word processing program. The first “Font” tab of the “Font” dialog has lists of possible parameters for font, style, size, color, underline style, and underline color. Checkboxes are available for effects, including strikethrough, superscript, subscript, shadow, small caps, and others. On this first “Font” tab alone of the “Font” dialog, there are over 1,500,000 parameter combinations that would have to be tested in order to exhaustively evaluate this tab of the dialog. Moreover, there are many such dialogs and tabs. For example, there is a “Paragraph” dialog, which also has two tabs and a large number of parameters.
As can be readily appreciated, with limited computing resources and time, exhaustive testing of all such parameter combinations in a reasonable time is basically impossible for all but a few programs having very simple sets of parameters. Moreover, comprehensively testing a program becomes even more impractical when each test case takes a relatively long time to perform, such as when testing possible variations in installing a program, formatting a disk, and so on, where each test might take on the order of minutes to perform. As a result, even with relatively small domains, exhaustive testing is ordinarily not realistically possible.
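By way of illustration only, the arithmetic behind this observation can be sketched in a few lines of Python. The number of exhaustive test cases is the product of the number of values of each parameter, and the total run time is that product multiplied by the time per test. The installer domain, its parameter names and values, and the two-minute test duration below are hypothetical and chosen only to keep the numbers small.

```python
from math import prod


def exhaustive_count(domain):
    """Number of test cases needed to cover every combination of values."""
    return prod(len(values) for values in domain.values())


def exhaustive_hours(domain, seconds_per_test):
    """Total run time, in hours, if every combination is tested once."""
    return exhaustive_count(domain) * seconds_per_test / 3600


# Hypothetical installer domain (names and values are assumptions).
install_domain = {
    "os": ["os_a", "os_b", "os_c", "os_d"],
    "language": ["en", "de", "fr", "ja", "es"],
    "install_type": ["typical", "full", "custom"],
    "target_drive": ["local", "network", "removable"],
    "prior_version": ["none", "v1", "v2"],
    "disk_space": ["ample", "low"],
}

print(exhaustive_count(install_domain))       # 1080
print(exhaustive_hours(install_domain, 120))  # 36.0
```

Even this small six-parameter domain requires 1,080 test cases, or 36 hours at two minutes per test, and adding one more three-valued parameter triples both figures.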
Thus, exhaustive black-box testing of all combinations is simply not a practical solution to software testing. One approach is domain modeling, whereby parameters are defined and appropriate values for each of them are chosen by a tester. Then, a suite of test cases is devised out of the parameters. A simple way to test is just to cover a set of values for each parameter, without considering how the parameters combine. In general, based on the tester's own experience, the tester devises test cases that he or she believes cover the most likely interactions between parameters of the domain. As can be appreciated, while this can find some bugs, domain modeling is usually a long and tedious process, and is highly dependent on the tester's ability to construct test cases. It also cannot easily be automated and often causes maintenance problems.
Another, more formal approach, referred to as “default testing,” makes explicit assumptions about how values will be most often combined with one another in typical usage. A tester varies one parameter per test, leaving all others with their default values, and observes how the program behaves. As can be appreciated, this type of test suite is very easy to generate but, like domain modeling, is not very effective in finding bugs.
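A minimal sketch of how such a default-testing suite might be generated is given below, in Python. The function name, the reduced font domain, and the default values are hypothetical, and including an all-defaults baseline case is a design choice of this sketch rather than something the approach requires.

```python
def default_test_suite(domain, defaults):
    """Build test cases that each vary one parameter away from its default."""
    suite = [dict(defaults)]  # baseline case: every parameter at its default
    for name, values in domain.items():
        for value in values:
            if value != defaults[name]:
                case = dict(defaults)
                case[name] = value  # only this parameter departs from default
                suite.append(case)
    return suite


# Hypothetical, heavily reduced model of a font dialog.
font_domain = {
    "font": ["Arial", "Times New Roman", "Courier New"],
    "style": ["Regular", "Bold", "Italic"],
    "strikethrough": [False, True],
}
font_defaults = {"font": "Arial", "style": "Regular", "strikethrough": False}

for case in default_test_suite(font_domain, font_defaults):
    print(case)
```

The suite grows only with the sum of the value counts (here six cases against eighteen exhaustive combinations), which is why it is easy to generate. No case ever sets two parameters to non-default values at once, so a bug that depends on, for example, Bold combined with strikethrough is never exercised.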
Research has shown that many bugs can be found by simply testing combinations of two parameters. This can significantly reduce the amount of resources needed to test, and thus pair-wise testing has been attempted in various ways. However, at present, such pair-wise testing, and testing of other multiples (e.g., triplets), is not a well-developed technology, and is often done manually, which is time-consuming and dependent on the skill of the tester. For example, not all pairs are valid, and the tester needs to recognize such constraints when testing.
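For illustration, one naive way to automate pair-wise generation is a greedy selection: enumerate every valid full combination, then repeatedly pick the combination that covers the most pairs not yet covered. The Python sketch below is an assumption of this discussion and not a description of any particular known generator; it enumerates the full domain and is therefore only suitable for small domains. The function names, the reduced font domain, and the constraint that superscript and subscript cannot both be set are hypothetical.

```python
from itertools import combinations, product


def pairs_of(case):
    """All pairs of (parameter, value) assignments covered by one test case."""
    return set(combinations(sorted(case.items()), 2))


def pairwise_suite(domain, is_valid=lambda case: True):
    """Greedily choose valid cases until every achievable pair is covered."""
    names = list(domain)
    candidates = [dict(zip(names, values)) for values in product(*domain.values())]
    candidates = [case for case in candidates if is_valid(case)]
    # Only pairs that occur in at least one valid case need to be covered.
    uncovered = set().union(*(pairs_of(case) for case in candidates))
    suite = []
    while uncovered:
        best = max(candidates, key=lambda case: len(pairs_of(case) & uncovered))
        suite.append(best)
        uncovered -= pairs_of(best)
    return suite


# Hypothetical reduced font domain with one constraint between parameters.
font_domain = {
    "style": ["Regular", "Bold", "Italic"],
    "size": [8, 12, 24],
    "superscript": [False, True],
    "subscript": [False, True],
}


def not_both_scripts(case):
    """Assumed constraint: superscript and subscript are mutually exclusive."""
    return not (case["superscript"] and case["subscript"])


suite = pairwise_suite(font_domain, is_valid=not_both_scripts)
print(len(suite), "pair-wise cases instead of 27 valid exhaustive cases")
```

Passing the constraint as a predicate keeps invalid pairs out of both the candidate cases and the set of pairs to be covered, which is the bookkeeping a tester otherwise has to do by hand.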
In addition to testing valid combinations, referred to as “positive testing,” it is often desirable for the tester to test with invalid values, to make sure the program handles such errors properly. Such “negative testing” causes problems when testing with combinations, however, because of the way in which programs are written, namely to take some failure action upon the first error detected. More particularly, a problem known as input masking can occur with negative testing, in which one invalid input prevents another invalid input from being tested.
For example, negative testing of a function that can only handle positive values with a combination of parameter values such as {a=3, b=−1, c=−2} will not completely test these parameters a, b and c (tested as pair-wise combinations) due to input masking, because the first error condition detected (e.g., −1) will end the test when applied, before the other error condition (−2) can be tested. Given an efficiently generated set of test cases, there may be no other test case that tests for c=−2, and in such an event, any program bug that corresponds to a failure to recognize an error when parameter c=−2 will not be found in such black-box testing.
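The masking effect can be demonstrated with a short Python sketch. The function under test, its name, and its deliberately missing check on parameter c are hypothetical and serve only to make the failure mode concrete.

```python
def box_volume(a, b, c):
    """Hypothetical function under test; it should reject non-positive inputs.

    It stops at the first error it detects, and it contains a deliberate
    bug: parameter c is never validated.
    """
    if a <= 0:
        raise ValueError("a must be positive")
    if b <= 0:
        raise ValueError("b must be positive")
    # Bug: the corresponding check for c is missing.
    return a * b * c


def rejects(case):
    """Return True if the function reports an error for this test case."""
    try:
        box_volume(**case)
    except ValueError:
        return True
    return False


masked = {"a": 3, "b": -1, "c": -2}   # two invalid values in one case
isolated = {"a": 3, "b": 1, "c": -2}  # only c is invalid

print(rejects(masked))    # True: the error for b hides the missing check on c
print(rejects(isolated))  # False: the missing check on c is exposed
```

The combined case appears to pass as a negative test because an error is reported, yet that error comes from b alone. Unless the suite also contains a case in which c=−2 is the only invalid value, the bug goes unnoticed.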
Known combinatorial test case generators do not properly handle negative testing. What is needed is a way to efficiently generate combinatorial test cases that supports negative testing in a manner that avoids input masking, while still properly handling positive test cases.