1. Field of the Invention
The present invention relates generally to computer device testing. More specifically, the present invention relates to a system for automated regression failure management.
2. Description of the Related Art
Creating computer chips involves a number of facets before the fabrication of the physical device itself. Typically, software code is written that identifies the various components of the integrated circuit. This software code will eventually be used to create the actual chip itself, but utilizing the software beforehand allows the designer to run various simulations to determine the proposed design's efficiency and effectiveness.
During these simulations, various inputs are applied and the resulting outputs are measured to ensure that they match what is expected for the corresponding inputs. Any discrepancy may reveal an error in the design, necessitating attention from the designer, who may fix the error, or even completely redesign the chip, in light of the simulation results.
These simulations are operated using a simulation environment in which various tests may be created, and then these tests can be applied to the proposed design. One common type of testing is known as regression testing. In regression testing, when an error is found and (believed to be) fixed, the modified program is partially rerun to ensure that no additional errors were introduced in the process of fixing the original problem(s). Modern regression software environments automate this process, allowing the test suite to automatically rerun all necessary regression tests.
The intent of the testing is to provide some reassurance that the product is behaving as expected, that modifications to a product for the purposes of fixing a defect have indeed fixed that defect, and that those modifications have had no unwanted effects on other functionality or behaviors of the product.
It is also common to introduce randomness into the test suite, to more accurately represent real-world behavior, which can often be unpredictable, and to prevent the preconceived notions of the person running the test suite from unduly limiting the types of test inputs provided. While this randomness is desirable, it makes it difficult to reproduce the particular stimulus that generated a behavioral failure in the product, since successive regressions will not necessarily stimulate the same behaviors or cover the same functionality in the product. This can be addressed by capturing the random seed of a failing test and then re-running that test at a later time using the captured seed, in an attempt to reproduce the exact stimulus that exposed the failing behavior.
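The seed capture and replay described above can be sketched as follows. This is a minimal illustration, not the method of any particular simulation environment: the names generate_stimulus, run_test, and toy_check, the stimulus layout, and the failure condition are all assumptions made for the example.

```python
import random

def generate_stimulus(seed, n=5):
    """Build a list of (address, command, data) tuples determined by the seed."""
    rng = random.Random(seed)  # private generator, unaffected by other code
    commands = ["READ", "WRITE"]
    return [
        (rng.randrange(0x100), rng.choice(commands), rng.randrange(1 << 16))
        for _ in range(n)
    ]

def run_test(seed, check):
    """Run one randomized test and report the seed with the first failure, if any."""
    for stim in generate_stimulus(seed):
        if not check(stim):
            return {"seed": seed, "passed": False, "failing_stimulus": stim}
    return {"seed": seed, "passed": True, "failing_stimulus": None}

def toy_check(stim):
    # Hypothetical stand-in for the design under test:
    # pretend that WRITE commands to odd addresses are broken.
    address, command, _data = stim
    return not (command == "WRITE" and address % 2 == 1)

# Regression run: draw fresh seeds and capture the ones that fail.
seed_source = random.Random()
failing_seeds = []
for _ in range(20):
    seed = seed_source.randrange(2**32)
    if not run_test(seed, toy_check)["passed"]:
        failing_seeds.append(seed)

# Later: replay each captured seed to regenerate the same stimulus.
replayed = [run_test(seed, toy_check) for seed in failing_seeds]
```

Because each test builds its own generator from the seed, replaying a captured seed regenerates the same stimulus sequence, so a fix can be checked against the exact inputs that originally failed.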
It is quite common for many thousands of test simulations to be performed on a single piece of software daily. It can also take many regression runs to adequately exercise all of the required behavioral functionality of the product. Any failure in any instance of a test requires a human to analyze the results of that test and investigate possible root causes of the failure.
Due to the large number of failures that can be detected over the course of these multiple regression runs, it can be too much for a product development team to efficiently investigate all of the errors in a reasonable amount of time. To aid with this, some abstractions can be made. These abstractions typically take the form of programmatically parsing the failure result and creating a defect signature. A single defect signature can then represent some larger number of individual failing test instances. Although this may not be precise, it enables the development team to efficiently investigate the most likely causes of the errors.
Each failure signature is entered into a defect tracking database and a status is kept on that signature as the development process (investigate, analyze, fix, resolve to closure) is followed.
Thus, for example, a test input may include various command lines including an address, a command, and data. This means that the test should run the command at the address using the data specified. It may be that the data is 16 bits, so there are 2^16 (65,536) possible data combinations that are tested. However, in many cases, if the command is going to produce an error, it will do so regardless of the exact data provided. As such, a single failure signature identifying the command and the address is maintained, rather than maintaining a signature for each of the 2^16 possible data combinations. This signature is generated by parsing a log file containing the results of the tests, and pulling out these signatures into a defect database.
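The signature extraction in this example might look like the sketch below. The log line format (`FAIL addr=... cmd=... data=...`), the names signature_of and collect_signatures, and the sample log are hypothetical and chosen only for illustration; a real log file would need its own parsing rules.

```python
import re
from collections import Counter

# Assumed log format: "FAIL addr=0x10 cmd=WRITE data=0x00ff"
FAIL_LINE = re.compile(
    r"FAIL\s+addr=(?P<addr>0x[0-9a-fA-F]+)"
    r"\s+cmd=(?P<cmd>\w+)"
    r"\s+data=(?P<data>0x[0-9a-fA-F]+)"
)

def signature_of(line):
    """Return (command, address) for a failing line, or None for any other line."""
    match = FAIL_LINE.search(line)
    if match is None:
        return None
    # The data field is deliberately dropped: failures that differ
    # only in data collapse into the same signature.
    return (match.group("cmd"), int(match.group("addr"), 16))

def collect_signatures(log_lines):
    """Count how many failing test instances each signature represents."""
    counts = Counter()
    for line in log_lines:
        sig = signature_of(line)
        if sig is not None:
            counts[sig] += 1
    return counts

sample_log = [
    "PASS addr=0x10 cmd=READ data=0x0001",
    "FAIL addr=0x10 cmd=WRITE data=0x00ff",
    "FAIL addr=0x10 cmd=WRITE data=0x1234",
    "FAIL addr=0x20 cmd=READ data=0xabcd",
]
signatures = collect_signatures(sample_log)
```

Here the two WRITE failures at address 0x10 fall under one signature even though their data differ, which is the abstraction the example describes. The count per signature records how many individual failing instances each entry in the defect database stands for.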
These methods, however, present a number of issues. There are too many test instance failures to investigate and analyze efficiently, necessitating abstraction of multiple failing test instances into a single defect signature, which loses potentially valuable test information. Additionally, because the failure data is abstracted, it is difficult to determine whether any one particular fix for a single signature is actually applicable to all of the failures abstracted into that signature. Finally, because of the randomness of a regression test suite, the stimulus that reveals any particular failure may never be reproduced.