Two common measures of how thoroughly computer code is exercised during the testing process are line coverage and decision coverage. Line coverage counts the lines of code that were executed and compares that count against the total number of lines. Decision coverage counts the decisions taken during execution at decision points, such as “if” statements. However, even this enhanced measure may not sufficiently cover the entire code and all scenarios. The lines executed and the decisions taken together form an execution path. One decision point with two possible decisions yields two execution paths. Two independent decision points with two possible decisions each, however, yield up to four possible execution paths, for example, (decision1.1, decision2.1), (decision1.1, decision2.2), (decision1.2, decision2.1) and (decision1.2, decision2.2).
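The four combinations listed above can be reproduced mechanically. The following is a minimal sketch in Python; the outcome labels come from the text, while the names `decision_points` and `enumerate_paths` are hypothetical and introduced only for illustration.

```python
from itertools import product

# Outcome labels taken from the text; one tuple per independent decision point.
decision_points = [
    ("decision1.1", "decision1.2"),
    ("decision2.1", "decision2.2"),
]

def enumerate_paths(points):
    """Return every combination of one outcome per decision point."""
    return list(product(*points))

paths = enumerate_paths(decision_points)
for path in paths:
    print(path)
```

Adding a third two-way decision point to `decision_points` would double the number of combinations again.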
It is not uncommon for a defect to reveal itself on only one execution path, while the rest of the execution paths remain correct. Executing each and every path in a testing environment is virtually impossible, since the execution paths depend not only on user-driven actions but also on the environment, external conditions (e.g., the status of a network connection), network traffic, the state of a database, memory size, and the like. Emulating all such conditions is extremely time-consuming and resource-intensive.
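A small illustration of a defect confined to a single path may help. The function below is a hypothetical example, not code from any real system: two tests execute every line and take every decision both ways, yet the one path they skip is the only one that fails.

```python
def compute(take_first: bool, take_second: bool) -> float:
    """Hypothetical function with two independent decision points."""
    divisor = 2
    if take_first:       # decision point 1
        divisor -= 1
    if take_second:      # decision point 2
        divisor -= 1
    return 10 / divisor  # divides by zero only when both branches were taken

# These two calls together execute every line and take each decision both ways.
assert compute(True, False) == 10.0
assert compute(False, True) == 10.0

# The (True, True) path was not exercised above and is the only failing one.
try:
    compute(True, True)
    defect_found = False
except ZeroDivisionError:
    defect_found = True

print(defect_found)
```

Full line and decision coverage is therefore reached without the failing path ever being executed.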
Therefore, testing is sometimes extended with static analysis of extracted execution paths. Execution paths are extracted during the parsing process. Then, Control Flow Graphs (CFGs) and/or Data Flow Graphs (DFGs) are created. A code analyzer then attempts to follow the created CFGs and DFGs and detect defects. Reducing a problem to isolated, independent execution paths makes such analysis feasible. The problem with this approach is that the number of paths in a computer program is on the order of 2^N, where N is the number of decision points, which in turn is proportional to the number of lines of code. The analysis therefore becomes computationally intensive to the point of being impractical for large systems. There is a need to reduce the complexity of the analysis to a size on the order of N, to make the analysis practical for commercial systems in which N can be as large as 1,000,000.
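The gap between 2^N and N can be made concrete with a few lines of arithmetic. This sketch assumes the worst case of N independent two-way decision points; the helper names `path_count` and `decimal_digits` are hypothetical.

```python
import math

def path_count(n: int) -> int:
    """Worst-case number of paths through n independent two-way decisions."""
    return 2 ** n

def decimal_digits(n: int) -> int:
    """Number of decimal digits in 2**n, computed without building the number."""
    return math.floor(n * math.log10(2)) + 1

for n in (1, 2, 10, 20, 64):
    print(n, path_count(n))

# For N = 1,000,000 only the size of the count is reported.
print(decimal_digits(1_000_000))
```

An analysis whose cost grows with N handles one million decision points as one million units of work, whereas the path count for the same N is a number hundreds of thousands of digits long.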
Another approach to the problem described above is to limit the analyzed path length to some arbitrary number of steps. This approach, however, may rule out paths that contain serious problems but take more steps than the imposed limit.
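The risk of a step limit can be sketched on a toy control flow graph. Everything here is an assumption made for illustration: the graph `cfg`, its node names, and the helper `paths_within` are hypothetical, and the graph is acyclic to keep the search simple.

```python
# Hypothetical acyclic control flow graph: node -> list of successor nodes.
cfg = {
    "entry": ["check"],
    "check": ["short_ok", "long_1"],
    "short_ok": ["exit"],
    "long_1": ["long_2"],
    "long_2": ["defect"],
    "defect": ["exit"],
    "exit": [],
}

def paths_within(graph, start, goal, max_steps):
    """Depth-first search for start-to-goal paths using at most max_steps edges."""
    found = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            found.append(path)
            continue
        if len(path) - 1 >= max_steps:
            continue  # limit reached: this path is abandoned unanalyzed
        for successor in graph[node]:
            stack.append((successor, path + [successor]))
    return found

limited = paths_within(cfg, "entry", "exit", max_steps=3)
full = paths_within(cfg, "entry", "exit", max_steps=5)
print(any("defect" in path for path in limited))
print(any("defect" in path for path in full))
```

With a limit of three steps only the short path is analyzed, and the five-step path through the `defect` node is never examined.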
Another drawback of existing methods for automatically generating unit test cases is that the generated cases very seldom represent a real program execution situation or environment. Variables are not independent: on a given execution path, a given variable cannot take an arbitrary value. For example, for the statement “if (b<0) {b=a+1}”, the variable b cannot take just any value after the statement is executed. On one path, b cannot be less than 0; on the other path (where b was less than 0), b becomes related to the value of variable a. Additionally, generating test cases only from recorded variables will not give good code coverage.
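The constraint in the example statement can be checked by brute force. The sketch below renders “if (b<0) {b=a+1}” in Python and assumes integer variables; the names `after_statement` and `reachable_after` are hypothetical.

```python
def after_statement(a: int, b: int) -> tuple[int, int]:
    """Python rendering of the statement: if (b<0) {b=a+1}"""
    if b < 0:
        b = a + 1
    return a, b

def reachable_after(a: int, b: int) -> bool:
    """True if (a, b) can hold after the statement on either path."""
    # Path where the branch is skipped: b was already >= 0 and is unchanged.
    # Path where the branch is taken: b has been set to a + 1.
    return b >= 0 or b == a + 1

# Every state actually produced over a small input range satisfies the constraint.
values = range(-5, 6)
reached = {after_statement(a, b) for a in values for b in values}
print(all(reachable_after(a, b) for a, b in reached))

# A test case that assigns a and b independently can describe an impossible state.
print(reachable_after(5, -3))
print(reachable_after(-4, -3))
```

A generated test case that sets a = 5 and b = -3 after the statement describes a state no execution can produce, whereas a = -4 and b = -3 is consistent with the path on which the branch was taken.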
However, not every path in a computer application that is possible from the code construction point of view is actually executed in the real-life operating environment of the application. Therefore, the number of paths to analyze could be reduced significantly if there were a way to determine which paths can actually be executed and which cannot. Such a determination can be expressed as an execution probability.
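The text does not say how an execution probability would be obtained. One possible reading, offered only as an assumption, is to estimate it from the relative frequency of paths seen in recorded runs. The data in `observed_runs` and the helper `estimate_probabilities` are hypothetical.

```python
from collections import Counter
from itertools import product

# All structurally possible paths for two independent two-way decision points.
candidate_paths = list(product(
    ("decision1.1", "decision1.2"),
    ("decision2.1", "decision2.2"),
))

# Hypothetical recorded runs: each entry is the path one execution followed.
observed_runs = [
    ("decision1.1", "decision2.1"),
    ("decision1.1", "decision2.1"),
    ("decision1.1", "decision2.2"),
    ("decision1.2", "decision2.1"),
]

def estimate_probabilities(runs, paths):
    """Relative frequency of each candidate path among the recorded runs."""
    counts = Counter(runs)
    total = len(runs)
    return {path: counts[path] / total for path in paths}

probabilities = estimate_probabilities(observed_runs, candidate_paths)
for path, probability in probabilities.items():
    print(path, probability)
```

A frequency of zero in such a sample only shows that a path was not observed, not that it can never be executed, so a real analysis would need more than this estimate to discard a path.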