A program's instructions are generally of two types: branch instructions and non-branch instructions. A branch instruction determines the next instruction to be executed, based on the value of some program variable. Such branch instructions are typically "expensive" to execute in terms of processing time. It is known that the run-time performance of a program can be improved if the number of branch instructions is reduced. Further, if it can be established that a later-occurring branch instruction is redundant to an earlier-occurring branch instruction, then the Boolean result (true or false) of the test associated with the earlier branch instruction can be used to replace the later-occurring branch instruction and to effect a joinder with subsequent program statements.
Most redundancy elimination algorithms are of two kinds. A lexical algorithm deals with the entire program but can only detect redundancies among computations of lexically identical expressions. An expression is lexically identical to a previous expression if both expressions apply exactly the same operator to exactly the same operands. A value-numbering algorithm, on the other hand, can recognize redundancies among expressions that are lexically different, but that are certain to compute the same value. Value numbering is accomplished by assigning special symbolic names, called value numbers, to expressions. If value numbers assigned to operands of two expressions are identical and if the operators applied by the expressions are identical, then the expressions are certain to have the same values.
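The distinction between lexical identity and value numbering can be illustrated with a minimal sketch, restricted to a single straight-line sequence of computations. The instruction format `(dest, op, operand1, operand2)` and the names `value_number`, `var_vn` and `expr_vn` are assumptions made for illustration only and do not come from any cited algorithm.

```python
# Hypothetical sketch of value numbering over a straight-line sequence.
# Each instruction is (dest, op, operand1, operand2); all names are illustrative.

def value_number(instructions):
    var_vn = {}      # variable name -> value number
    expr_vn = {}     # (operator, value number, value number) -> value number
    next_vn = 0
    redundant = []   # destinations whose expression was already computed

    def vn_of(operand):
        # An operand seen for the first time receives a fresh value number.
        nonlocal next_vn
        if operand not in var_vn:
            var_vn[operand] = next_vn
            next_vn += 1
        return var_vn[operand]

    for dest, op, a, b in instructions:
        # Same operator applied to operands with the same value numbers
        # means the expressions are certain to compute the same value.
        key = (op, vn_of(a), vn_of(b))
        if key in expr_vn:
            redundant.append(dest)
        else:
            expr_vn[key] = next_vn
            next_vn += 1
        var_vn[dest] = expr_vn[key]
    return var_vn, redundant


program = [
    ("a", "+", "x", "y"),
    ("p", "+", "x", "y"),   # lexically identical to the computation of a
    ("u", "*", "a", "z"),
    ("v", "*", "p", "z"),   # lexically different (p versus a), same value
]
var_vn, redundant = value_number(program)
```

In this sketch a purely lexical comparison would flag only the computation of `p`, whereas the value-number comparison also flags `v`, because `a` and `p` carry the same value number.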
In the past, value numbering algorithms have usually been restricted to basic blocks (sequences of computations with no branching) or extended basic blocks (sequences of computations with no joins).
Rosen et al., in "Global Value Numbers and Redundant Computations", Proceedings of the 15th Annual ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, San Diego, Calif., January 1988, pp. 12-27, describe a global value numbering algorithm that is employed to associate a symbolic value with every variable definition in a program. The Rosen et al. algorithm guarantees that if two variable definitions have the same symbolic value at compile time, then they have the same actual value at run time. The converse, however, is not true. That is, if the actual values are the same at run time, it is not necessarily true that the symbolic values are the same at compile time.
The Rosen et al. algorithm associates a defining point (DEFP) with each variable definition (DEF). A DEFP is a point in a program listing where, for the first time, a value is calculated in the code sequence. The value may be calculated with respect to any variable. A DEF is an assignment of a value to a variable. The relationship between a DEF and its DEFP is specified as follows:
(i) The symbolic value associated with a DEF is equal to the symbolic value associated with the DEFP.
(ii) The basic block containing the DEF is dominated by the basic block containing the DEFP. That is, the basic block containing DEF is executed only if the basic block containing DEFP is executed.
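The dominance condition in (ii) can be made concrete with a minimal sketch that computes, for each basic block of a small control flow graph, the set of blocks that dominate it. The graph, the block names `B0` to `B3`, and the function `dominators` are hypothetical; the simple iterative method shown is a standard textbook technique and is not asserted to be the one used by Rosen et al.

```python
# Hypothetical sketch: iterative computation of dominator sets.
# cfg maps every basic block to the list of its successor blocks.

def dominators(cfg, entry):
    blocks = set(cfg)
    preds = {b: set() for b in blocks}
    for b, succs in cfg.items():
        for s in succs:
            preds[s].add(b)

    # Start from the most optimistic assumption and shrink to a fixed point.
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks - {entry}:
            incoming = [dom[p] for p in preds[b]]
            common = set.intersection(*incoming) if incoming else set()
            new = {b} | common
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


# Diamond-shaped graph: B0 branches to B1 or B2, and both join at B3.
cfg = {"B0": ["B1", "B2"], "B1": ["B3"], "B2": ["B3"], "B3": []}
dom = dominators(cfg, "B0")
```

Under these assumptions, a DEF located in `B3` could have its DEFP in `B0`, which is executed on every path to `B3`, but not in `B1` or `B2`, since `B3` can be reached without executing either of them.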
Accordingly, it is known that if a value numbering procedure assigns two values in a program to the same partition and associates an identical symbol with them, then the two values are identical. It is further to be understood that the Rosen et al. procedure is conservative: it does not guarantee that every value which may be found to be redundant in a program listing will be partitioned so as to be associated with an identical symbolic value, but only that all values which are partitioned and associated with an identical symbol are identical.
Assuming that a redundant branch statement can be identified, the statement can be removed and replaced with an unconditional branch statement or joined with succeeding program statements. An example will serve to illustrate a redundant branch elimination (RBE) procedure. Consider the following program, which includes two conditional branch statements (CBSs) and is provided as input to a compiler: ##EQU1## The statement "if (x==c)" can be read as follows: if x compared to c indicates an equality, then the result of the test (which forms part of the branch) is a Boolean "true"; otherwise the result is a Boolean "false". Further, the remaining statement within the brackets { . . . } is executed only if the Boolean value is true. Thus, provided that the statement S does not alter the value of variable x, the test x==c at branch2 is true if and only if the test x==c at branch1 is true. It follows that in any execution of the program, statement T will be executed if and only if statement S is executed. Therefore, the compiler can transform the program into: ##EQU2## Thus, the redundant branch (branch2) in the original program has been eliminated.
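The transformation described above can be sketched as follows. This is an illustrative reconstruction based only on the prose description, not a reproduction of the program listings: the function names `original` and `transformed` are hypothetical, and statements S and T are modeled as appends to a list so that their execution can be observed. Statement S is assumed not to alter x.

```python
# Hypothetical sketch of redundant branch elimination (RBE).
# S and T are stand-ins that record their execution in `trace`.

def original(x, c, trace):
    if x == c:                # branch1
        trace.append("S")     # statement S (assumed not to modify x)
    if x == c:                # branch2: same test on an unchanged x
        trace.append("T")     # statement T
    return trace


def transformed(x, c, trace):
    if x == c:                # the single remaining branch
        trace.append("S")
        trace.append("T")     # T joined with S; branch2 eliminated
    return trace


# Both versions execute S and T under exactly the same conditions.
results = [(original(x, 1, []), transformed(x, 1, [])) for x in range(3)]
```

The sketch executes one test instead of two whenever the function is called, while statement T still runs if and only if statement S runs.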
A problem in the prior art has been that identification of redundant CBSs has been difficult and time consuming. Accordingly, there is a need for a more accurate method and apparatus for both identification and removal of redundant branch statements from program listings.