A difficulty for vectorizing compilers is reducing the total number of scalar instructions needed to produce the required results in a given loop. This problem is often treated as two separate problems: reducing addressing calculations and reducing scalar instructions. Typical vector compilers, however, do not make this distinction.
A classic example of code that benefits greatly from such a reduction is a stencil code, that is, code that uses several elements located at constant offsets from a central element. For example, elements A[i][j][k], A[i][j][k+1], A[i][j][k−1], A[i][j+1][k], A[i][j−1][k], A[i][j+1][k+1], . . . are all used in the same loop.
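To make the addressing cost concrete, the following is a minimal sketch in C of a stencil loop over the elements named above. It is an illustration only, not the method of any particular compiler. The extents NI, NJ, and NK, the helper idx, and both function names are hypothetical, and the array is assumed to be stored flattened in row-major order. The first version recomputes a full address for every reference; the second computes the central index once and reaches each neighbor through a constant offset.

```c
#include <stddef.h>

enum { NI = 3, NJ = 4, NK = 5 };   /* hypothetical array extents */

/* Flattened row-major index of element [i][j][k]. */
static size_t idx(size_t i, size_t j, size_t k)
{
    return (i * NJ + j) * NK + k;
}

/* As written in the source: every reference carries its own
   full address calculation (two multiplies, two adds). */
void stencil_naive(const double *A, double *B)
{
    for (size_t i = 0; i < NI; i++)
        for (size_t j = 1; j + 1 < NJ; j++)
            for (size_t k = 1; k + 1 < NK; k++)
                B[idx(i, j, k)] = A[idx(i, j, k)]
                                + A[idx(i, j, k + 1)] + A[idx(i, j, k - 1)]
                                + A[idx(i, j + 1, k)] + A[idx(i, j - 1, k)]
                                + A[idx(i, j + 1, k + 1)];
}

/* Same result: the central index is computed once per iteration and
   each neighbor is a compile-time-constant offset from it. */
void stencil_shared(const double *A, double *B)
{
    for (size_t i = 0; i < NI; i++)
        for (size_t j = 1; j + 1 < NJ; j++)
            for (size_t k = 1; k + 1 < NK; k++) {
                size_t c = idx(i, j, k);          /* A[i][j][k] */
                B[c] = A[c]
                     + A[c + 1] + A[c - 1]        /* k+1, k-1 */
                     + A[c + NK] + A[c - NK]      /* j+1, j-1 */
                     + A[c + NK + 1];             /* j+1, k+1 */
            }
}
```

In the second version the six references share one address calculation, which is the kind of saving that matters when a loop touches many neighbors of the same element.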
Historically, the scalar instruction count in a loop has been reduced by hoisting loop invariants and identifying common subexpressions. Hoisting invariants is a well-understood, mature technology. Identifying common subexpressions, however, is an expensive process that many compilers fail to do well because of the computational complexity of the problem.
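The two traditional techniques can be illustrated with a small hand-transformed loop. This is a sketch under stated assumptions, not the output of any specific compiler: the function names and the expression itself are hypothetical, and y and x are assumed not to overlap. In the first version, a * b does not depend on the loop index, and x[i] * x[i] is computed twice per iteration.

```c
#include <stddef.h>

/* As written: a * b is loop-invariant, and x[i] * x[i] appears twice. */
void scale_naive(double *y, const double *x, size_t n, double a, double b)
{
    for (size_t i = 0; i < n; i++)
        y[i] = (a * b) * (x[i] * x[i]) + (x[i] * x[i]);
}

/* After hand-applied invariant hoisting and common-subexpression
   elimination; the order of floating-point operations is unchanged. */
void scale_optimized(double *y, const double *x, size_t n, double a, double b)
{
    const double ab = a * b;              /* hoisted: computed once */
    for (size_t i = 0; i < n; i++) {
        const double sq = x[i] * x[i];    /* shared: once per iteration */
        y[i] = ab * sq + sq;
    }
}
```

The hoisted product is easy to find because it needs only a check that a and b are not modified in the loop. Recognizing that two larger expressions are equivalent is the harder part, and it grows more costly as the expressions grow.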
What is needed is a compiler system and method for reducing the number of scalar instructions needed to produce the required results in a given loop.