1. Field of the Invention
The present invention relates generally to execution sequence dependency analysis and, more particularly, to cross-iteration dependency and loop parallelization techniques that facilitate preparation and/or optimization of program code.
2. Description of the Related Art
To exploit parallel execution facilities provided by multiprocessor and multi-core computers, some types of program code demand effective loop transformation and/or automatic parallelization techniques. In general, dependence analysis forms a basis for automatic parallelization and for some loop transformations.
Conventional dependence analysis techniques can often be employed for simple loop transformations and automatic parallelization. For example, consider the following simple loop:
do i=1, 100, 3
    A(i+8)=A(i)+1
end do
In order to parallelize the loop, it is necessary to verify that the write to A(i+8) in one iteration carries no cross-iteration dependence with respect to either the write to A(i+8) or the read of A(i) in any other iteration.
A variety of conventional techniques have been developed for loops, such as the simple loop above, where array subscripts are linear functions of the enclosing loop indices. Often these techniques, such as a GCD test, Banerjee test, or Fourier-Motzkin test, may be successfully employed to determine whether two array references, e.g., A(i+8) in one iteration and A(i) in another, reference the same array location. For example, these techniques are able to determine that the simple loop illustrated above is a DOALL loop, which can be parallelized: because the loop index advances in steps of 3, no two iterations have index values that differ by 8. However, many loops contain complex subscripts that are beyond the capabilities of the conventional techniques. One such complexity is presented when an array subscript is a non-linear function of an enclosing loop index.
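As an illustration of how a GCD test reaches this conclusion for the simple loop, the following minimal sketch may be helpful. It assumes a loop index of the form i = lo + step*k, a write subscript a1*i + c1, and a read subscript a2*i + c2. The function name and parameter names are hypothetical and chosen only for this example. The test ignores the loop bounds, so a result of True means only that a dependence may exist.

```python
from math import gcd


def gcd_test_may_depend(a1, c1, a2, c2, lo, step):
    """Hypothetical helper: GCD test for a write A(a1*i + c1) and a read A(a2*i + c2).

    With i = lo + step*k, the two references touch the same location when
        a1*step*k1 - a2*step*k2 = (a2 - a1)*lo + (c2 - c1)
    has an integer solution (k1, k2). Such a solution exists only if
    gcd(a1*step, a2*step) divides the right-hand side.

    Returns False when independence is proven, True when dependence may exist.
    Loop bounds are ignored, so True is conservative.
    """
    g = gcd(a1 * step, a2 * step)
    rhs = (a2 - a1) * lo + (c2 - c1)
    if g == 0:
        # Both coefficients vanish: the subscripts are constants.
        return rhs == 0
    return rhs % g == 0


# Simple loop from the text: do i=1, 100, 3 / A(i+8)=A(i)+1
# Equation: 3*k1 - 3*k2 = -8, and 3 does not divide 8.
print(gcd_test_may_depend(a1=1, c1=8, a2=1, c2=0, lo=1, step=3))  # False

# Same loop with an offset of 9 instead of 8: 3 divides 9, so dependence may exist.
print(gcd_test_may_depend(a1=1, c1=9, a2=1, c2=0, lo=1, step=3))  # True
```

The sketch covers only the write/read pair. The write/write pair A(i+8) versus A(i+8) needs no test here, since distinct values of i always yield distinct values of i+8.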
Because the conventional techniques target subscripts that are linear functions of loop indices, they are not able to compute dependence exactly for the following example, in which the subscript is computed using a division:
do i=1, 100, 3
    j=5*i/4
    A(j+9)=A(j)+1
end do
In order to parallelize this second, more complex loop, it is necessary to verify that A(j+9) does not carry cross-iteration dependence with respect to A(j+9) or A(j). Conventional techniques assume a worst-case dependence between A(j+9) and A(j) and therefore will typically be unable to parallelize the illustrated loop.
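That the worst-case assumption is overly conservative for this particular loop can be checked by exhaustive enumeration, as in the following minimal sketch. The sketch is a brute-force check, feasible here only because the loop bounds are small constants; it is not a general dependence analysis technique. It assumes that j=5*i/4 denotes integer division, which for the positive values involved matches Python's floor division. The function names and the offset parameter are hypothetical and chosen only for this example.

```python
def subscript_j(i):
    """Non-linear subscript from the text, j = 5*i/4, assuming integer division."""
    return 5 * i // 4


def find_cross_iteration_conflicts(lo=1, hi=100, step=3, offset=9):
    """Hypothetical helper: enumerate all iterations and report cross-iteration conflicts.

    Models the loop body A(j+offset) = A(j) + 1 with j = subscript_j(i).
    Returns a list of (kind, writing_iteration, other_iteration) tuples.
    """
    iterations = range(lo, hi + 1, step)
    writer_of = {}  # array location -> iteration that writes it
    conflicts = []

    # Write/write conflicts: two iterations storing to the same location.
    for i in iterations:
        loc = subscript_j(i) + offset
        if loc in writer_of:
            conflicts.append(("output", writer_of[loc], i))
        writer_of[loc] = i

    # Write/read conflicts: one iteration reads a location another iteration writes.
    for i in iterations:
        loc = subscript_j(i)
        if loc in writer_of and writer_of[loc] != i:
            conflicts.append(("flow/anti", writer_of[loc], i))

    return conflicts


# Loop from the text: no two distinct iterations touch the same location.
print(find_cross_iteration_conflicts())  # []

# Changing the offset from 9 to 8 introduces real conflicts, for example
# iteration i=7 writes A(16) and iteration i=13 reads A(16).
print(("flow/anti", 7, 13) in find_cross_iteration_conflicts(offset=8))  # True
```

The enumeration shows that the values of j produced by the loop never differ by exactly 9, so the illustrated loop has no cross-iteration dependence even though a linear-subscript test cannot establish this.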
In general, techniques are desired that would allow accurate dependency analysis to be performed even for loops in which references (e.g., array subscripts) are non-linear functions of an enclosing loop index. In particular, techniques are desired for loops in which the non-linear functions of an enclosing loop index include division operations.