Many processors today employ multiple functional units to execute instructions in parallel. Executing instructions in parallel increases the performance of the processor. To achieve this, instructions are generally scheduled to take advantage of instruction-level parallelism (ILP). ILP is a term used to describe the extent to which instructions are not dependent on one another, such that they can be executed in parallel without sacrificing the correctness of the program. A discussion of ILP can be found in David A. Patterson & John L. Hennessy, “Computer Architecture: A Quantitative Approach,” 220–370 (Morgan Kaufmann Publishers, 2d ed. 1996).
One mechanism to take advantage of parallelism available among instructions is to exploit parallelism among iterations of a loop. Depending on the relationship between instructions in the loop, instances of instructions from different iterations of the loop can be executed simultaneously without violating dependencies. “Software pipelining” is a term used to describe the concept of simultaneously issuing instructions from multiple iterations of a loop.
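The overlap of loop iterations described above can be illustrated with a minimal sketch. It assumes a loop body split into one-cycle stages and a new iteration starting every cycle, which is possible only when no dependency between iterations forbids the overlap. The function name `software_pipeline_schedule` and the stage names are hypothetical and chosen for illustration only.

```python
def software_pipeline_schedule(num_iterations, stages):
    """Return, for each cycle, the (stage, iteration) pairs issued together.

    Assumptions (illustrative only): every stage takes one cycle, and a new
    loop iteration is started on every cycle.
    """
    depth = len(stages)
    total_cycles = num_iterations + depth - 1
    schedule = []
    for cycle in range(total_cycles):
        issued = []
        for stage_index, stage in enumerate(stages):
            # Stage k of iteration i is issued in cycle i + k.
            iteration = cycle - stage_index
            if 0 <= iteration < num_iterations:
                issued.append((stage, iteration))
        schedule.append(issued)
    return schedule


if __name__ == "__main__":
    stages = ["load", "add", "store"]  # hypothetical loop body
    for cycle, issued in enumerate(software_pipeline_schedule(4, stages)):
        print(cycle, issued)
```

Under these assumptions, four iterations of a three-stage loop body occupy six cycles rather than the twelve needed when the iterations run strictly one after another. In the middle cycles, instructions from three different iterations are issued simultaneously.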
“Register assignment,” also referred to as “register renaming,” is a process used to map virtual (or logical) registers to physical registers. Register renaming can itself create data dependencies in software pipelined loops. Because common registers are used from one loop iteration to the next, data dependencies generally exist between the loop iterations. Software pipelining techniques must account for register-assignment-created data dependencies so as not to violate them.
Some processor designs help to alleviate the problem of dependencies created from register renaming by using “register rotation.” Register rotation is a mechanism that reassigns registers for each iteration of the software pipelined loop so that dependencies caused solely by register renaming are removed. Between each iteration of the software pipelined loop, register assignments are offset by one, so that each iteration uses different physical registers. Register rotation is described in: Intel IA-64 Architecture Software Developer's Manual, Volume 1: IA-64 Application Architecture (Rev. 1.1 July 2000), <ftp://download.intel.com/design/IA-64/Downloads/24531702s.pdf>.
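The offset-by-one reassignment described above can be sketched as a simple modular mapping. This is an illustrative model only, not the mechanism of any particular processor: the function names, the size of the rotating register file, and the direction of rotation are all assumptions, and real hardware may rotate in the opposite direction.

```python
def physical_register(virtual, iteration, num_rotating):
    """Map a virtual register to a physical register for one loop iteration.

    Hypothetical model: the mapping shifts by one slot per iteration and wraps
    around a rotating file of num_rotating physical registers.
    """
    return (virtual + iteration) % num_rotating


def rotation_table(virtual_registers, num_iterations, num_rotating):
    """Return one {virtual: physical} mapping per loop iteration."""
    return [
        {v: physical_register(v, i, num_rotating) for v in virtual_registers}
        for i in range(num_iterations)
    ]


if __name__ == "__main__":
    # Two virtual registers, three iterations, eight rotating registers
    # (all values are arbitrary and chosen for illustration).
    for iteration, mapping in enumerate(rotation_table([0, 1], 3, 8)):
        print(iteration, mapping)
```

In this model, the same virtual register name refers to a different physical register in each successive iteration, so overlapping iterations do not overwrite each other's values merely because they share register names. A value written through virtual register 1 in one iteration is reachable through virtual register 0 in the next, which is how a value can still be passed between iterations when that is intended.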
Processor vendors are constantly striving to find more mechanisms to exploit instruction-level parallelism. As the number of functional units in processors continues to increase, issuing more instructions to the various functional units simultaneously will become even more important than it is today.
For the reasons stated above, and for other reasons stated below which will become apparent to those skilled in the art upon reading and understanding the present specification, there is a need in the art for alternate methods and apparatus to perform or accelerate software pipelining.