1. Technical Field
The present invention relates generally to an improved data processing system, and in particular to a method and apparatus for processing data. Still more particularly, the present invention relates to a method, apparatus, and computer instructions for compiling code.
2. Description of Related Art
Software pipelining is a compiler optimization technique for reordering hardware instructions within a given loop of a computer program being compiled, so as to minimize the number of cycles required to execute each iteration of the loop. More specifically, software pipelining attempts to optimize the scheduling of such hardware instructions by overlapping the execution of instructions from multiple iterations of the loop.
For the purposes of the present discussion, it may be helpful to introduce some commonly used terms in software pipelining. As is well known in the art, individual machine instructions in a computer program may be represented as “nodes” having assigned node numbers, and the dependencies and latencies between the various instructions may be represented as “edges” between nodes in a data dependency graph (“DDG”). A grouping of related instructions, as represented by a grouping of interconnected nodes in a data dependency graph, is commonly known as a “sub-graph”. If the nodes of one sub-graph have no dependencies on nodes of another sub-graph, these two sub-graphs may be said to be “independent” of each other.
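By way of illustration only, the terms above may be sketched in Python as follows. The function name independent_subgraphs and the (source, destination, latency) edge tuple are hypothetical choices made for this sketch and are not taken from any particular compiler. The sketch treats two nodes as belonging to the same sub-graph whenever a dependency edge connects them in either direction, so that the resulting groups have no dependencies on one another.

```python
from collections import defaultdict


def independent_subgraphs(num_nodes, edges):
    """Group DDG nodes into mutually independent sub-graphs.

    num_nodes: nodes are numbered 0 .. num_nodes - 1 (assumed numbering).
    edges: iterable of (src, dst, latency) tuples (assumed representation).
    Edge direction is ignored, because a dependency in either direction
    places both nodes in the same sub-graph.
    """
    neighbors = defaultdict(set)
    for src, dst, _latency in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    seen = set()
    groups = []
    for start in range(num_nodes):
        if start in seen:
            continue
        # Depth-first walk collecting every node reachable from `start`.
        stack = [start]
        seen.add(start)
        group = []
        while stack:
            node = stack.pop()
            group.append(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(group))
    return groups


# Two three-instruction chains with no edges between them.
example_edges = [(0, 1, 2), (1, 2, 1), (3, 4, 2), (4, 5, 1)]
example_groups = independent_subgraphs(6, example_edges)
```

In this example the six nodes fall into two sub-graphs, nodes 0 to 2 and nodes 3 to 5, which are independent of each other because no edge joins them.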
Software pipelining techniques may be used to attempt to optimally schedule the nodes of the sub-graphs found in a data dependency graph. A well-known technique for performing software pipelining is “modulo scheduling”. Based on certain calculations, modulo scheduling selects a likely minimum number of cycles in which the loops of a computer program will execute, usually called the initiation interval (“II”), and attempts to place all of the instructions into a schedule of that size. Using this technique, instructions are placed in a schedule whose length in cycles is equal to the initiation interval. If, while scheduling, some instructions do not fit within the cycles of the initiation interval, then these instructions are wrapped around the end of the schedule into the next iteration, or iterations, of the schedule. If an instruction is wrapped into a successive iteration, the instruction executes and consumes machine resources as though it were placed in the cycle equal to the placed cycle modulo the initiation interval (i.e., placed cycle % initiation interval).
Thus, for example, if an instruction is placed in cycle “10”, and the initiation interval is 7, then the instruction would execute and consume resources at cycle “3” in another iteration of the scheduled loop. When some instructions of a loop are placed in successive iterations of the schedule, the result is a schedule that overlaps the execution of instructions from multiple iterations of the original loop. If the scheduling fails to place all of the instructions for a given initiation interval, the modulo scheduling technique iteratively increases the initiation interval of the schedule and tries to complete the schedule again. This is repeated until the scheduling is completed.
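The wrap-around arithmetic and the iterative increase of the initiation interval described above may be sketched in Python as follows. This is a deliberately simplified illustration and not a complete modulo scheduler: it assumes each instruction arrives with a precomputed earliest cycle in place of real dependency handling, and it models machine resources as a single count of instructions that may issue per cycle. All names (modulo_slot, try_schedule, modulo_schedule, units_per_cycle, max_ii) are hypothetical.

```python
def modulo_slot(cycle, ii):
    """Cycle at which a placed instruction consumes resources within the schedule."""
    return cycle % ii


def try_schedule(instrs, ii, units_per_cycle):
    """Attempt to place every instruction for one initiation interval.

    instrs: list of (name, earliest_cycle) pairs (assumed input form).
    Returns a dict mapping name to placed cycle, or None on failure.
    """
    usage = [0] * ii  # issue slots already consumed in each modulo cycle
    placed = {}
    for name, earliest in instrs:
        # Trying ii consecutive cycles covers every modulo slot exactly once.
        for cycle in range(earliest, earliest + ii):
            slot = modulo_slot(cycle, ii)
            if usage[slot] < units_per_cycle:
                usage[slot] += 1
                placed[name] = cycle
                break
        else:
            return None  # no free slot: this initiation interval is too small
    return placed


def modulo_schedule(instrs, min_ii, units_per_cycle, max_ii=64):
    """Increase the initiation interval until all instructions are placed."""
    for ii in range(min_ii, max_ii + 1):
        placed = try_schedule(instrs, ii, units_per_cycle)
        if placed is not None:
            return ii, placed
    return None


wrapped = modulo_slot(10, 7)
result = modulo_schedule([("a", 0), ("b", 0), ("c", 0)], min_ii=2, units_per_cycle=1)
```

Here modulo_slot(10, 7) evaluates to 3, matching the example above. In the second call, three instructions compete for one issue slot per cycle, so the attempt at an initiation interval of 2 fails and the schedule completes at an initiation interval of 3.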
Swing modulo scheduling (“SMS”) is a known modulo scheduling technique designed to improve upon other known modulo scheduling techniques in terms of the number of cycles, length of the schedule, and registers used. More information on swing modulo scheduling may be found in Llosa et al., “Lifetime-Sensitive Modulo Scheduling in a Production Environment”, IEEE Transactions on Computers, vol. 50, no. 3, March 2001, pp. 234-249. Swing modulo scheduling has some distinct features. For example, swing modulo scheduling allows scheduling of instructions (i.e., nodes in a data dependency graph) in a prioritized order, and it allows placement of the instructions in the schedule to occur in both “forward” and “backward” directions.
In certain situations, swing modulo scheduling and other known software pipelining techniques may fail to find an optimal schedule. In particular, finding the optimal schedule may be difficult when there are multiple groups of instructions (i.e., sub-graphs) that are independent and substantially identical in structure. This may result, for example, from “unrolling” a loop of a computer program where there are no dependencies between the unrolled iterations. Attempted scheduling of such independent and substantially identical groups of instructions using known scheduling techniques may result in a cumulative bunching of instructions at various spots within the schedule. This can lead to less than optimal scheduling of loops in terms of the number of execution cycles (i.e., the initiation interval). Regions of high register pressure (i.e., register pressure hot spots) may also result.
Therefore, it would be advantageous to have an improved method, apparatus and instructions for scheduling execution of instructions.