1. Technical Field
The present invention relates generally to an improved data processing system and in particular to a method and computer program product for optimizing software pipelining. Still more particularly, the present invention provides a method and computer program product for identifying constrained resources in a loop and modifying a swing modulo schedule based on the identified constrained resources.
2. Description of Related Art
Software pipelining is a compiler optimization technique for reordering the hardware instructions within a computer program loop being compiled such that the number of cycles required for each iteration of the loop is minimized. Particularly, software pipelining seeks to optimize the number of required cycles for execution of the loop by overlapping the execution of different iterations of the loop.
Modulo scheduling is a technique for performing software pipelining. A modulo scheduling algorithm selects a likely minimum number of cycles in which the loop may be executed, often called a minimum initiation interval, and places instructions into a schedule of that size. Instructions are “wrapped” around the end of the loop into the next iteration(s) until all instructions are scheduled. If the instructions cannot be scheduled within the initiation interval, the initiation interval is incremented and the algorithm attempts to find a schedule having a number of cycles corresponding to the new initiation interval.
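The retry behavior described above can be illustrated with a minimal sketch. It is not the SMS algorithm and not any particular compiler's implementation: it assumes a simple greedy placement, a hypothetical four-instruction loop, and hypothetical resource names (`mem`, `fpu`). Each edge is assumed to carry a latency and an iteration distance, where a distance of 1 marks a dependence carried into the next iteration.

```python
from collections import Counter
from math import ceil

def res_min_ii(resources, units):
    """Resource-based lower bound on the initiation interval (II)."""
    uses = Counter(resources.values())
    return max(ceil(n / units[r]) for r, n in uses.items())

def try_schedule(order, resources, units, edges, ii):
    """Greedy placement for one candidate II; returns None on failure.

    order: instructions in topological order of the distance-0 edges.
    edges: (src, dst, latency, distance) tuples (assumed representation).
    """
    reserved = Counter()  # (cycle mod ii, resource) -> units in use
    cycle = {}
    for node in order:
        earliest = max(
            (cycle[s] + lat for s, d, lat, dist in edges
             if d == node and dist == 0),
            default=0,
        )
        # Any ii consecutive cycles cover every slot of the modulo table.
        for t in range(earliest, earliest + ii):
            key = (t % ii, resources[node])
            if reserved[key] < units[resources[node]]:
                reserved[key] += 1
                cycle[node] = t
                break
        else:
            return None
    # Every dependence, including loop-carried ones, must be satisfied.
    for s, d, lat, dist in edges:
        if cycle[s] + lat > cycle[d] + dist * ii:
            return None
    return cycle

def modulo_schedule(order, resources, units, edges, max_ii=64):
    """Start at the minimum II and increment it until a schedule is found."""
    ii = res_min_ii(resources, units)
    while ii <= max_ii:
        cycle = try_schedule(order, resources, units, edges, ii)
        if cycle is not None:
            return ii, cycle
        ii += 1
    raise ValueError("no schedule found up to max_ii")

# Hypothetical loop body: load -> mul -> add -> store, with add feeding
# mul of the next iteration (distance 1).
order = ["load", "mul", "add", "store"]
resources = {"load": "mem", "mul": "fpu", "add": "fpu", "store": "mem"}
units = {"mem": 1, "fpu": 1}
edges = [("load", "mul", 2, 0), ("mul", "add", 3, 0),
         ("add", "store", 1, 0), ("add", "mul", 1, 1)]

ii, cycle = modulo_schedule(order, resources, units, edges)
```

In this example the resource-based minimum is 2 cycles, but the loop-carried dependence cannot be satisfied until the interval reaches 4. The slot of an instruction within the schedule is `cycle % ii` and its stage, that is, the number of times it has wrapped into a later iteration, is `cycle // ii`.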
Swing modulo scheduling (SMS) is a specific modulo scheduling algorithm designed to place instructions into the schedule in such a way that the schedule is nearly optimal in number of cycles, length of schedule, and registers used. SMS comprises three general steps: building a data dependency graph (DDG), ordering nodes of the DDG, and scheduling nodes.
The DDG is analyzed to find strongly connected components (SCCs). SCCs are graph components that represent cyclic data dependencies. Various parameters, such as the height, depth, earliest time, latest time, and slack of each node (where a DDG node corresponds to an instruction), are then determined.
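As an illustration of this analysis, the following sketch finds SCCs with Kosaraju's algorithm and derives the node parameters from the acyclic, intra-iteration part of a hypothetical DDG. The definitions used here are simplifying assumptions: earliest time is taken equal to depth, latest time is the critical path length minus height, and slack is their difference. The full SMS formulation also accounts for loop-carried edges and the initiation interval.

```python
from collections import defaultdict

def strongly_connected_components(nodes, edges):
    """Kosaraju's algorithm; edges are (src, dst) pairs of the whole DDG."""
    succ, pred = defaultdict(list), defaultdict(list)
    for s, d in edges:
        succ[s].append(d)
        pred[d].append(s)

    seen, finish = set(), []
    def visit(n):  # first pass: record finishing order
        seen.add(n)
        for m in succ[n]:
            if m not in seen:
                visit(m)
        finish.append(n)
    for n in nodes:
        if n not in seen:
            visit(n)

    assigned, components = set(), []
    def collect(n, comp):  # second pass: walk the reversed graph
        assigned.add(n)
        comp.append(n)
        for m in pred[n]:
            if m not in assigned:
                collect(m, comp)
    for n in reversed(finish):
        if n not in assigned:
            comp = []
            collect(n, comp)
            components.append(sorted(comp))
    return components

def node_parameters(order, edges):
    """order: topological order of the intra-iteration edges.
    edges: (src, dst, latency) tuples, intra-iteration only."""
    depth = {n: 0 for n in order}
    for n in order:
        for s, d, lat in edges:
            if s == n:
                depth[d] = max(depth[d], depth[n] + lat)
    height = {n: 0 for n in order}
    for n in reversed(order):
        for s, d, lat in edges:
            if d == n:
                height[s] = max(height[s], height[n] + lat)
    length = max(depth[n] + height[n] for n in order)  # critical path
    return {
        n: {"depth": depth[n], "height": height[n],
            "earliest": depth[n], "latest": length - height[n],
            "slack": length - height[n] - depth[n]}
        for n in order
    }

# Hypothetical DDG: add feeds mul of the next iteration, forming a cycle.
nodes = ["load", "idx", "mul", "add", "store"]
intra = [("load", "mul", 2), ("mul", "add", 3),
         ("add", "store", 1), ("idx", "store", 1)]
all_edges = [(s, d) for s, d, _ in intra] + [("add", "mul")]

sccs = strongly_connected_components(nodes, all_edges)
cyclic = [c for c in sccs if len(c) > 1]
params = node_parameters(nodes, intra)
```

Here `cyclic` contains the single recurrence formed by `add` and `mul`, and `idx`, which lies off the critical path, is the only node with nonzero slack.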
Node ordering of the DDG is performed based on the priority given to groups of nodes such that the ordering grows out from a nucleus of nodes rather than starting from two groups of nodes and connecting them together. An important feature of this step is that the ordering works in both the forward and backward directions so that nodes added to the order are both predecessors and successors of the nucleus of previously ordered nodes. When considering the first node, or when an independent section of the graph is finished, the next node to be ordered is selected based on its priority (using minimum earliest time for the forward direction and maximum latest time for the backward direction). Then, nodes that are predecessors and successors of the pool of ordered nodes are added to the ordering such that, whenever possible, nodes that are added have only predecessors or only successors already ordered, not both.
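The "swing" between directions can be sketched as follows. This is a simplified illustration for a single group of nodes, not the complete SMS ordering: the function name `swing_order`, the graph, and the tie-breaking by node name are assumptions, and the seed node is always chosen by minimum earliest time.

```python
def swing_order(nodes, edges, earliest, latest):
    """Order nodes outward from a nucleus, alternating direction.

    edges: (src, dst) intra-iteration pairs (assumed representation).
    """
    preds = {n: set() for n in nodes}
    succs = {n: set() for n in nodes}
    for s, d in edges:
        succs[s].add(d)
        preds[d].add(s)

    order, placed = [], set()

    def frontier(neighbours):
        # Unordered nodes adjacent, in one direction, to the nucleus.
        return {m for n in order for m in neighbours[n]} - placed

    while len(order) < len(nodes):
        # First node, or an independent section is finished: pick a seed.
        remaining = [n for n in nodes if n not in placed]
        ready = {min(remaining, key=lambda n: earliest[n])}
        forward = True
        while ready:
            while ready:
                if forward:
                    v = min(sorted(ready), key=lambda n: earliest[n])
                else:
                    v = max(sorted(ready), key=lambda n: latest[n])
                order.append(v)
                placed.add(v)
                ready.discard(v)
                ready |= (succs[v] if forward else preds[v]) - placed
            # Current direction exhausted: swing to the other direction.
            forward = not forward
            ready = frontier(succs if forward else preds)
    return order

# Hypothetical graph and timing values.
nodes = ["load", "idx", "mul", "add", "store"]
edges = [("load", "mul"), ("mul", "add"), ("add", "store"), ("idx", "store")]
earliest = {"load": 0, "idx": 0, "mul": 2, "add": 5, "store": 6}
latest = {"load": 0, "idx": 5, "mul": 2, "add": 5, "store": 6}

order = swing_order(nodes, edges, earliest, latest)
```

In this example the ordering runs forward along the chain from `load` to `store` and then swings backward to pick up `idx`, so `idx` is ordered with only its successor already in the order.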
The SMS algorithm for scheduling the nodes evaluates the nodes in the order generated as previously described and places each node as close as possible (while respecting scheduling latencies) to its predecessors and successors. Because the node ordering can change direction between moving backward and forward, the nodes are scheduled such that they are an appropriate number of cycles before their successors or after their predecessors.
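A minimal sketch of this placement step is given below. It assumes a fixed initiation interval, intra-iteration edges only, and hypothetical resource names; a node with already scheduled predecessors is scanned forward from its earliest legal cycle, and a node with only scheduled successors is scanned backward from its latest legal cycle.

```python
from collections import Counter

def schedule_in_order(order, edges, resources, units, ii):
    """Place nodes near their scheduled neighbours; None on failure.

    edges: (src, dst, latency) tuples, intra-iteration only (assumption).
    """
    reserved, cycle = Counter(), {}
    for n in order:
        early = [cycle[s] + lat for s, d, lat in edges
                 if d == n and s in cycle]
        late = [cycle[d] - lat for s, d, lat in edges
                if s == n and d in cycle]
        if early and late:
            lo, hi = max(early), min(late)
            candidates = range(lo, min(hi, lo + ii - 1) + 1)
        elif late:
            # Only successors are scheduled: scan backward from them.
            candidates = range(min(late), min(late) - ii, -1)
        else:
            # Only predecessors (or nothing) scheduled: scan forward.
            start = max(early, default=0)
            candidates = range(start, start + ii)
        for t in candidates:
            key = (t % ii, resources[n])  # slot in the modulo table
            if reserved[key] < units[resources[n]]:
                reserved[key] += 1
                cycle[n] = t
                break
        else:
            return None
    return cycle

# Hypothetical inputs; the order has "idx" after its successor "store".
order = ["load", "mul", "add", "store", "idx"]
edges = [("load", "mul", 2), ("mul", "add", 3),
         ("add", "store", 1), ("idx", "store", 1)]
resources = {"load": "mem", "mul": "fpu", "add": "fpu",
             "store": "mem", "idx": "alu"}
units = {"mem": 1, "fpu": 1, "alu": 1}

cycle = schedule_in_order(order, edges, resources, units, ii=4)
```

Because `idx` is placed by scanning backward from `store`, it lands one cycle before its consumer rather than at the top of the schedule, which keeps the lifetime of its result short.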
One of the most difficult types of loops to schedule is a loop in which one particular machine resource is heavily used by a large number of instructions. Examples of possible types of constrained resources are a particular hardware execution unit or a class of registers. Scheduling such loops is particularly problematic with a conventional SMS algorithm implementation. Most loops in computer programs can be considered resource constrained, since most loops consume one type of machine resource more heavily than other types. As referred to herein, these types of loops are called “resource constrained” because the heavy usage of a particular resource makes it difficult to freely place instructions that use that resource into a schedule. Instructions represented by nodes in the DDG can be said to be resource constrained if they make use of the resource that is heavily used for a particular loop.
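One possible way to flag a constrained resource and the instructions that use it is sketched below. The utilization measure, the `threshold` value, and the resource names are assumptions made for illustration only; the text above does not prescribe a specific criterion.

```python
from collections import Counter

def constrained_resources(resources, units, ii, threshold=0.75):
    """Return per-resource utilization, the heavily used resources, and
    the nodes that use them.

    resources: node -> resource name; units: resource -> unit count.
    threshold is a hypothetical cutoff, not taken from the text.
    """
    uses = Counter(resources.values())
    # Fraction of the available issue slots consumed in one II window.
    utilization = {r: n / (units[r] * ii) for r, n in uses.items()}
    hot = {r for r, u in utilization.items() if u >= threshold}
    nodes = sorted(n for n, r in resources.items() if r in hot)
    return utilization, hot, nodes

# Hypothetical loop: six floating-point operations, two memory
# operations, and one integer operation, scheduled with II = 3.
resources = {f"fp{i}": "fpu" for i in range(6)}
resources.update({"load": "mem", "store": "mem", "idx": "alu"})
units = {"fpu": 2, "mem": 2, "alu": 1}

utilization, hot, constrained_nodes = constrained_resources(
    resources, units, ii=3)
```

In this example the floating-point unit is fully occupied in every cycle of the interval, so the six floating-point instructions are the resource constrained nodes, while the memory and integer instructions can still be placed freely.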
It is often difficult for a conventional SMS algorithm to schedule resource constrained loops in an optimal number of cycles because of contention for the constrained resource. The high contention for a particular resource often results in nodes not being placed in an optimal location in the schedule, which can lead to schedules that are less than optimal in terms of number of cycles and register usage.
Thus, it would be advantageous to provide a mechanism for optimizing an SMS algorithm. It would further be advantageous to provide a mechanism for optimizing instruction scheduling based on instruction contention for constrained resources.