1. Technical Field
The present invention relates to data processing and, in particular, to compiling and optimizing software code. Still more particularly, the present invention provides a method, apparatus, and program product for pinning internal slack nodes to improve instruction scheduling.
2. Description of Related Art
Instruction scheduling is a compiler optimization technique for reordering hardware instructions within a computer program to improve the speed at which the program executes on a given computer hardware platform. Software pipelining is a compiler optimization technique for reordering hardware instructions within a given loop of a computer program being compiled to minimize the number of cycles required for each iteration of the loop. Specifically, software pipelining seeks to optimize code by overlapping execution of different iterations of the loop.
Modulo scheduling is a technique for performing software pipelining. For more information about software pipelining and modulo scheduling, see Muchnick, Stephen S., “Advanced Compiler Design and Implementation,” Morgan Kaufmann, 1997, pp. 548-568. More specifically, modulo scheduling is an algorithm that selects a likely minimum number of cycles in which the loop will execute, often called the minimum initiation interval (II), and places instructions into a schedule of that size, wrapping instructions around the end of the loop into the next iteration(s) until all instructions are scheduled. If scheduling fails at that size, modulo scheduling iteratively increases the number of cycles, or II, of the loop and tries again to find a schedule that works.
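By way of illustration only, the iterative search over II described above may be sketched as follows. The function names (try_schedule, modulo_schedule), the single issue-width resource model, and the dependence tuple layout (source, destination, latency, iteration distance) are assumptions made for this example and are not taken from the cited references. The starting II here uses only a resource bound; a production scheduler would typically also account for recurrences when choosing the starting value.

```python
from math import ceil

def try_schedule(ops, deps, ii, width):
    """Try to place every op in a schedule of ii cycles; return None on failure.

    ops  : op names, listed so that same-iteration producers come first
    deps : (src, dst, latency, distance) tuples; distance counts iterations
    width: hypothetical resource model, ops that may issue in one cycle
    """
    used = [0] * ii              # occupancy of each modulo slot
    cycle = {}
    for op in ops:
        lo, hi = 0, None
        for src, dst, lat, dist in deps:
            if dst == op and src in cycle:      # scheduled producer
                lo = max(lo, cycle[src] + lat - dist * ii)
            if src == op and dst in cycle:      # scheduled consumer
                bound = cycle[dst] - lat + dist * ii
                hi = bound if hi is None else min(hi, bound)
        # Only ii consecutive cycles need checking: slots repeat modulo ii.
        candidates = [c for c in range(lo, lo + ii)
                      if (hi is None or c <= hi) and used[c % ii] < width]
        if not candidates:
            return None
        cycle[op] = candidates[0]
        used[candidates[0] % ii] += 1
    return cycle

def modulo_schedule(ops, deps, width, max_ii=64):
    """Start from a resource-based lower bound and grow II until placement works."""
    ii = ceil(len(ops) / width)
    while ii <= max_ii:
        cycle = try_schedule(ops, deps, ii, width)
        if cycle is not None:
            return ii, cycle
        ii += 1                  # scheduling failed: retry with a larger II
    raise ValueError("no schedule found up to max_ii")

# Hypothetical loop: a -> b -> c, with c feeding a in the next iteration.
example_deps = [("a", "b", 2, 0), ("b", "c", 2, 0), ("c", "a", 1, 1)]
example_ii, example_cycles = modulo_schedule(["a", "b", "c"], example_deps, width=1)
```

In this example the loop-carried dependence from c back to a cannot be honored at II values of 3 or 4, so the search settles on the first II at which every instruction can be placed.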
Swing modulo scheduling (SMS) is a specific modulo scheduling algorithm designed to place instructions into the schedule in such a way that the schedule is nearly optimal in number of cycles, length of schedule, and registers used. For more information on swing modulo scheduling, see Llosa et al., “Lifetime-Sensitive Modulo Scheduling in a Production Environment,” IEEE Transactions on Computers, vol. 50, no. 3, March 2001.
SMS comprises three steps. First, the SMS algorithm builds a data dependency graph (DDG) and performs analysis on the graph to calculate height, depth, earliest time, latest time, and slack of each node in the graph. Nodes in the graph correspond to instructions.
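For illustration, the per-node quantities named above may be computed as in the following sketch. It assumes a simplified, acyclic DDG in which loop-carried dependences are omitted and each edge carries a latency. Under that assumption the earliest time of a node coincides with its depth, the latest time is the critical path length minus its height, and slack is the difference between latest and earliest time. The function name analyze_ddg and the example instructions are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def analyze_ddg(nodes, edges):
    """Compute depth, height, earliest, latest and slack per node.

    edges: (src, dst, latency) tuples of an acyclic graph (assumption:
    loop-carried dependences have been left out).
    """
    preds = {n: [] for n in nodes}
    succs = {n: [] for n in nodes}
    for src, dst, lat in edges:
        preds[dst].append((src, lat))
        succs[src].append((dst, lat))

    # Predecessors come before their successors in this order.
    order = list(TopologicalSorter(
        {n: [p for p, _ in preds[n]] for n in nodes}).static_order())

    depth = {}
    for n in order:                       # forward pass
        depth[n] = max((depth[p] + lat for p, lat in preds[n]), default=0)
    height = {}
    for n in reversed(order):             # backward pass
        height[n] = max((height[s] + lat for s, lat in succs[n]), default=0)

    length = max(depth.values())          # critical path length
    return {
        n: {
            "depth": depth[n],
            "height": height[n],
            "earliest": depth[n],
            "latest": length - height[n],
            "slack": (length - height[n]) - depth[n],
        }
        for n in nodes
    }

# Hypothetical loop body: ld -> mul -> add, with inc also feeding add.
example_nodes = ["ld", "mul", "add", "inc"]
example_edges = [("ld", "mul", 2), ("mul", "add", 3), ("inc", "add", 1)]
example_info = analyze_ddg(example_nodes, example_edges)
```

Nodes on the critical path (ld, mul, add) have zero slack in this example, while inc may be placed anywhere within a window of several cycles.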
In the next step, the SMS algorithm orders the nodes in the graph. The ordering is based on the priority given to groups of nodes, such that the ordering always grows outward from a nucleus of nodes rather than starting from two separate groups of nodes and connecting them together. An important feature of this step is that the ordering works in both the forward and backward directions, so that both predecessors and successors of the nucleus of previously ordered nodes are added to the order. When considering the first node, or when an independent section of the graph is finished, the next node to be ordered is selected from the pool of unordered nodes based on its priority, using minimum earliest time for the forward direction and maximum latest time for the backward direction. Then, nodes that are predecessors and successors to the pool of nodes are added to the ordering such that, whenever possible, each node that is added has only predecessors or only successors already ordered, not both. Pseudo code for the SMS algorithm can be found in Llosa et al., Id.
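The following is a deliberately simplified sketch of an ordering that grows from a nucleus and alternates direction, in the spirit of the description above. It is not the algorithm of Llosa et al., which additionally prioritizes groups of nodes. The function name swing_order, the tie-breaking by list position, and the example earliest and latest times are assumptions made for this example.

```python
def swing_order(nodes, edges, earliest, latest):
    """Order nodes by sweeping forward, then backward, from a nucleus.

    edges           : (src, dst) pairs of the dependence graph
    earliest, latest: per-node times, assumed to be precomputed
    """
    preds = {n: [] for n in nodes}
    succs = {n: [] for n in nodes}
    for src, dst in edges:
        preds[dst].append(src)
        succs[src].append(dst)

    order, placed = [], set()

    def frontier_of(forward):
        # Unordered neighbours of the nucleus in the given direction.
        found = []
        for n in order:
            for m in (succs[n] if forward else preds[n]):
                if m not in placed and m not in found:
                    found.append(m)
        return found

    while len(order) < len(nodes):
        # Start (or restart after an independent section) from the
        # unordered node with the minimum earliest time.
        seed = min((n for n in nodes if n not in placed), key=earliest.get)
        frontier, forward = [seed], True
        while frontier:
            while frontier:
                pick = (min(frontier, key=earliest.get) if forward
                        else max(frontier, key=latest.get))
                frontier.remove(pick)
                order.append(pick)
                placed.add(pick)
                for m in (succs[pick] if forward else preds[pick]):
                    if m not in placed and m not in frontier:
                        frontier.append(m)
            forward = not forward          # swing to the other direction
            frontier = frontier_of(forward)
    return order

# Hypothetical graph and times: ld -> mul -> add, with inc also feeding add.
example_order = swing_order(
    ["ld", "mul", "add", "inc"],
    [("ld", "mul"), ("mul", "add"), ("inc", "add")],
    earliest={"ld": 0, "mul": 2, "add": 5, "inc": 0},
    latest={"ld": 0, "mul": 2, "add": 5, "inc": 4},
)
```

In this example the forward sweep orders ld, mul, and add, and the backward sweep then picks up inc, so inc is ordered when only its successor has been ordered.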
In the next step, the SMS algorithm schedules the nodes. This part of the algorithm is fairly straightforward. The algorithm examines the nodes in the order from the previous step and places each node as close as possible, while respecting scheduling latencies, to its predecessors or successors. Because the order can change direction freely between moving forward and backward, the scheduling step may be performed in the forward direction and the backward direction, placing nodes so that they are an appropriate number of cycles before successors or after predecessors.