There are several methods of making a set of processor instructions run faster on a processor, one of which is software pipelining. In software pipelining, a schedule for executing the instructions is produced. This schedule may place several instructions for execution at the same clock cycle, i.e. in parallel. Such parallel execution is valid provided that the instructions which are executed in parallel do not need to use the same resources, such as buses, at the same time (this is known as a resource conflict). Additionally, such parallel execution is valid provided that the data dependencies of the instructions are not violated, an example violation being two instructions trying to update the value of a register at the same time.
One particular software pipelining method is known as iterative modulo scheduling and is described in detail in the paper “Iterative Modulo Scheduling”, Rau, Hewlett-Packard Laboratories, ACM 0-89791-707-3/94/0011. An overview of iterative modulo scheduling is given below. It will be appreciated, though, that this is not a complete description of iterative modulo scheduling; rather it is a sufficient description to provide context for embodiments of the present invention.
Iterative modulo scheduling is a method suited to scheduling a loop of instructions, the loop having a number of iterations. The aim of iterative modulo scheduling is to generate an instruction schedule for one loop iteration such that the schedule may be applied repeatedly without causing any resource conflicts and without violating any data dependencies. The repeated use of this schedule then implements the loop of instructions.
FIG. 1 of the accompanying drawings is a flowchart showing a high-level overview of iterative modulo scheduling.
The time between successive implementations of the schedule by the processor is known as the initiation interval II. The unit of time is clock cycles. The smaller the initiation interval II, the faster the execution of the loop by the processor. Therefore, at a step S100, a lower bound for the initiation interval II is calculated. This lower bound is known as the minimum initiation interval MII. The calculation of the minimum initiation interval MII will not be described in detail here (as it is not important to the invention), although, in summary, the minimum initiation interval MII takes into account:
(i) for each resource used by instructions of the loop, such as a bus, the usage requirements of that resource by the instructions of the loop; and
(ii) the data dependencies and control-flow dependencies of the various instructions of the loop.
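By way of illustration only, the two considerations above can be sketched as a resource-based bound and a dependency-based bound, with the minimum initiation interval MII taken as the larger of the two. The function names, the per-resource usage counts and the representation of a dependency cycle as a (total delay, iteration distance) pair are assumptions made for this sketch; they are not taken from the method described here, which does not detail the calculation.

```python
from math import ceil


def resource_bound(uses, units):
    """Lower bound on II from resource usage (hypothetical helper).

    uses[r]  -- number of cycles resource r is needed by one loop iteration
    units[r] -- number of copies of resource r available per clock cycle
    """
    return max(ceil(uses[r] / units[r]) for r in uses)


def dependency_bound(cycles):
    """Lower bound on II from dependency cycles (hypothetical helper).

    Each cycle is (total_delay, distance): the summed delays around a
    dependency cycle and the number of loop iterations it spans.
    """
    return max((ceil(delay / distance) for delay, distance in cycles), default=1)


def minimum_initiation_interval(uses, units, cycles):
    # The MII must satisfy both kinds of constraint, so take the larger bound.
    return max(resource_bound(uses, units), dependency_bound(cycles))


# Example: the bus is needed 3 times per iteration and there is one bus, so
# the resource bound is 3; a dependency cycle of total delay 4 spanning one
# iteration gives a dependency bound of 4.
mii = minimum_initiation_interval({"bus": 3, "alu": 4}, {"bus": 1, "alu": 2}, [(4, 1)])
```

In this example the dependency bound dominates, so `mii` is 4.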
At a step S102, the initiation interval II is set to be the minimum initiation interval MII.
At a step S104, a budget value B is initialized. The budget value B is used to control how long the iterative modulo scheduling method should spend trying to schedule the instructions so that one loop iteration can be executed within the current initiation interval II. For example, the budget value B could be proportional to the number of instructions making up a loop iteration.
At a step S106, a function IterativeSchedule is called, with the current initiation interval II and the budget value B as parameters. The function IterativeSchedule attempts to schedule the instructions of a loop iteration so that a loop iteration can be executed within the current initiation interval II. This will be described in more detail with reference to FIG. 2 of the accompanying drawings.
At a step S108, the result of the function IterativeSchedule is tested. If the function IterativeSchedule returns a value of TRUE, indicating that a schedule has been found, then the iterative modulo scheduling is complete. Otherwise, if the function IterativeSchedule returns a value of FALSE, then a schedule has not been found for the current initiation interval II and processing continues at a step S110.
At the step S110, the initiation interval II is incremented (since scheduling with the current initiation interval II was unsuccessful). Processing then returns to the step S106, so that an attempt can be made to schedule the instructions using the new initiation interval II.
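The steps S102 to S110 may be sketched as a simple loop, as follows. This is a minimal sketch only: the name `modulo_schedule`, the `budget_ratio` parameter and the passing of the function IterativeSchedule as a callable are assumptions introduced for illustration.

```python
def modulo_schedule(mii, num_instructions, iterative_schedule, budget_ratio=3):
    """Return the first initiation interval for which a schedule is found.

    mii                -- minimum initiation interval from the step S100
    iterative_schedule -- callable(ii, budget) returning True or False
    budget_ratio       -- assumed constant making B proportional to loop size
    """
    ii = mii                                   # step S102
    budget = budget_ratio * num_instructions   # step S104
    while True:
        if iterative_schedule(ii, budget):     # steps S106 and S108
            return ii
        ii += 1                                # step S110


# Usage with a stand-in scheduler that only succeeds once II reaches 5.
found_ii = modulo_schedule(3, 4, lambda ii, budget: ii >= 5)
```

Here `found_ii` is 5, because the stand-in scheduler rejects the initiation intervals 3 and 4.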
FIG. 2 of the accompanying drawings is a flowchart showing the processing performed by the function IterativeSchedule as it attempts to schedule the instructions for a loop iteration so that a loop iteration may be executed within the current initiation interval II. The function IterativeSchedule has input parameters of (i) the current initiation interval II under test and (ii) the budget value B. The function IterativeSchedule returns a value of TRUE if a schedule has been found for the instructions of a loop iteration and returns a value of FALSE if a schedule has not been found for the instructions of a loop iteration.
At a step S200, the function IterativeSchedule initializes. In particular, for each instruction to be scheduled, an associated flag, NeverScheduled, is set to indicate that that instruction has never been scheduled during this current call of the function IterativeSchedule.
Furthermore, for each instruction, an associated variable, PreviousScheduleTime, is set. This variable PreviousScheduleTime indicates the time within the initiation interval II at which an instruction was previously scheduled. As none of the instructions have yet been scheduled during this call to the function IterativeSchedule, these variables PreviousScheduleTime are initialized with the value of 0.
Additionally, each of the instructions is given a priority such that the function IterativeSchedule will attempt to schedule a first instruction with priority P1 before it attempts to schedule a second instruction with priority P2 if P1>P2. There are a variety of methods for assigning the priorities to the instructions. For example, an instruction that is dependent on the execution of an instruction in a previous iteration of the loop may be given priority over an instruction that is not dependent in this manner.
At a step S202, the function IterativeSchedule determines whether (i) the scheduling has been completed (i.e. whether all of the instructions have been successfully scheduled) or (ii) the budget value B has a value of 0 (indicating that the function IterativeSchedule should “give up” trying to schedule the instructions within the current initiation interval II). If either of these two conditions is satisfied, then processing continues at a step S204, at which the function IterativeSchedule ends by returning either a value of TRUE if a schedule has been found for all of the instructions of a loop iteration or a value of FALSE if a schedule has not been found for all of the instructions of a loop iteration.
Otherwise, processing continues at a step S206 at which the highest priority unscheduled instruction is selected for scheduling.
At a step S208, the function IterativeSchedule determines the earliest time MinTime at which the selected instruction can be scheduled. The time MinTime may actually be greater than the initiation interval II, in which case the selected instruction is scheduled for execution within the next loop iteration. The function IterativeSchedule determines the time MinTime by examining each predecessor instruction (a predecessor instruction being a currently scheduled instruction on which the selected instruction has a dependency), and determining when the selected instruction may be scheduled. This is based on (i) when the predecessor instruction is scheduled for execution and (ii) the delay required between starting execution of the predecessor instruction and starting execution of the selected instruction to avoid any dependency violations.
A latest time MaxTime at which the selected instruction can be scheduled is then calculated. The value of MaxTime is set to be MaxTime=MinTime+II−1.
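The calculation of MinTime and MaxTime at the step S208 may be sketched as follows. The data layout is an assumption made for this sketch: each predecessor is given as a (name, delay) pair, `schedule` maps already scheduled instructions to their times, and a predecessor that is not currently scheduled is ignored. Dependencies carried across loop iterations are not modelled here.

```python
def time_window(preds, schedule, ii):
    """Return (MinTime, MaxTime) for the selected instruction.

    preds    -- list of (predecessor_name, delay) pairs (assumed layout)
    schedule -- dict mapping scheduled instruction names to their times
    ii       -- the current initiation interval II
    """
    min_time = 0
    for pred, delay in preds:
        if pred in schedule:  # only currently scheduled predecessors count
            min_time = max(min_time, schedule[pred] + delay)
    max_time = min_time + ii - 1  # MaxTime = MinTime + II - 1
    return min_time, max_time


# "a" at time 0 with delay 3 allows time 3; "b" at time 2 with delay 2
# allows time 4; "c" is not scheduled and so imposes no constraint.
window = time_window([("a", 3), ("b", 2), ("c", 1)], {"a": 0, "b": 2}, ii=3)
```

In this example `window` is `(4, 6)`, so exactly II = 3 candidate times are examined.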
At a step S210, the function IterativeSchedule determines a time between MinTime and MaxTime at which the selected instruction is to be scheduled. This is done by determining the earliest time between MinTime and MaxTime at which the selected instruction may be scheduled without causing a resource conflict with an already scheduled instruction.
If a time between MinTime and MaxTime at which the selected instruction may be scheduled without a resource conflict does not exist, then the time at which to schedule the selected instruction is set as follows. If the selected instruction has never been scheduled before (as indicated by the associated flag NeverScheduled) or if MinTime is greater than the previous time at which the selected instruction was scheduled (as indicated by the variable PreviousScheduleTime associated with the selected instruction) then the time at which to schedule the selected instruction is set to be MinTime. Otherwise, the time at which to schedule the selected instruction is set to be the next clock cycle after the time indicated by the associated variable PreviousScheduleTime.
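The choice of time at the step S210, including the fallback used when every candidate time has a resource conflict, may be sketched as follows. For simplicity it is assumed here that each instruction uses a single named resource in the cycle in which it is scheduled, and that two instructions conflict when they use the same resource at times that are equal modulo the initiation interval II; the name `choose_time` and the `occupied` set are hypothetical.

```python
def choose_time(min_time, max_time, resource, occupied, ii,
                never_scheduled, previous_time):
    """Pick the time at which to schedule the selected instruction.

    occupied -- set of (resource, time mod II) pairs already in use
    """
    # Earliest conflict-free time between MinTime and MaxTime, if any.
    for t in range(min_time, max_time + 1):
        if (resource, t % ii) not in occupied:
            return t
    # No conflict-free time exists: fall back as described above.
    if never_scheduled or min_time > previous_time:
        return min_time
    return previous_time + 1


busy = {("bus", 0), ("bus", 1)}  # the bus is in use in both slots when II = 2
first_try = choose_time(4, 5, "bus", busy, 2, True, 0)
second_try = choose_time(4, 5, "bus", busy, 2, False, 4)
```

Here `first_try` is 4 (the instruction has never been scheduled, so MinTime is used) and `second_try` is 5 (the clock cycle after the previous schedule time), which prevents the same instruction from being placed repeatedly at the same conflicting time.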
At a step S212, the selected instruction is scheduled for execution at the time determined at the step S210. The flag NeverScheduled associated with the selected instruction is set to FALSE, and the variable PreviousScheduleTime associated with the selected instruction is set to be the time determined at the step S210.
At a step S214, the function IterativeSchedule determines whether, having scheduled the selected instruction, a resource conflict exists between the newly scheduled instruction and one or more of the instructions that have already been scheduled. If such a resource conflict exists, then processing continues at a step S216; otherwise, processing continues at a step S218.
At the step S216, the one or more instructions which have a resource conflict with the newly scheduled instruction are unscheduled, so that no resource conflict remains in the schedule. Other instructions may also be unscheduled at this step.
At the step S218, the budget value B is decremented by 1. The function IterativeSchedule then returns to the step S202.
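Putting the steps S200 to S218 together, the function IterativeSchedule may be sketched as below. This is an illustrative sketch under stated assumptions rather than a definitive implementation: each instruction is assumed to use one named resource in its issue cycle, conflicts are assumed to be tested modulo II, the "other instructions" unscheduled at the step S216 are assumed to be successors whose dependency delay is no longer respected, and the schedule is returned alongside the TRUE or FALSE result for convenience.

```python
def iterative_schedule(instrs, ii, budget):
    """instrs: name -> {"priority": int, "resource": str, "preds": [(name, delay)]}"""
    never_scheduled = {n: True for n in instrs}   # step S200
    previous_time = {n: 0 for n in instrs}
    schedule = {}                                 # name -> scheduled time
    while len(schedule) < len(instrs) and budget > 0:            # step S202
        name = max((n for n in instrs if n not in schedule),
                   key=lambda n: instrs[n]["priority"])          # step S206
        res, preds = instrs[name]["resource"], instrs[name]["preds"]
        # Step S208: earliest and latest candidate times.
        min_time = max([0] + [schedule[p] + d for p, d in preds if p in schedule])
        max_time = min_time + ii - 1
        # Step S210: earliest conflict-free time, else the fallback rule.
        busy = {(instrs[n]["resource"], t % ii) for n, t in schedule.items()}
        free = [t for t in range(min_time, max_time + 1) if (res, t % ii) not in busy]
        if free:
            time = free[0]
        elif never_scheduled[name] or min_time > previous_time[name]:
            time = min_time
        else:
            time = previous_time[name] + 1
        # Step S212: schedule the instruction and record the time.
        schedule[name] = time
        never_scheduled[name] = False
        previous_time[name] = time
        # Steps S214 and S216: unschedule conflicting instructions.
        for other in [n for n, t in schedule.items() if n != name
                      and instrs[n]["resource"] == res and t % ii == time % ii]:
            del schedule[other]
        # Assumed reading of "other instructions": violated successors.
        for other in [n for n in schedule if n != name and any(
                p == name and schedule[n] < time + d for p, d in instrs[n]["preds"])]:
            del schedule[other]
        budget -= 1                               # step S218
    return len(schedule) == len(instrs), schedule  # step S204


loop = {
    "a": {"priority": 3, "resource": "alu", "preds": []},
    "b": {"priority": 2, "resource": "alu", "preds": [("a", 1)]},
    "c": {"priority": 1, "resource": "bus", "preds": [("b", 2)]},
}
ok, times = iterative_schedule(loop, ii=2, budget=9)
```

With II = 2 the call succeeds with `times` equal to `{"a": 0, "b": 1, "c": 3}`. With II = 1 the two instructions that share the single ALU keep displacing one another until the budget is exhausted, and the function reports failure, which is the situation in which the step S110 increments the initiation interval.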
Software pipelining, as described above, works well on architectures such as those found in personal computers (PCs) which have a large number of registers that can be used in a flexible manner. However, for more constrained architectures, the advantages gained by software pipelining can be significantly reduced. For example, following the generation of an instruction schedule, a process of register allocation is performed which attempts to allocate physical registers to variables. It is possible that the software pipelining may produce an instruction schedule such that the subsequent register allocation forces a variable to be stored in a memory rather than storing the variable in a register. This may be due to the architectural constraints of the particular processor being used. As accessing a memory is generally slower than accessing a register, the overall execution speed of the instructions is reduced. It would therefore be desirable to have software pipelining methods better suited to such constrained architectures.