The present invention relates to a method and apparatus for increasing efficiency of executing computer programs and in particular for moving object language instructions to reduce stalled cycles.
When a compiler is used to produce object code from a source code program, a portion of the compiler, known as the instruction scheduler, establishes the order in which at least some of the instructions will be performed. At least some types of schedulers will schedule instructions in an order different from the apparent order the instructions had prior to the work by the scheduler. One of the purposes of changing the order of instructions is to reduce the occurrence of stalled cycles such as unfilled delay slots. Delay slots occur following instructions which require several cycles to complete. In typical computers, multiply and divide operations are examples of operations which may take multiple cycles to complete. In many devices the execution apparatus, such as the arithmetic logic unit (ALU), may be idle during the delay slots, e.g., if the instruction following the instruction which caused the delay slots depends on that previous instruction (i.e., requires addresses, data or other information provided by the previous instruction). Some types of schedulers will attempt to fill one or more of these delay slots. The scheduler fills a delay slot by identifying an instruction (typically a later instruction) which does not depend on the instruction that caused the delay slots. This "independent" instruction can be worked on by the execution unit during the delay slot, thereby filling one of the delay slots. Since idle cycles of the execution unit represent inefficiency, the number of unfilled delay slots should be reduced or minimized in order to increase the efficiency with which the computer program is executed.
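By way of illustration only, the delay-slot filling described above may be sketched as follows. The sketch is not taken from any particular scheduler; it assumes a hypothetical single-issue, in-order machine in which each instruction writes one register, and the names `Instr`, `conflicts`, `fill_delay_slots` and `count_stalls`, as well as the register names and the three-cycle multiply latency, are assumptions introduced for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Instr:
    text: str          # printable form, for illustration only
    dest: str          # register written (hypothetical one-destination model)
    srcs: tuple        # registers read
    latency: int = 1   # cycles until the result is available

def conflicts(a, b):
    """True if a and b may not be reordered (RAW, WAR or WAW on a register)."""
    return a.dest in b.srcs or b.dest in a.srcs or a.dest == b.dest

def fill_delay_slots(block):
    """Greedy sketch: hoist later independent instructions into delay slots."""
    out = list(block)
    for i in range(len(out)):
        producer = out[i]
        pos = i + 1
        for _ in range(producer.latency - 1):      # one pass per delay slot
            if pos >= len(out):
                break
            if producer.dest not in out[pos].srcs:
                pos += 1                           # slot already holds independent work
                continue
            # Look for the first later instruction that may legally move up to pos.
            for j in range(pos + 1, len(out)):
                cand = out[j]
                if not any(conflicts(e, cand) for e in [producer] + out[pos:j]):
                    out.insert(pos, out.pop(j))
                    pos += 1
                    break
            else:
                break                              # nothing independent: slot stays unfilled
    return out

def count_stalls(block):
    """Count idle issue cycles under the assumed in-order, single-issue model."""
    ready, cycle, stalls = {}, 0, 0
    for ins in block:
        issue = max([cycle] + [ready.get(r, 0) for r in ins.srcs])
        stalls += issue - cycle
        ready[ins.dest] = issue + ins.latency
        cycle = issue + 1
    return stalls

block = [
    Instr("mul r1, r2, r3", "r1", ("r2", "r3"), latency=3),
    Instr("add r4, r1, r5", "r4", ("r1", "r5")),   # depends on the multiply
    Instr("sub r6, r7, r8", "r6", ("r7", "r8")),   # independent
    Instr("or  r9, r7, r2", "r9", ("r7", "r2")),   # independent
]
scheduled = fill_delay_slots(block)
```

In this example the two independent instructions are moved ahead of the dependent add, so that the execution unit has work during both delay slots of the multiply. A practical scheduler would also have to account for memory dependences and other hazards that this sketch ignores.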
The process by which delay slots are filled must be carefully designed to avoid moving instructions in a manner that changes the result of the computer program. Also, the scheduler must not be so complex that the time required for compilation outweighs the benefits from increased efficiency of execution. Many types of schedulers move instructions only within a "basic block" section of code (a section of linear code without loops or branches, i.e., with a single entrance and single exit point). In previous devices and processes, there have often been a number of unfilled delay slots because the scheduler was unable to identify sufficient independent instructions within a given basic block to fill all delay slots. As noted above, this led to some inefficiency in the execution of programs.
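The notion of a basic block may be illustrated by the following minimal sketch, which partitions a linear instruction list into sections each having a single entrance and a single exit. The tuple layout `(label, opcode, target)`, the function name `split_basic_blocks` and the sample program are hypothetical and are introduced only for the example; the sketch assumes that an instruction is a branch exactly when it names a target.

```python
def split_basic_blocks(instrs):
    """Split a list of (label, opcode, target) tuples into basic blocks.

    Assumptions of this sketch: a non-None target marks a branch, and a
    label only starts a new block if some branch actually targets it.
    """
    targets = {t for _, _, t in instrs if t is not None}
    blocks, current = [], []
    for label, op, target in instrs:
        if label in targets and current:
            blocks.append(current)       # a branch target is a new entrance
            current = []
        current.append((label, op, target))
        if target is not None:
            blocks.append(current)       # a branch is the block's single exit
            current = []
    if current:
        blocks.append(current)
    return blocks

program = [
    (None, "load",  None),
    (None, "mul",   None),
    (None, "beq",   "L1"),   # branch ends the first block
    (None, "add",   None),
    ("L1", "sub",   None),   # branch target begins a new block
    (None, "store", None),
]
blocks = split_basic_blocks(program)
opcodes = [[op for _, op, _ in b] for b in blocks]
```

Here the six instructions fall into three blocks (`load, mul, beq`; `add`; `sub, store`). A scheduler restricted to basic blocks could look for independent instructions only within each of these short sections, which illustrates why delay slots often remained unfilled.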