1. Field of the Invention
Embodiments of this invention relate generally to processors, and, more particularly, to a method, system and apparatus for controlling the order of execution of operations to maximize processor performance.
2. Description of Related Art
A typical computer program is a list of instructions that, when compiled or assembled, generates a sequence of machine instructions or operations that a processor executes. The operations have a program order defined by the logic of the computer program and are generally intended for sequential execution in the program order. Scalar processors execute the operations in the program order, which limits a scalar processor to completing one operation before completing the next operation. Superscalar processors contain a variety of execution units that operate in parallel to execute and complete multiple operations in parallel. Superscalar processors can therefore be faster than scalar processors operating at the same clock speed because superscalar processors can complete multiple operations per clock cycle while scalar processors ideally complete one operation per clock cycle.
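By way of illustration only, the throughput difference described above can be approximated with a short calculation. The sketch below assumes that all operations are independent and complete in a single clock cycle; the function name ideal_cycles and the parameter issue_width are hypothetical and are not drawn from any particular processor.

```python
import math

def ideal_cycles(num_ops, issue_width):
    """Best-case clock cycles to complete num_ops operations.

    Assumes (for illustration) that every operation is independent of the
    others and takes one cycle. issue_width is the number of operations
    that can complete per cycle: 1 models a scalar processor, and a value
    greater than 1 models a superscalar processor.
    """
    if issue_width < 1:
        raise ValueError("issue_width must be at least 1")
    return math.ceil(num_ops / issue_width)

scalar_cycles = ideal_cycles(8, 1)       # one operation per clock cycle
superscalar_cycles = ideal_cycles(8, 4)  # up to four operations per clock cycle
```

Under these assumptions the scalar case needs as many cycles as there are operations, while the superscalar case needs proportionally fewer. Real programs contain dependencies between operations, so actual gains are typically smaller than this bound.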
A superscalar processor typically schedules execution of operations so that operations can be executed in parallel and complete out of the normal program order. Difficulties in out-of-order execution arise because one operation may depend on another, in that the logic of a computer program requires that the first operation in the program be executed before the second operation. For example, a superscalar processor that is capable of issuing and executing machine instructions out of order may permit loads to be executed ahead of stores and stores to be executed ahead of loads. This feature provides a large performance advantage, provided that the load and the store do not access the same physical address. In typical programs, the frequency with which a load precedes a store (or a store precedes a load) and their physical addresses match is low. However, because this type of violation is typically discovered late in the instruction execution pipeline, the recovery penalty can be quite severe. The recovery process typically involves flushing the execution pipeline by invalidating the load (or store) instruction that caused the violation and all newer instructions in program order beyond the load (or store) instruction, and subsequently reissuing the offending instruction.
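The ordering violation and recovery described above may be sketched in software as follows. This is a minimal illustrative model, not a description of any particular hardware: the MemOp record and the functions find_violation and recover are hypothetical names, addresses are assumed to be already-translated physical addresses, and a violation is assumed to be detected at the moment the older operation executes.

```python
from dataclasses import dataclass

@dataclass
class MemOp:
    seq: int    # position in program order (smaller is older)
    kind: str   # "load" or "store"
    addr: int   # physical address (assumed already translated)

def find_violation(executed_order):
    """Return the seq of the offending operation, or None if there is none.

    An operation offends if it executed before an older operation of the
    other kind (load versus store) that accesses the same physical address.
    """
    done = []
    for op in executed_order:
        offenders = [e.seq for e in done
                     if e.seq > op.seq          # e is newer in program order
                     and e.addr == op.addr      # same physical address
                     and e.kind != op.kind]     # a load/store pair
        if offenders:
            return min(offenders)               # oldest offending operation
        done.append(op)
    return None

def recover(program, offender_seq):
    """Flush the offender and all newer operations; return (kept, reissue)."""
    kept = [op for op in program if op.seq < offender_seq]
    reissue = [op for op in program if op.seq >= offender_seq]
    return kept, reissue

# Program order: a store to 0x100, then a load from 0x100, then an unrelated load.
program = [MemOp(0, "store", 0x100), MemOp(1, "load", 0x100), MemOp(2, "load", 0x200)]
# Out-of-order execution: both loads run ahead of the older store.
executed = [program[1], program[2], program[0]]

offender = find_violation(executed)
kept, reissue = recover(program, offender)
```

In this example the load at position 1 executed ahead of the older store to the same address, so it and the newer load at position 2 are flushed and reissued, while the store is retained. The flushed work, including the unrelated load, is the source of the recovery penalty.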