1. Technical Field
The present invention is directed to data processing systems. More specifically, the present invention is directed to a method, apparatus, and computer program product for selectively prohibiting speculative conditional branch execution.
2. Description of Related Art
Superscalar processors enable concurrent execution of instructions. They can be implemented within symmetric multiprocessing (SMP) systems, simultaneous multi-threading (SMT) systems, or other types of computer systems. An SMP data processing system has multiple processors that are symmetric, such that each processor has the same processing speed and latency. An SMP system has one operating system that divides the work into tasks that are distributed evenly among the various processors by dispatching one software thread of work to each processor at a time. Thus, a processor in an SMP system executes only one thread at a time.
A simultaneous multi-threading (SMT) data processing system includes multiple processors that can each execute more than one thread concurrently. An SMT system can favor one thread over another when both threads are running on the same processor.
Known computer systems, including SMP systems, SMT systems, and other systems, typically execute conditional branch instructions speculatively to improve processing efficiency. A fetch engine in the processor speculates past a branch instruction so that it can supply a continuous instruction stream to the decode, dispatch, and execution pipelines and thereby maintain a large window of potentially executable instructions.
Instruction fetch performance depends on a number of factors. Branch prediction accuracy has been long recognized as an important factor in determining fetch performance.
Modern microprocessors routinely use a plurality of mechanisms to improve their ability to efficiently fetch past branch instructions. These prediction mechanisms allow a processor to fetch beyond a branch instruction before the outcome of the branch is known. For example, some mechanisms allow a processor to speculatively fetch beyond a branch before the branch's actual target address has been computed. These techniques use run-time history to predict what the actual target address will be, and thus which instructions should be fetched.
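The history-based target prediction described above can be illustrated with a toy branch target buffer. This is a minimal sketch, not the mechanism of any particular processor: the class name, the table size of 16 entries, and the indexing function are all assumptions chosen for illustration.

```python
class BranchTargetBuffer:
    """Toy direct-mapped branch target buffer, for illustration only."""

    def __init__(self, entries=16):
        # Table size is a hypothetical choice, not taken from the text.
        self.entries = entries
        # Each slot holds (branch_address, last_seen_target) or None.
        self.table = [None] * entries

    def _index(self, branch_addr):
        # Hypothetical indexing: word-aligned address modulo table size.
        return (branch_addr >> 2) % self.entries

    def predict(self, branch_addr):
        """Return the predicted target, or None if there is no history."""
        slot = self.table[self._index(branch_addr)]
        if slot is not None and slot[0] == branch_addr:
            return slot[1]
        return None

    def update(self, branch_addr, actual_target):
        """Record the resolved target so that later fetches can reuse it."""
        self.table[self._index(branch_addr)] = (branch_addr, actual_target)


btb = BranchTargetBuffer()
first = btb.predict(0x1000)   # no history yet, so no prediction
btb.update(0x1000, 0x2000)    # the branch resolves to target 0x2000
second = btb.predict(0x1000)  # run-time history now supplies the target
```

On the first encounter the buffer has no history and returns no prediction; once the branch has resolved, a later fetch of the same branch address can be redirected to the recorded target before that target is recomputed.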
A significant pipeline delay penalty may result from the execution of “conditional branch” instructions. A conditional branch instruction dictates the taking of a specified branch in response to a particular outcome of the processing of one or more other instructions. The conditional branch will be either taken or not taken. If the conditional branch is taken, processing passes to a particular target address that is not the next sequential address in the code being processed, and the instructions stored starting at this non-sequential target address are then processed. If the conditional branch is not taken, processing falls through to the next sequential address in the code, and the instructions stored starting at this next sequential address are then processed.
Conditional branch instructions can be speculatively processed by predicting in advance whether the conditional branch instruction will be taken or not taken. If it is predicted that the conditional branch will be taken, the instructions that are stored starting at the particular non-sequential target address are speculatively executed. If it is predicted that the conditional branch will not be taken, the instructions that are stored starting at the next sequential address are speculatively executed. When the conditional branch instruction is actually resolved, it becomes known whether the conditional branch is in fact taken. If the prediction was correct, processing continues with the speculatively executed instructions being completed. If the prediction was incorrect, the speculatively executed instructions must be flushed from the processor and the correct instructions retrieved and executed. The power and processing resources spent executing and then flushing these instructions are wasted.
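The predict, resolve, and flush sequence described above can be sketched with a small simulation. The two-bit saturating counter used here is one common textbook prediction scheme, chosen only for illustration; the text does not say which predictor a given processor uses. The function name `simulate` and the `speculative_depth` parameter (the number of instructions executed past the branch before it resolves) are hypothetical.

```python
def simulate(outcomes, speculative_depth=4):
    """Run one two-bit saturating counter over a list of branch outcomes.

    outcomes: list of bools, True meaning the branch was actually taken.
    speculative_depth: assumed number of instructions executed past the
        branch before it resolves (a hypothetical figure).
    Returns (mispredictions, flushed_instructions).
    """
    counter = 2  # 0-1 predict not taken, 2-3 predict taken
    mispredictions = 0
    flushed = 0
    for taken in outcomes:
        predicted_taken = counter >= 2
        if predicted_taken != taken:
            # Wrong path: everything executed past the branch is flushed.
            mispredictions += 1
            flushed += speculative_depth
        # Train the counter on the resolved outcome.
        if taken:
            counter = min(counter + 1, 3)
        else:
            counter = max(counter - 1, 0)
    return mispredictions, flushed


loop_like = [True] * 9 + [False]   # e.g. a loop branch that exits once
alternating = [True, False] * 5    # a pattern this predictor handles badly

print(simulate(loop_like))    # (1, 4)
print(simulate(alternating))  # (5, 20)
```

Under these assumptions the loop-like branch is mispredicted once in ten executions, while the alternating branch is mispredicted half the time and causes five times as many flushed instructions.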
With power consumption becoming an ever more critical aspect of microprocessor-based system design, mechanisms that prevent unnecessary work are of significant value. Current microprocessors perform a significant amount of unnecessary work when they mispredict branch instructions: the wrong instructions past the branch are fetched, decoded, and executed, only to be flushed when it is discovered that the predicted target address is not the actual target address. In that case, the speculatively executed instructions were on the wrong path.
In modern microprocessors, branch instructions are typically predicted to be taken or not taken early in the decoding phase of instruction processing. If the actual condition on which the branch depends turns out to be the opposite of the predicted value, the instructions that followed the branch must be flushed and execution of the program resumed on the correct path for the branch.
In a power-sensitive environment, speculating on branches when the prediction accuracy is low is a poor trade-off of performance versus power, and it lowers the performance per watt of power used. When the processor is in a simultaneous multi-threading (SMT) mode, inaccurate speculation consumes power as well as cycles that another thread in the processor could have exploited.
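The trade-off can be made concrete with a rough expected-value calculation. This is a simplified model under stated assumptions, not a measurement: it assumes every misprediction discards a fixed number of speculatively executed instructions, and the accuracy and depth figures below are hypothetical.

```python
def expected_wasted_instructions(accuracy, speculative_depth):
    """Expected number of flushed instructions per predicted branch.

    accuracy: assumed fraction of branches predicted correctly (0.0 to 1.0).
    speculative_depth: assumed number of instructions executed past the
        branch before it resolves.
    """
    return (1.0 - accuracy) * speculative_depth


# Hypothetical figures, chosen only to show the shape of the trade-off.
well_predicted = expected_wasted_instructions(accuracy=0.75, speculative_depth=8)
poorly_predicted = expected_wasted_instructions(accuracy=0.5, speculative_depth=8)
```

In this model a branch predicted correctly half the time wastes twice as much speculative work per execution as one predicted correctly three quarters of the time, and each wasted instruction costs both power and, in SMT mode, cycles another thread could have used.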
Branch prediction accuracy can vary widely depending upon many factors. If the consumption of power is an important factor, turning off branch prediction can reduce power substantially but may also substantially reduce performance.
Therefore, a need exists for a method, system, and computer program product for preventing speculative execution for particular types of branch instructions that tend to be unreliably predicted in order to save power and improve processor performance.