1. Field of the Invention
The present invention generally relates to executing instructions in a processor. Specifically, this application is related to increasing the efficiency of a processor executing branch instructions.
2. Description of the Related Art
Modern computer systems typically contain several integrated circuits (ICs), including a processor which may be used to process information in the computer system. The data processed by a processor may include computer instructions which are executed by the processor as well as data which is manipulated by the processor using the computer instructions. The computer instructions and data are typically stored in a main memory in the computer system.
Processors typically process instructions by executing the instruction in a series of small steps. In some cases, to increase the number of instructions being processed by the processor (and therefore increase the speed of the processor), the processor may be pipelined. Pipelining refers to providing separate stages in a processor where each stage performs one or more of the small steps necessary to execute an instruction. In some cases, the pipeline (in addition to other circuitry) may be placed in a portion of the processor referred to as the processor core. Some processors may have multiple processor cores, and in some cases, each processor core may have multiple pipelines. Where a processor core has multiple pipelines, groups of instructions (referred to as issue groups) may be issued to the multiple pipelines in parallel and executed by each of the pipelines in parallel.
As an example of executing instructions in a pipeline, when a first instruction is received, a first pipeline stage may process a small part of the instruction. When the first pipeline stage has finished processing the small part of the instruction, a second pipeline stage may begin processing another small part of the first instruction while the first pipeline stage receives and begins processing a small part of a second instruction. Thus, the processor may process two or more instructions at the same time (in parallel).
Processors typically provide conditional branch instructions which allow a computer program to branch from one instruction to a target instruction (thereby skipping intermediate instructions, if any) if a condition is satisfied. If the condition is not satisfied, the next instruction after the branch instruction may be executed without branching to the target instruction. Typically, the outcome of the condition being tested is not known until the conditional branch instruction is executed and the condition is tested. Thus, the next instruction to be executed after the conditional branch instruction may not be known until the branch condition is tested.
Where a pipeline is utilized to execute instructions, the outcome of the conditional branch instruction may not be known until the conditional branch instruction has passed through several stages of the pipeline. Thus, the next instruction to be executed after the conditional branch instruction may not be known until the conditional branch instruction has passed through the stages necessary to determine the outcome of the branch condition. In some cases, execution of instructions in the pipeline may be stalled (e.g., the stages of the pipeline preceding the branch instruction may not be used to execute instructions) until the branch condition is tested and the next instruction to be executed is known. However, where the pipeline is stalled, the pipeline is not being used to execute as many instructions in parallel (because some stages before the conditional branch are not executing instructions), causing the benefit of the pipeline to be reduced and decreasing overall processor efficiency.
In some cases, to improve processor efficiency, branch prediction may be used to predict the outcome of conditional branch instructions. For example, when a conditional branch instruction is encountered, the processor may predict which instruction will be executed after the outcome of the branch condition is known. Then, instead of stalling the pipeline when the conditional branch instruction is issued, the processor may continue issuing instructions beginning with the predicted next instruction.
However, in some cases, the branch prediction may be incorrect (e.g., the processor may predict one outcome of the conditional branch instruction, but when the conditional branch instruction is executed, the opposite outcome may result). Where the outcome of the conditional branch instruction is mispredicted, the predicted instructions issued subsequently to the pipeline after the conditional branch instruction may be removed from the pipeline and the effects of the instructions may be undone (referred to as flushing the pipeline). Then, after the pipeline is flushed, the correct next instruction for the conditional branch instruction may be issued to the pipeline and execution of the instructions may continue. Where the outcome of a conditional branch instruction is incorrectly predicted and the incorrectly predicted group of instructions is flushed from the pipeline, thereby undoing previous work done by the pipeline, the efficiency of the processor may suffer.
Accordingly, what is needed is an improved method and apparatus for executing conditional branch instructions and performing branch prediction.