I. Field of the Disclosure
The technology of the disclosure relates generally to use of branch direction history to predict resolutions of conditional branches for conditional branch computer instructions in central processing unit (CPU)-based systems.
II. Background
At the heart of the computer platform evolution is the processor. As the physical design of the processor has evolved, methods of processing information and performing functions have also changed. For example, “pipelining” of instructions has been implemented in processor designs. A processor pipeline is composed of many stages, where each stage performs a function associated with executing an instruction. Each stage is referred to as a pipe stage or pipe segment. The stages are connected together to form the pipeline. Instructions enter at one end of the pipeline and exit at the other end. One advantage of pipelining is that the execution of the instructions is overlapped, because different instructions occupy different pipe stages at the same time and are thus processed in parallel. Pipelining is one technique for exploiting instruction level parallelism (ILP).
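As a non-limiting illustration of the overlap described above, the following sketch compares cycle counts with and without pipelining. It assumes an idealized pipeline in which every pipe stage takes one clock cycle and no stalls occur; the function names are hypothetical and are not part of the disclosure.

```python
def unpipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed when each instruction passes through all stages
    before the next instruction starts (no overlap)."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed in an idealized pipeline (assumption: one cycle per
    stage, no stalls). The first instruction needs num_stages cycles to
    exit; each later instruction exits one cycle after its predecessor."""
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1
```

Under these assumptions, 100 instructions in a four-stage pipeline take 400 cycles without overlap and 103 cycles with it.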
In this regard, FIG. 1 illustrates an exemplary instruction processing system 10 of a central processing unit (CPU) 12. Instructions are processed in a continuous flow represented by an instruction stream 14 in FIG. 1. The instruction processing system 10 employs an instruction pipeline 15. The instruction pipeline 15 is comprised of a plurality of pipe stages, including instruction fetch, instruction decoding, instruction execution, and instruction commit stages. In the illustrated example, the instruction stream 14 originates from instruction memory 16, which provides storage for instructions of a computer-executable program. An instruction fetch circuit 18 reads an instruction 20 (e.g., instructions 20(0)-20(W)) from the instruction memory 16 and/or from an instruction cache 22, and may increment a program counter, typically stored in one of registers 24(0)-24(X). The registers 24(0)-24(X) are architectural registers of the instruction processing system 10, which may include general purpose registers (GPRs) and/or other architected registers (as non-limiting examples, a frame pointer, a stack pointer, a link register, and/or a program counter).
After an instruction 20 is fetched by the instruction fetch circuit 18, the instruction 20 is decoded by an instruction decode circuit 26. The instruction decode circuit 26 translates the instruction 20 into processor-specific microinstructions, and retrieves operands required by the instruction 20 (if any) from the appropriate one of the registers 24(0)-24(X), or from a data memory (not shown) and/or a data cache (not shown). The instruction decode circuit 26 may hold a set of multiple instructions 28(0)-28(Y) for decoding. The instructions 20 are issued into an instruction queue 30 of instruction execution pipeline(s) 32. Actual execution of the instructions 20 takes place in an instruction execution pipeline 32 (e.g., instruction execution pipelines 32(0)-32(Z)). An instruction commit circuit 34 is provided that determines which of the executed instructions 20 are needed and commits those results, for example by updating the registers 24 (as a non-limiting example, registers 24(0)-24(X)), the data memory, and/or the data cache with the results of the executed instructions 20.
The instructions 20 may include conditional branch instructions. Conditional branch instructions may be taken or not taken. It is not known whether a conditional branch instruction will be taken until the conditional branch instruction is executed and the branch condition is determined. However, instructions beyond a conditional branch instruction may be fetched into the instruction pipeline 15 prior to executing the conditional branch instruction. For example, if a branch is taken, instructions 20 that were fetched into the instruction pipeline 15 on the assumption that the branch would not be taken may have to be flushed from the instruction pipeline 15. As a result, instruction processing may be delayed by the number of clock cycles needed to refill the instruction pipeline 15, which may be as many as the number of pipe stages in the instruction pipeline 15.
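As a non-limiting illustration of the delay caused by flushing, the following sketch estimates total cycles when some conditional branches cause a flush. It assumes an idealized pipeline with one cycle per pipe stage, and it assumes by default that a branch resolves in the last pipe stage, so that each flush costs one cycle fewer than the number of stages. The function name and the `flush_penalty` parameter are hypothetical.

```python
from typing import Optional


def cycles_with_flushes(num_instructions: int,
                        num_stages: int,
                        flushed_branches: int,
                        flush_penalty: Optional[int] = None) -> int:
    """Estimate total cycles for an idealized pipeline in which
    `flushed_branches` branches each force a refill of the pipeline."""
    if num_instructions == 0:
        return 0
    if flush_penalty is None:
        # Assumption: the branch resolves in the final stage, so the
        # num_stages - 1 younger instructions behind it are discarded.
        flush_penalty = num_stages - 1
    base = num_stages + num_instructions - 1  # no-stall cycle count
    return base + flushed_branches * flush_penalty
```

Under these assumptions, 100 instructions in a four-stage pipeline with 10 flushed branches take 133 cycles rather than 103, which is why reducing the number of flushes matters.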
To reduce instruction flushing of the instruction pipeline 15, a branch prediction system 36 may be employed in the instruction processing system 10. The branch prediction system 36 predicts the direction of conditional branch instructions. In this regard, the branch prediction system 36 provides a branch prediction 40 predicting the direction of a conditional branch instruction based on a history of committed branch instructions 38. The branch prediction 40 is provided to an instruction processing circuit 42 (as non-limiting examples, the instruction fetch circuit 18 and/or the instruction decode circuit 26) of the instruction processing system 10. Based on the branch prediction 40, the processor may either fetch instructions 20 at the branch target address of the conditional branch instruction into the instruction pipeline 15, or fetch next sequential instructions 20 into the instruction pipeline 15. However, branch predictions provided using conventional methods may not be as accurate as desired.
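As a non-limiting illustration of predicting a branch direction from a history of branch outcomes, the following sketch models one well-known conventional scheme: a table of two-bit saturating counters indexed by the branch address combined (by exclusive OR) with a global history of recent outcomes. This is a textbook arrangement chosen for illustration only and is not asserted to be the arrangement of the branch prediction system 36. The class name, method names, and the `history_bits` parameter are hypothetical.

```python
class HistoryBranchPredictor:
    """Illustrative global-history predictor with 2-bit saturating counters."""

    def __init__(self, history_bits: int = 4) -> None:
        self.mask = (1 << history_bits) - 1
        self.history = 0  # most recent outcomes, newest in the low bit
        # Counter values 0-1 predict not taken, 2-3 predict taken.
        # Every counter starts at 1 (weakly not taken).
        self.counters = [1] * (1 << history_bits)

    def _index(self, pc: int) -> int:
        # Combine the branch address with the outcome history.
        return (pc ^ self.history) & self.mask

    def predict(self, pc: int) -> bool:
        """Return True if the branch at address pc is predicted taken."""
        return self.counters[self._index(pc)] >= 2

    def update(self, pc: int, taken: bool) -> None:
        """Record the resolved direction of the branch at address pc."""
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
        # Shift the resolved outcome into the history.
        self.history = ((self.history << 1) | int(taken)) & self.mask


def run(predictor: HistoryBranchPredictor, pc: int, outcomes: list) -> list:
    """Predict, then update, for each outcome; return per-branch correctness."""
    results = []
    for taken in outcomes:
        results.append(predictor.predict(pc) == taken)
        predictor.update(pc, taken)
    return results
```

For example, a branch that alternates between taken and not taken is mispredicted a few times while the counters warm up, after which the history distinguishes the two cases and the predictions in this sketch become correct. Less regular patterns are not captured as easily.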