Processors or microprocessors used in processor chips, which are also known as microchips or integrated circuits (ICs), are an integral part of computer systems, performing tasks such as executing and storing instructions. Constantly increasing performance requirements, combined with competitive pressures to minimize costs, result in significant challenges in processor design. Processor designers attempt, for instance, to extract more and more processing speed from their processor designs. Design changes made to increase speed, however, often increase the silicon area used by the processor design and therefore its cost. Processor design thus requires trade-offs between performance and silicon area (and cost).
To improve performance, many processors decode one or more instructions while the preceding instruction is still executing, a process known as pipelining. Many processors also utilize branch processing in their design. Branch processing occurs when a branch instruction implements a jump to instructions at another location, resulting in a break in the flow of instructions performed by the processor. The break in the flow of instructions caused by conditional branch processing makes pipelining difficult to implement, as the processor does not know which instruction to feed into the pipeline next. To solve this problem, processor designers use branch prediction, where the processor attempts to predict whether the branch will be taken or not. The processor then uses the branch prediction to decide which instruction to load next into the pipeline in a process known as speculative execution. If the branch prediction is incorrect, the pipeline is flushed and the calculations are discarded, but if the branch prediction is correct, the processor has saved a significant amount of time. Static branch prediction utilizes known characteristics of the instruction to predict which branches are most likely to be taken. Processor designers also may use dynamic branch prediction to further improve performance. Dynamic branch prediction allows the hardware to change its branch predictions as the program executes. The improved performance of dynamic branch prediction becomes particularly useful for processors that attempt to issue more than one instruction per clock cycle.
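The idea of hardware that changes its predictions as the program executes can be illustrated with a small software model. The following is a minimal sketch only, assuming a table of 2-bit saturating counters indexed by instruction address, which is one common textbook scheme and is not described in the text above. The class name, table size, 4-byte instruction alignment, and helper function are all hypothetical.

```python
class TwoBitPredictor:
    """Illustrative dynamic predictor: one 2-bit saturating counter per entry."""

    def __init__(self, entries=16):
        self.entries = entries
        # Counter values 0-1 predict "not taken"; 2-3 predict "taken".
        # Every counter starts at 1 ("weakly not taken"), an arbitrary choice.
        self.counters = [1] * entries

    def _index(self, address):
        # Assumes 4-byte instructions, so the low two address bits are dropped.
        return (address >> 2) % self.entries

    def predict(self, address):
        """Return True if the branch at this address is predicted taken."""
        return self.counters[self._index(address)] >= 2

    def update(self, address, taken):
        """Move the counter toward the actual outcome, saturating at 0 and 3."""
        i = self._index(address)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def count_mispredictions(predictor, address, outcomes):
    """Run a sequence of actual outcomes through the predictor and count misses."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(address) != taken:
            misses += 1  # in hardware, the pipeline would be flushed here
        predictor.update(address, taken)
    return misses


# A loop-closing branch: taken nine times, then not taken once on loop exit.
loop_branch = [True] * 9 + [False]
predictor = TwoBitPredictor()
first_pass = count_mispredictions(predictor, 0x1000, loop_branch)
second_pass = count_mispredictions(predictor, 0x1000, loop_branch)
```

In this sketch the first pass through the loop mispredicts twice (once while the counter warms up and once at loop exit), while the second pass mispredicts only at loop exit, because the counter has adapted to the branch's behavior.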
In order to properly execute branch instructions, information such as the instruction address and predicted target address must be known and stored. One approach to the problem of storing the instruction address and predicted target address is to store the predicted target address along with the instruction text in the instruction buffer (also known as an I-buffer). As processors increase the size of their addressable memory, storing address information in this fashion becomes increasingly wasteful. In a 64-bit machine, for example, each predicted target address will require a full 64-bit register of storage, resulting in an undesirable use of space and silicon. As processors become increasingly sophisticated, the space requirement for the branch prediction information will become even more of an issue.
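The storage cost described above can be made concrete with simple arithmetic. The sketch below is illustrative only: the buffer depth (32 entries) and instruction width (32 bits) are assumed values chosen for the example, not figures from the text, and the function names are hypothetical.

```python
def entry_bits(instruction_bits, address_bits):
    """Width of one I-buffer entry holding instruction text plus a predicted target."""
    return instruction_bits + address_bits


def target_overhead(instruction_bits, address_bits):
    """Fraction of each entry spent on the predicted target address."""
    return address_bits / entry_bits(instruction_bits, address_bits)


def buffer_target_bits(entries, address_bits):
    """Total bits the buffer spends on predicted target addresses."""
    return entries * address_bits


# Hypothetical configuration: 32 entries of 32-bit instruction text.
ENTRIES = 32
INSTRUCTION_BITS = 32

for address_bits in (32, 64):
    total = buffer_target_bits(ENTRIES, address_bits)
    share = target_overhead(INSTRUCTION_BITS, address_bits)
    print(f"{address_bits}-bit addresses: {total} bits of targets, "
          f"{share:.0%} of each entry")
```

Under these assumed sizes, moving from 32-bit to 64-bit addresses doubles the target storage from 1024 to 2048 bits, and the predicted target grows from half of each entry to two thirds of it, even though most entries do not hold branch instructions.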
Dynamic branch prediction requires additional information to be stored, such as an indication of whether a branch was originally predicted to be taken or not taken, exacerbating the problems with using an instruction buffer. To implement dynamic branch prediction, a branch taken indication may be passed along with every instruction, regardless of whether the instruction is a branch instruction or not, potentially resulting in an even larger waste of silicon. Moreover, there is a significant latency associated with fetching instructions, dynamically predicting whether branches in the fetch group will be taken, and sending this information to the instruction buffer to be dispatched to the branch execution engine. This latency could result in the branch taken indication not being available to the branch execution mechanism in time for it to be useful.
Another approach to storing address information relating to branch operations is to store the information in a separate branch information queue. A separate branch information queue cuts down on the space requirements of the instruction buffer solution, but it still requires a significant amount of area as well as additional complexity. For processors using dynamic branch prediction and the associated additional storage requirements, the problems associated with using a separate branch information queue are worsened.
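A separate branch information queue can be pictured as a small structure that holds entries only for branch instructions. The sketch below is a rough illustration under stated assumptions: the text does not specify how such a queue is organized, so the first-in, first-out ordering, the fields in each entry, the capacity, and all names here are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class BranchInfo:
    """Hypothetical queue entry: information kept for one in-flight branch."""
    instruction_address: int
    predicted_target: int
    predicted_taken: bool


class BranchInfoQueue:
    """Illustrative FIFO holding entries for branch instructions only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = deque()

    def push(self, info):
        """Add an entry at fetch time; return False if the queue is full."""
        if len(self.entries) >= self.capacity:
            return False  # in hardware, fetch would have to stall here
        self.entries.append(info)
        return True

    def pop(self):
        """Remove the oldest entry when its branch reaches execution."""
        return self.entries.popleft()

    def storage_bits(self, address_bits=64):
        """Bits of storage: two addresses plus one taken bit per entry."""
        return self.capacity * (2 * address_bits + 1)


queue = BranchInfoQueue(capacity=8)
queue.push(BranchInfo(0x1000, 0x2000, True))
queue.push(BranchInfo(0x1010, 0x1014, False))
oldest = queue.pop()  # the branch fetched at 0x1000
```

Even this assumed 8-entry queue needs 1032 bits of storage on a 64-bit machine, plus the control logic to allocate and retire entries, which illustrates why the approach still carries a notable area and complexity cost.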
There is, therefore, a need for an effective mechanism for storing branch information in a processor that reduces both the silicon area used for storage and the latency of making that information available. There is an even greater need for such a mechanism as chips become more and more powerful and branch prediction methods such as dynamic branch prediction are used by processor designers.