1. FIELD OF THE INVENTION
This invention relates in general to the field of instruction execution in computers, and more particularly to an apparatus and method for predicting the outcome of branch instructions in a pipeline microprocessor.
2. DESCRIPTION OF THE RELATED ART
An application program for execution on a microprocessor consists of a structured series of macro instructions that are stored in sequential locations in memory. A current instruction pointer within the microprocessor points to the address of the instruction currently being executed and a next instruction pointer within the microprocessor points to the address of the next instruction for execution. During each clock cycle, the length of the current instruction is added to the contents of the current instruction pointer to form a pointer to a next sequential instruction in memory. The pointer to the next sequential instruction is provided to logic that updates the next instruction pointer. If the logic determines that the next sequential instruction is indeed required for execution, then the next instruction pointer is updated with the pointer to the next sequential instruction in memory. Thus, macro instructions are fetched from memory in sequence for execution by the microprocessor.
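By way of illustration only, the pointer update described above can be sketched as follows. The function name, its parameters, and the use of an optional branch target to stand in for the update logic are assumptions made for this sketch and are not taken from any particular microprocessor.

```python
def next_instruction_pointer(current_ip, instruction_length, branch_target=None):
    """Return the address of the next instruction to execute (illustrative only)."""
    # The length of the current instruction is added to the current
    # instruction pointer to form a pointer to the next sequential instruction.
    next_sequential = current_ip + instruction_length

    # Hypothetical stand-in for the update logic: the sequential pointer is
    # used unless a branch redirects execution to a target address.
    if branch_target is not None:
        return branch_target
    return next_sequential
```

For example, a 3-byte instruction at address 0x1000 yields a next sequential pointer of 0x1003, unless a branch supplies a different target.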
Obviously, because a microprocessor is designed to execute instructions from memory in the sequence that they are stored, it follows that a program configured to execute macro instructions sequentially from memory is one that will run efficiently on the microprocessor. For this reason, most application programs are designed to minimize the number of instances where macro instructions are executed out of sequence. These out-of-sequence instances are known as jumps, or branches.
A program branch presents a problem because most conventional microprocessors do not simply execute one instruction at a time. Rather, a present day microprocessor consists of a number of pipeline stages, each stage performing a specific function. Instructions, inputs, and results are passed from one stage to the next in synchronization with a pipeline clock. Hence, several instructions may be executing in different stages of the microprocessor pipeline within the same clock cycle. Consequently, when logic within a given stage determines that a program branch is to occur, then instructions in previous stages of the pipeline must be cast out so that control of the microprocessor can be transferred to the instruction directed by the branch, or the branch target instruction. This casting out of instructions in previous pipeline stages is known as flushing the pipeline.
A conditional branch is a branch that may or may not occur, depending upon an evaluation of some specified condition. This evaluation is typically performed in later stages of the microprocessor pipeline. To preclude wasting many clock cycles associated with flushing and refilling the pipeline, present day microprocessors also provide logic in an early pipeline stage that predicts whether a conditional branch will occur or not, that is, whether it will be taken or not taken. If it is predicted that a conditional branch will be taken, then only those instructions prior to the early pipeline stage must be flushed, including those in the instruction buffer. Even so, this is a drastic improvement; correctly predicted branches are executed in roughly two clock cycles. But an incorrect prediction takes many more cycles to execute than if no branch prediction mechanism had been provided in the first place. The accuracy of branch predictions in a pipeline processor therefore significantly impacts the processor's performance, for better or worse.
Present day branch prediction mechanisms primarily predict the outcome of a given conditional branch instruction in an application program based upon outcomes obtained when the conditional branch instruction was previously executed within the same instance of the application program. This historical branch prediction, or dynamic branch prediction, is effective because conditional branch instructions tend to exhibit repetitive outcome patterns.
In a conventional microprocessor, the historical outcome data is stored in a single branch history table that is accessed using the address of a conditional branch instruction—a unique identifier for the instruction. A corresponding entry in the branch history table contains the historical outcome data associated with the conditional branch instruction. A dynamic prediction of the outcome of the conditional branch instruction is made based upon the contents of the corresponding entry in the branch history table.
Yet, because most present day microprocessors have address ranges on the order of gigabytes, it is not practical for a branch history table to be as large as the microprocessor's address range. Because of this, smaller branch history tables are provided, on the order of kilobytes, and only the low-order bits of a conditional branch instruction's address are used as an index into the table. But this presents another problem: because low-order address bits are used to index the branch history table, two or more conditional branch instructions can index the same entry. This is known as aliasing. As such, the outcome of a more recently executed conditional branch instruction will influence the historical outcome record of a formerly executed conditional branch instruction that is aliased to the same table entry. If the former conditional branch instruction is encountered again, its historical outcome information is biased, for better or for worse, toward the outcome of the more recently executed conditional branch instruction.
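The indexing scheme and the resulting aliasing can be sketched as follows. The table size of 1024 entries, the function names, and the example addresses are all assumptions chosen for illustration; the table size is assumed to be a power of two so that masking selects the low-order address bits.

```python
TABLE_ENTRIES = 1024  # assumed table size; must be a power of two


def bht_index(branch_address, table_entries=TABLE_ENTRIES):
    """Index a branch history table using only the low-order address bits."""
    return branch_address & (table_entries - 1)


def is_aliased(address_a, address_b, table_entries=TABLE_ENTRIES):
    """Return True when two distinct branch addresses select the same entry."""
    return (address_a != address_b
            and bht_index(address_a, table_entries) == bht_index(address_b, table_entries))


# Hypothetical branch addresses that differ only in their high-order bits.
FIRST_BRANCH = 0x00401234
SECOND_BRANCH = 0x00805634
```

Under these assumptions, both example addresses share the same low-order ten bits and therefore select the same table entry, even though they identify different conditional branch instructions.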
The present inventors have observed that the outcomes of conditional branch instructions, when observed on a pipeline microprocessor executing today's predominant desktop computer application programs, exhibit a bias toward one outcome or the other as a function of static indicators such as the type of conditional test performed, regardless of historical outcome data associated with the instructions. Some conditional branch instructions indeed exhibit a very strong bias toward an outcome, virtually independent of execution history in the application program. Thus, execution of an application program is negatively impacted when it has a significant number of conditional branch instructions having conflicting outcome biases that are aliased to the same historical outcome records: predictions for instructions biased toward being taken are negatively influenced by instructions that are biased toward being not taken that are aliased to the same historical outcome record, and vice versa.
Thus, the accuracy of branch predictions is degraded on the whole in a microprocessor that allows the outcomes of conditional branch instructions exhibiting a certain outcome bias to impact the historical outcome data for conditional branch instructions that exhibit a conflicting outcome bias.
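The degradation described above can be illustrated with a small simulation. The sketch below assumes that each table entry is a two-bit saturating counter initialized to a weakly-taken state; this counter scheme, the table sizes, the branch addresses, and the alternating execution trace are all assumptions made for illustration and are not drawn from the foregoing description.

```python
def count_mispredictions(trace, table_entries):
    """Count mispredictions for a trace of (branch_address, taken) pairs.

    Assumes one two-bit saturating counter per entry (0..3), where a value
    of 2 or 3 predicts taken. table_entries must be a power of two.
    """
    counters = [2] * table_entries  # every entry starts weakly taken
    mispredictions = 0
    for address, taken in trace:
        index = address & (table_entries - 1)  # low-order address bits
        predicted_taken = counters[index] >= 2
        if predicted_taken != taken:
            mispredictions += 1
        # Update the shared historical outcome record with the actual outcome.
        if taken:
            counters[index] = min(3, counters[index] + 1)
        else:
            counters[index] = max(0, counters[index] - 1)
    return mispredictions


# Hypothetical branches with conflicting outcome biases.
BRANCH_A = 0x1010  # always taken in this trace
BRANCH_B = 0x2010  # never taken in this trace
trace = [(BRANCH_A, True), (BRANCH_B, False)] * 100

aliased = count_mispredictions(trace, table_entries=16)      # both map to entry 0
separate = count_mispredictions(trace, table_entries=16384)  # distinct entries
```

In this sketch, the small table causes both branches to share one counter, so the always-taken branch repeatedly pulls the record away from the outcome of the never-taken branch. With the larger table each branch has its own record, and the number of mispredictions falls accordingly.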
Therefore, what is needed is an apparatus for predicting the outcomes of branch instructions that is more accurate than has heretofore been provided.
In addition, what is needed is a branch prediction mechanism in a microprocessor that separately maintains historical outcome records for conditional branch instructions as categorized by outcome bias.
Furthermore, what is needed is an apparatus in a microprocessor for predicting branches that precludes outcomes associated with conditional branch instructions having a certain outcome bias from influencing branch predictions for other branch instructions that exhibit a bias toward a different outcome.
Moreover, what is needed is a method for separating historical outcome data for conditional branch instructions that improves the accuracy of their associated branch predictions.