1. Technical Field
The present invention relates in general to an improved data processing system and in particular to a method and system for improved branch history prediction in a data processing system. Still more particularly, the present invention relates to an improved method and system for enhanced branch history prediction in a superscalar processor system which is capable of simultaneously dispatching a plurality of instructions.
2. Description of the Related Art
Designers of modern state-of-the-art data processing systems are continually attempting to enhance the performance of such systems. One technique for enhancing data processing system efficiency is the achievement of short cycle times and a low Cycles-Per-Instruction (CPI) ratio. An excellent example of the application of these techniques to an enhanced data processing system is the International Business Machines Corporation RISC System/6000 (RS/6000) computer. The RS/6000 system is designed to perform well in numerically intensive engineering and scientific applications as well as in multi-user, commercial environments. The RS/6000 processor employs a superscalar implementation, which means that multiple instructions are issued and executed simultaneously.
The simultaneous issuance and execution of multiple instructions requires independent functional units that can execute concurrently with a high instruction bandwidth. The RS/6000 system achieves this by utilizing separate branch, fixed point and floating point processing units which are pipelined in nature. In such systems a significant pipeline delay penalty may result from the execution of a so-called "conditional branch" instruction. Conditional branch instructions are instructions which dictate the taking of a specified conditional branch within an application in response to a selected outcome of the processing of one or more other instructions. Thus, by the time a conditional branch instruction propagates through a pipeline queue to an execution position within the queue, it will have been necessary to load instructions into the queue behind the conditional branch instruction prior to resolving the conditional branch, in order to avoid run-time delays.
One attempt at minimizing this run-time delay in pipelined processor systems involves the provision of an alternate instruction queue. Upon the detection of a conditional branch instruction within the primary instruction queue, the sequential instructions following the conditional branch instruction within the queue are immediately purged and loaded into the alternate instruction queue. Target instructions for a predicted conditional branch are then fetched and loaded into the primary instruction queue. If the predicted conditional branch does not occur, the sequential instructions are fetched from the alternate instruction queue and loaded into the primary instruction queue. While this technique minimizes run-time delay, it requires the provision of an alternate instruction queue and a concomitant increase in the hardware assets required.
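The alternate instruction queue technique described above may be illustrated by the following minimal software sketch. It is a simplified model under stated assumptions and not an actual hardware implementation: the function names `redirect_on_branch` and `recover_from_misprediction`, the representation of the queues as Python deques, and the instruction mnemonics are all hypothetical and are introduced solely for illustration.

```python
from collections import deque


def redirect_on_branch(primary, branch_pos, target_instructions):
    """Model the detection of a conditional branch at index branch_pos.

    The sequential instructions behind the branch are removed from the
    primary queue and saved in an alternate queue; the primary queue is
    then refilled with the instructions at the predicted branch target.
    (Hypothetical helper, for illustration only.)
    """
    items = list(primary)
    alternate = deque(items[branch_pos + 1:])   # saved sequential path
    new_primary = deque(items[:branch_pos + 1])  # up to and including the branch
    new_primary.extend(target_instructions)      # predicted target path
    return new_primary, alternate


def recover_from_misprediction(primary, branch_pos, alternate):
    """Model a predicted branch that does not occur.

    The target instructions are discarded and the sequential instructions
    are reloaded from the alternate queue into the primary queue.
    """
    kept = list(primary)[:branch_pos + 1]
    return deque(kept + list(alternate))


# Usage with made-up mnemonics: "bc" stands for the conditional branch.
primary = deque(["add", "cmp", "bc", "sub", "mul"])
primary, alternate = redirect_on_branch(primary, 2, ["ld", "st"])
restored = recover_from_misprediction(primary, 2, alternate)
```

The sketch makes the hardware cost of the technique visible: the saved sequential path must be held in a second queue for as long as the branch remains unresolved.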
Another attempt at minimizing run-time delay in pipelined processor systems involves the utilization of a compiler to insert large numbers of instructions into the queue between the instruction which generates the outcome which initiates the conditional branch and the conditional branch instruction itself. The delay between the execution of these two instructions is utilized to resolve the conditional branch and to place the appropriate target instructions or sequential instructions into the instruction queue prior to execution of the conditional branch instruction. In theory, this technique will minimize run-time delay without requiring the provision of an alternate instruction queue; however, it is often difficult to insert sufficient numbers of instructions into the queue to accomplish the necessary delay.
As a consequence, the efficiency of a data processing system may be enhanced by accurately predicting whether or not a conditional branch will be taken when a branch instruction is encountered. One technique for predicting whether a particular branch will be "taken" or "not taken" is the utilization of a so-called "Branch History Table" (BHT). A Branch History Table is utilized to store the recent history of a particular branch instruction in order to predict whether or not the execution of that instruction will result in a branch within an executing set of instructions. While branch history tables provide a relatively straightforward technique for minimizing run-time delay, problems exist in superscalar processor systems wherein multiple instructions are fetched within a single access. Depending upon the pipeline in such a system, the branch address may not be known until the target address calculation. If the branch address is not known until the target address calculation, and the branch prediction is to be done in parallel with the address calculation, then branch history table access becomes quite critical.
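The general operation of a Branch History Table of the kind described above may be illustrated by the following minimal software sketch. Several details are assumptions made for illustration and are not taken from the foregoing description: the use of two-bit saturating counters as the stored history, the direct-mapped indexing by low-order branch address bits, the four-byte instruction size, and the names `BranchHistoryTable` and `run_branch` are all hypothetical.

```python
class BranchHistoryTable:
    """Direct-mapped table of 2-bit saturating counters (illustrative only)."""

    def __init__(self, entries=1024):
        # A power-of-two size lets the index be formed with a simple mask.
        if entries <= 0 or entries & (entries - 1):
            raise ValueError("entries must be a power of two")
        self.mask = entries - 1
        # Counter values 0 and 1 predict "not taken"; 2 and 3 predict "taken".
        self.counters = [1] * entries  # assumed initial state: weakly not taken

    def _index(self, branch_address):
        # Assumption: 4-byte instructions, so the two low-order bits are dropped.
        return (branch_address >> 2) & self.mask

    def predict(self, branch_address):
        """Return True if the branch at this address is predicted taken."""
        return self.counters[self._index(branch_address)] >= 2

    def update(self, branch_address, taken):
        """Record the resolved outcome of the branch in its table entry."""
        i = self._index(branch_address)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def run_branch(bht, branch_address, outcomes):
    """Replay a sequence of outcomes; return the number of correct predictions."""
    correct = 0
    for taken in outcomes:
        if bht.predict(branch_address) == taken:
            correct += 1
        bht.update(branch_address, taken)
    return correct


# Usage: a loop-closing branch taken nine times and then not taken once.
bht = BranchHistoryTable(entries=16)
loop_outcomes = [True] * 9 + [False]
first_pass = run_branch(bht, 0x1000, loop_outcomes)
second_pass = run_branch(bht, 0x1000, loop_outcomes)
```

Note that the table in this sketch is indexed by the branch address, so a prediction cannot be read until that address is available, which is why the timing of the table access matters in the superscalar case discussed above.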
It should therefore be apparent that a need exists for an improved method and system for predicting the outcome of a branch instruction utilizing a branch history table.