In typical pipelined processors, the processing of each instruction is divided into successive stages, with each stage of an instruction's processing being handled by a specialized unit in a single cycle. Ideally, each successively earlier stage in the pipeline is simultaneously handling the next successive instruction. However, when a conditional branch instruction is encountered, there are several cycles of delay between the decoding of the branch instruction and its final execution/resolution, so it is not immediately known which instruction will be the next successive instruction. It is wasteful of computer resources, however, to wait for the resolution of a branch instruction before starting the processing of the next instruction. Therefore, it is recognized that it is advantageous to provide a mechanism for predicting the outcome of a conditional branch instruction in advance of its actual execution, in order to provisionally begin processing the instructions which will need to be processed if the prediction is correct. When the prediction is correct, the computer system can function without a delay in processing time. There is a time penalty only when a correct prediction cannot be attained ahead of time.
Throughout this application, the following terms and conventions will be used and shall have the indicated meaning. A branch instruction tests a condition specified by the instruction. If the condition is true, then the branch is taken, that is, subsequent instruction execution begins at the target address specified by the branch instruction. If the condition is false, the branch is not taken and instruction execution continues with the instruction sequentially following the branch instruction. There may also be branches that are always taken unconditionally. Such unconditional branches may simply be viewed as a special form of branch when appropriate.
A number of patents are directed to branch prediction mechanisms. For example, U.S. Pat. No. 4,370,711 to Smith discloses a branch predictor for predicting in advance the result of a conditional branch instruction in a computer system. The principle upon which the system is based is that a conditional branch instruction is likely to be decided in the same way as the instruction's most recent executions.
A simple strategy for handling branches is to suspend pipeline overlap until the branch is fully completed (i.e., resolved as taken or not taken). If the branch is taken, the target instruction is then fetched from memory. U.S. Pat. No. 3,325,785 to Stephens sets forth a static branch prediction mechanism. An improved method of this type is to perform static branch prediction by making a fixed choice, based on the type of branch and on statistical experience, as to whether the branch will be taken. When the choice indicates that the branch is predicted to be not taken, normal overlap processing is continued on a conditional basis pending the actual branch outcome. If the choice proves wrong, the conditionally initiated instructions are abandoned and the target instruction is fetched. The cycles devoted to the conditional instructions are then lost, as are the cycles needed to fetch the correct target instruction. However, the latter loss is often avoided in the prior art by prefetching the target at the time the branch is decoded.
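By way of illustration only, the static prediction strategy described above might be sketched as follows. The branch type names, the fixed choice assigned to each type, and the function names are hypothetical and are not taken from the cited patent. The handling of a branch that is predicted taken but resolves as not taken is likewise an assumption.

```python
# Hypothetical branch types and fixed choices. A real design would derive
# these from its own instruction set and from measured branch statistics.
STATIC_CHOICE = {
    "unconditional": True,       # always taken
    "loop_closing": True,        # assumed to be usually taken
    "other_conditional": False,  # assumed to be usually not taken
}


def predict_taken(branch_type):
    """Return the fixed (static) prediction for this type of branch."""
    return STATIC_CHOICE[branch_type]


def on_resolution(branch_type, actually_taken):
    """Describe what happens to conditionally started work at resolution."""
    predicted = predict_taken(branch_type)
    if predicted == actually_taken:
        return "prediction correct: keep conditionally started work"
    if actually_taken:
        # Predicted not taken, but the branch was taken.
        return "abandon sequential instructions, fetch target instruction"
    # Predicted taken, but the branch fell through (assumed handling).
    return "abandon target-path instructions, resume sequential fetch"
```

In this sketch the prediction never changes at run time, which is what distinguishes a static choice from the history based methods discussed below.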
A more sophisticated strategy is embodied in U.S. Pat. No. 3,559,183 to Sussenguth, which patent is assigned to the assignee of the present invention. It is based on the observation that the outcome of most branches, considered individually, tends to repeat. In this strategy, a history table of taken branches is constructed, which is known as a Branch History Table (BHT). Each entry in the table consists of the address of a taken branch followed by the target address of the branch. The table is a hardware construct and so it has a predetermined size. When the table is full, making a new entry requires displacing an older entry. This can be accomplished by a Least-Recently-Used (LRU) policy as in caches. When a branch is resolved as taken during execution, the history information associated with the branch is inserted into or updated in the BHT. Branch prediction and instruction prefetching are accomplished through constant search for the next taken branches in the history table. Upon final resolution/execution of a branch, any incorrect history information associated with the branch will be reset/updated properly. The major benefit of a BHT is to allow a separate branch processing unit to prefetch instructions into the instruction buffer (I-Buffer) ahead of the instruction decode stage. Such instruction prefetching into the I-buffer past predicted taken branches is possible due to the recording of target addresses for taken branches in the BHT. U.S. Pat. No. 4,679,141 to Pomerene et al, which patent is assigned to the assignee of the present invention, improves the BHT design by recording more history information in a hierarchical manner.
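By way of illustration only, a Branch History Table of the kind described above might be sketched in software as follows. The class and method names are hypothetical. The sketch also assumes that a lookup hit counts as a use for LRU purposes and that stale history for a branch resolved as not taken is simply deleted. Neither detail is specified by the cited patent, and an actual BHT is a hardware structure.

```python
from collections import OrderedDict


class BranchHistoryTable:
    """Fixed-capacity table of taken branches with LRU replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Maps taken-branch address -> target address, oldest use first.
        self.entries = OrderedDict()

    def predict(self, instruction_address):
        """Return the recorded target if the address matches a taken branch."""
        target = self.entries.get(instruction_address)
        if target is not None:
            # Assumption: a hit marks the entry as most recently used.
            self.entries.move_to_end(instruction_address)
        return target

    def resolve(self, branch_address, taken, target_address=None):
        """Insert, update, or correct history once the branch has executed."""
        if taken:
            self.entries[branch_address] = target_address
            self.entries.move_to_end(branch_address)
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # displace the LRU entry
        else:
            # Assumption: incorrect "taken" history is reset by deletion.
            self.entries.pop(branch_address, None)


bht = BranchHistoryTable(capacity=2)
bht.resolve(0x100, taken=True, target_address=0x200)
bht.resolve(0x140, taken=True, target_address=0x300)
bht.predict(0x100)                                    # hit: 0x100 becomes most recent
bht.resolve(0x180, taken=True, target_address=0x400)  # displaces 0x140
```

An address with no matching entry yields no prediction of a taken branch, so prefetching would continue sequentially past that address.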
U.S. Pat. No. 4,477,872 to Losq et al, which patent is assigned to the assignee of the present invention, proposes a decode time branch prediction mechanism called a Decode History Table (DHT). The DHT mechanism improves the decode time static branch prediction methods of U.S. Pat. No. 3,325,785, to Stephens, by employing a hardware table to record simple histories of conditional branches. In the simplest form a DHT consists of a bit vector of fixed length. For each conditional branch instruction executed, a bit position in the DHT is derived through a fixed hashing algorithm, and the corresponding bit in the DHT records the outcome of the execution, indicating whether the branch was taken or not taken. Similar to U.S. Pat. No. 3,325,785, the DHT method allows overlap processing on a conditional basis past the decode of a conditional branch instruction if the branch is predicted, based on the DHT history, as not taken.
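By way of illustration only, the simplest form of DHT described above, a fixed-length bit vector indexed by a hash of the branch address, might be sketched as follows. The class name, the default table size, and the particular hash (dropping the low-order address bit and taking the remainder modulo the table size) are assumptions chosen for the sketch and are not taken from the cited patent.

```python
class DecodeHistoryTable:
    """Bit vector recording the last outcome of conditional branches."""

    def __init__(self, size=1024):
        self.size = size
        self.bits = [False] * size  # False = not taken, True = taken

    def _index(self, branch_address):
        # Hypothetical fixed hash: drop the low-order bit, then fold the
        # address into the table by taking the remainder modulo its size.
        return (branch_address >> 1) % self.size

    def predict_taken(self, branch_address):
        """Predict at decode time from the recorded history bit."""
        return self.bits[self._index(branch_address)]

    def record(self, branch_address, taken):
        """Record the actual outcome once the branch has executed."""
        self.bits[self._index(branch_address)] = bool(taken)


dht = DecodeHistoryTable(size=8)
dht.record(0x10, taken=True)
dht.predict_taken(0x10)  # True: predicted taken from its recorded history
dht.predict_taken(0x20)  # also True: 0x20 hashes to the same bit as 0x10
```

Because no address is stored in the table, two branches whose addresses hash to the same bit position share one history bit, as the last line of the usage example shows.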
The common technique for the above cited branch prediction methods that are based on the dynamic histories of branches is to first record the previous outcome of branch instructions in a history based dynamic table and to then use such recorded histories for predicting the outcome of subsequently encountered branch instructions. Each branch recorded in such a history based table is recorded with either implicit or explicit information about the address of the recorded branch instruction so that the addresses of later encountered instructions can be correlated against the recorded information (i.e., by using the address of the instruction which is potentially a taken branch instruction in order to access the table for historical branch information). In order for branches to be predicted, the history table is checked for a relevant entry by correlating the address of the instruction to be predicted against the implicitly or explicitly recorded address information of recorded branch instructions. In the DHT method, the bit position in the history vector is derived through hashing from the address of the conditional branch. In the BHT approach, an instruction is predicted to be a conditional branch which is taken if there is a match of the instruction address with a taken branch address found in an entry in the BHT and the target address recorded in this found entry is predicted as the current branch target address.
Numerous variations and improvements have been proposed in implementing a BHT. For example, in U.S. Pat. No. 4,679,141 to Pomerene et al, a technique is described for recording histories by block (e.g., doubleword) addresses instead of by individual branch instruction addresses. This technique offers advantages in reducing cache fetch traffic and the possibility of identifying the outcome of multiple branches within a single block. However, through more complex tags at each BHT entry, the block recording technique still conceptually operates as in conventional BHT methods in terms of identifying taken branch history by matching the addresses of the currently concerned instructions against the recorded addresses (or more precisely matching a portion of each such address) of the branch instructions recorded in the block.
U.S. Pat. No. 3,940,741 to Horikoshi et al sets forth an information processing device for processing instructions, including branches. A cache-like route memory is provided for storing branch target addresses of a plurality of taken branch instructions and the branch target instructions (code) themselves in corresponding relationship to the branch target addresses. The route memory is referenced by the target address of a branch instruction, and the branch target instruction at the corresponding branch target address is read out. The Horikoshi et al patent utilizes the target address of the branch, which is known upon decoding of the branch, to retrieve target instruction code for decode if the target address is recorded as a previous target for a taken branch in the route memory. Such a mechanism generally requires some delay before the access to the route memory due to the address formation for the branch target.
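By way of illustration only, the lookup order of such a route memory might be sketched as follows. The class and function names are hypothetical, the base-plus-displacement address formation is an assumed addressing mode, and the unbounded dictionary stands in for what would in practice be a cache-like hardware array of limited size.

```python
class RouteMemory:
    """Maps branch target addresses to the instruction code at the target."""

    def __init__(self):
        self.entries = {}  # target address -> target instruction code

    def record_taken_branch(self, target_address, target_instruction):
        """Store the target instruction of a branch that was taken."""
        self.entries[target_address] = target_instruction

    def lookup(self, target_address):
        """Return the stored target instruction, or None if not recorded."""
        return self.entries.get(target_address)


def decode_branch(route_memory, base, displacement):
    """Form the target address first, then reference the route memory."""
    # Address formation must complete before the route memory can be
    # accessed, which is the source of the delay noted above.
    target_address = base + displacement
    return target_address, route_memory.lookup(target_address)


route_memory = RouteMemory()
route_memory.record_taken_branch(0x2000, "LOAD R1")
decode_branch(route_memory, base=0x1F00, displacement=0x100)  # finds "LOAD R1"
```

The sketch is keyed by the target address rather than by the address of the branch itself, so the lookup cannot begin until the branch has been decoded and its target address formed.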
In practical implementations of branch prediction based on histories, timing is often found to be a critical factor. History table access generally involves address calculations and slower array lookup operations. In BHT type implementations, searching constantly and efficiently for potentially taken branches also involves complexity in the recording of history entries. Considering all of the tasks which need to be accomplished in order to make a prediction and to utilize it to advantage, it is desirable for practical reasons to be able to start the prediction process with respect to a particular instruction of interest as far in advance as possible and also to achieve the prediction as far in advance as possible. Nevertheless, there is no known art that offers the capability of either making a prediction decision or even initiating the prediction decision process with respect to an instruction which is potentially a taken branch instruction prior to identifying the address of that instruction of interest. It would be desirable to be able to predict instruction branches even earlier than the point at which the address of an instruction that has the potential of being a taken branch instruction is known, because doing so would offer an opportunity to implement and use branch prediction with simpler and less costly circuits.