1. Field of the Invention
The present invention generally relates to methods of predicting branching of branch instructions and processors employing such methods, and particularly relates to a method of predicting branching of branch instructions based on PHT (pattern history table) and a processor employing such a method.
2. Description of the Related Art
In processors based on pipeline operations, waiting for branch results to be known before jumping to branch addresses results in the delay of instruction fetch timing, causing disturbance in the pipeline operation. It is thus necessary to predict branching before actually executing branch instructions, thereby making it possible to fetch instructions in a continuous stream in accordance with the pipeline operations.
Branch instructions often have a locally lopsided tendency in branch directions, such that a given branch instruction, of itself, is likely to branch or is not likely to branch. Further, branch instructions often have a globally lopsided tendency in branch directions, such that a given branch instruction is likely to branch or is not likely to branch depending on the branching results of recently executed branch instructions. A PHT (pattern history table) provides a highly accurate prediction by taking into account both the local tendency and the global tendency in branch directions.
FIG. 1 is a block diagram showing a configuration of a related-art branch prediction mechanism based on the PHT.
The branch prediction mechanism of FIG. 1 includes an XOR circuit 11, a GHR unit 12, and a PHT unit 13. The GHR (global history register) unit 12 is a register that stores the history of recently executed branch instructions as to whether or not they branched. When a given branch instruction branches, the contents of the register are shifted one bit to the left, with “1” being inserted into the least significant bit. When a given branch instruction does not branch, the contents of the register are shifted one bit to the left, with “0” being inserted into the least significant bit. For example, the GHR unit 12 may be 6 bits in length, with its current contents being “011001”. If the execution of a given branch instruction results in branching, the contents of the GHR unit 12 are shifted one bit to the left, and “1” is inserted into the least significant bit. As a result, the contents of the GHR unit 12 in this case become “110011”. Read from the most significant bit, “110011” indicates branching taking place 6 branch instructions ago, branching taking place 5 branch instructions ago, no branching taking place 4 branch instructions ago, no branching taking place 3 branch instructions ago, branching taking place 2 branch instructions ago, and branching taking place for the last branch instruction.
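The shift-and-insert update of the GHR unit 12 may be illustrated by the following minimal sketch. It is an illustration only and not the circuit of FIG. 1; the function name `update_ghr` and the parameter `width` are hypothetical, and the 6-bit width is taken from the example above.

```python
def update_ghr(ghr: int, taken: bool, width: int = 6) -> int:
    """Shift the history one bit to the left and insert the newest outcome.

    `ghr` holds the history as an integer whose least significant bit is
    the most recent branch (1 = branched, 0 = did not branch).
    The name and the `width` parameter are assumptions for illustration.
    """
    mask = (1 << width) - 1              # keep only `width` bits of history
    return ((ghr << 1) | int(taken)) & mask


# Example from the text: history "011001" followed by a branch that branches.
ghr = update_ghr(0b011001, taken=True)
print(format(ghr, "06b"))  # prints 110011
```

The oldest outcome falls off the most significant end because of the mask, so the register always reflects exactly the last `width` branch outcomes.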
The XOR circuit 11 performs an Exclusive-OR operation between the contents of the GHR unit 12 and a branch instruction address that is indicated by a program counter 10 as a next instruction to be executed. The obtained Exclusive-OR value is supplied to the PHT unit 13 as an index.
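The index calculation performed by the XOR circuit 11 may be sketched as follows. The function name `pht_index` is hypothetical, and the use of only the low-order 6 bits of the branch instruction address is an assumption made so that the widths match the 6-bit history used in the examples of this description.

```python
def pht_index(ghr: int, branch_address: int, width: int = 6) -> int:
    """Exclusive-OR the global history with the branch instruction address.

    Only the low `width` bits are kept (an assumption for illustration),
    so the result can be used directly as an index into the PHT.
    """
    mask = (1 << width) - 1
    return (ghr ^ branch_address) & mask


# History "110011" and branch instruction address "001000".
index = pht_index(0b110011, 0b001000)
print(format(index, "06b"))  # prints 111011
```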
The PHT unit 13 is a RAM (random access memory) that stores a count value for each index, where the count value may consist of 2 bits, for example. Each index is an Exclusive-OR value between the contents of the GHR unit 12 and a branch instruction address indicated by the program counter 10. The 2-bit count value that is the entry corresponding to each index is the prediction used when the corresponding index is hit. When the count value is 0 or 1, no branching is predicted. When the count value is 2 or 3, branching is predicted.
If the contents of the GHR unit 12 are “110011”, and the branch instruction address is “001000”, for example, the index will be “111011”. The 2-bit count value corresponding to this index “111011” is referred to, and may be found to be 2, for example. Since a count value of 2 or 3 indicates branching as described above, the branch instruction at the instruction address “001000” is predicted to branch. If the instruction actually branches when executed, the count value is incremented by 1. If the instruction does not branch when executed, the count value is decremented by 1. Accordingly, the count value will be 3 in the case of actual branching of the instruction.
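The prediction and update of a 2-bit count value may be sketched as follows. The names `predict_taken`, `update_count`, and `pht` are hypothetical. Saturation at 0 and 3 is an assumption: the description states only that the count is incremented or decremented by 1, but a 2-bit value cannot go beyond this range.

```python
def predict_taken(count: int) -> bool:
    """A count value of 2 or 3 predicts branching; 0 or 1 predicts none."""
    return count >= 2


def update_count(count: int, taken: bool) -> int:
    """Increment on branching, decrement otherwise.

    Clamping to the range 0..3 is an assumption made for this sketch.
    """
    if taken:
        return min(count + 1, 3)
    return max(count - 1, 0)


# A 64-entry table for 6-bit indexes; the initial value 2 is arbitrary.
pht = [2] * 64
index = 0b110011 ^ 0b001000              # "111011"

prediction = predict_taken(pht[index])   # count 2, so branching is predicted
pht[index] = update_count(pht[index], taken=True)
print(prediction, pht[index])            # prints True 3
```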
After this, the branch instruction at the same branch instruction address “001000” may be executed again while the GHR unit 12 has the same contents “110011” as before. In this case, the index will be “111011”, which is the same as before. Since the count value is 3, it is predicted that the branch instruction will branch. As previously described, the contents of the GHR unit 12 are the history of outcomes of recently executed branch instructions. As a result, if the same branch instruction is executed under the same conditions of recent branch outcomes, the branch result is accumulated in the same index. When the same index is referred to on a next occasion, the count value accumulated in this manner will be used for branch prediction.
The contents of the GHR unit 12 may be “110010”, for example, illustrating a case in which the history of outcomes of recently executed branch instructions is slightly different from the history of the previous example. This corresponds to a case in which only the outcome of the last branch instruction differs from the history “110011”. When the branch instruction at the same branch instruction address “001000” is to be executed, the index will be “111010”. In this manner, this index will have branch outcomes accumulated therein when the branch instruction at the branch instruction address “001000” is executed under the previous branch conditions that are indicated by the history “110010”.
Accordingly, if only one branch instruction is present in a program, indexes will accumulate the outcomes of this single branch instruction with respect to respective branch histories. This achieves a highly accurate prediction by taking into account each one of the branch histories. If more than one branch instruction is present in a program, however, the outcomes of different branch instructions interfere with each other in the PHT unit 13, thereby degrading prediction accuracy. For example, if the contents of the GHR unit 12 are “111010” and the branch instruction at the branch instruction address “000001” is to be executed, the index will be “111011”. This index is identical to the index that is used when the contents of the GHR unit 12 are “110011” and the branch instruction address is “001000”. In this manner, the method of calculating an index by the XOR circuit 11 results in the shared use of an index by different branch instructions, which results in interference between records of branch outcomes and a resulting degradation of prediction accuracy.
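The interference described above may be reproduced with a short sketch. The function name `pht_index` and the 6-bit width are assumptions for illustration; the history and address values are those given in the text.

```python
def pht_index(ghr: int, branch_address: int, width: int = 6) -> int:
    """Exclusive-OR index, truncated to `width` bits (illustrative only)."""
    mask = (1 << width) - 1
    return (ghr ^ branch_address) & mask


# Two different branch instructions under two different histories.
first = pht_index(0b110011, 0b001000)
second = pht_index(0b111010, 0b000001)

print(format(first, "06b"), format(second, "06b"))  # prints 111011 111011
print(first == second)                              # prints True
```

Because both cases select the same entry, an outcome recorded for one branch instruction shifts the count value that is later used to predict the other.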
In order to avoid the degradation of prediction accuracy, indexes may be generated by concatenating the contents of the GHR unit 12 and the contents of the program counter 10. If the contents of the GHR unit 12 are “110011” and the branch instruction address is “001000”, for example, the index is generated as “110011001000”. In such a configuration, however, the number of entries in the RAM of the PHT unit 13 greatly increases. As a matter of fact, the number of entries in this example increases 64-fold (= 2^6), from 2^6 entries to 2^12 entries.
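The concatenated index and its cost in table size may be sketched as follows. The function name `concat_index` and the fixed 6-bit widths are assumptions taken from the example values in this description.

```python
def concat_index(ghr: int, branch_address: int, width: int = 6) -> int:
    """Place the history in the upper bits and the address in the lower bits."""
    mask = (1 << width) - 1
    return ((ghr & mask) << width) | (branch_address & mask)


# The two cases that share the Exclusive-OR index "111011" now differ.
a = concat_index(0b110011, 0b001000)
b = concat_index(0b111010, 0b000001)
print(format(a, "012b"))  # prints 110011001000
print(format(b, "012b"))  # prints 111010000001

# Table sizes: a 6-bit index versus a 12-bit index.
xor_entries = 1 << 6
concat_entries = 1 << 12
print(concat_entries // xor_entries)  # prints 64
```

The sketch shows the trade-off stated in the text: the interference disappears, but only because the table has 64 times as many entries.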
In a configuration in which entries in the PHT used for branch prediction interfere with each other as described above, the accuracy of branch prediction undesirably decreases. It is undesirable, however, to excessively increase the memory volume of the PHT for the purpose of improving the prediction accuracy. A desirable configuration is one that enhances the prediction accuracy as much as possible with as small a memory volume as possible.
Accordingly, there is a need for a method of and an apparatus for predicting branching based on the PHT that improves prediction accuracy as much as possible with as small a memory volume as possible by avoiding entry interference.