This invention relates to apparatus and methods for dispatching instructions in a processor, and to such a processor. In particular, the present invention relates to the handling and issue of branch instructions.
In a pipelined processor, there is a penalty for executing control-flow (branch) instructions. In particular, for conditional branches where the value of a condition is not known at the time of instruction issue, either the issue must be stalled until the information becomes available, or the instruction must be issued speculatively based on an assumed value.
Many different approaches to the handling of conditional branch instructions are known in the prior art.
A general description of pipeline processing architectures and the handling of branch instructions is to be found, for example, in "Advanced Computer Architectures: A Design Space Approach", by Messrs D Sima, T Fountain and P Kacsuk, published by Addison Wesley Longman Limited in 1997 (ISBN 0-201-42291-3). Various aspects of parallel processing architectures are described. These include architectures having multiple execution units and parallel decoding of instructions, for example superscalar processors, as well as aspects of dependency checking, etc., associated therewith. Chapter 8 of that book, on pages 295-368, is directed to the processing of control transfer instructions, including the handling of unresolved conditional branches. Three basic approaches are identified, namely blocking branch processing, speculative branch processing, and multiway branch processing.
Blocking branch processing is a trivial approach to coping with unresolved conditional branches whereby, on detection of a conditional branch, the conditional branch is simply stalled until the specified condition can be resolved. Although this approach is simple to implement, it is inefficient because processing stalls until the condition on which the branch is based has been resolved.
With speculative branch processing, on detection of an unresolved conditional branch, a guess is made as to the outcome of the condition and execution continues speculatively along the guessed path. If it is subsequently determined that the correct guess was made, the speculative execution can be confirmed and then continued. However, if an incorrect guess was made, all of the speculatively executed instructions have to be discarded and execution restarted along the correct path. This approach offers higher performance than blocking branch processing. However, there is still a penalty to be paid when an incorrect guess is made, due to the need to restart processing along the correct path. Various approaches are used to make the "guess" as to which path to execute speculatively following an unresolved conditional branch.
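By way of illustration only, the difference between the blocking and speculative approaches can be sketched as a simple cycle count. The function names, the resolve latency and the flush penalty below are hypothetical values chosen for the example and are not taken from any particular processor.

```python
def blocking_stall_cycles(outcomes, resolve_latency):
    """Blocking: every conditional branch stalls issue until its
    condition is resolved, whatever the outcome turns out to be."""
    return len(outcomes) * resolve_latency


def speculative_wasted_cycles(outcomes, guesses, flush_penalty):
    """Speculative: only a wrong guess costs cycles, because the
    speculatively executed instructions must then be discarded."""
    wrong = sum(1 for actual, guess in zip(outcomes, guesses) if actual != guess)
    return wrong * flush_penalty


# Four branches, of which the third is not taken (assumed data).
outcomes = [True, True, False, True]
guesses = [True, True, True, True]  # always guess "taken"

print(blocking_stall_cycles(outcomes, resolve_latency=3))         # 12
print(speculative_wasted_cycles(outcomes, guesses, flush_penalty=5))  # 5
```

In this sketch the speculative scheme wastes fewer cycles even though a single wrong guess costs more than a single stall, which is the trade-off described above.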
The simplest approach is to employ a fixed prediction, whereby the same guess is always made, either taking the branch path, or not taking the branch path. This unsophisticated approach makes use of the time during resolution of the condition on which the branch is to be based by speculatively executing instructions, but makes no attempt to assess the relative merits of the individual paths. A more sophisticated approach is to make a true prediction, either in a static manner on the basis of the object code to be executed, or dynamically on the basis of an execution history. Although the use of true prediction improves the chance of selecting the correct path for speculative execution, it does not overcome the problem of having to execute the alternative path from the branch point when an incorrect guess is made.
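As a minimal sketch of the distinction between fixed prediction and dynamic prediction based on execution history, the following compares an always-the-same guess with a two-bit saturating counter. The two-bit counter is one commonly used history-based scheme and is assumed here purely for illustration; the function names and the sample branch outcomes are hypothetical.

```python
def run_fixed(outcomes, always_taken=True):
    """Count correct guesses when the same guess is made every time."""
    return sum(1 for taken in outcomes if taken == always_taken)


def run_two_bit(outcomes, state=2):
    """Count correct guesses for a 2-bit saturating counter (0..3).

    States 2 and 3 predict "taken"; states 0 and 1 predict "not taken".
    """
    correct = 0
    for taken in outcomes:
        prediction = state >= 2
        if prediction == taken:
            correct += 1
        # Move the counter toward the actual outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct


# Assumed history: eight not-taken branches followed by two taken ones.
outcomes = [False] * 8 + [True] * 2

print(run_fixed(outcomes))    # 2 correct out of 10
print(run_two_bit(outcomes))  # 7 correct out of 10
```

Even the history-based predictor guesses wrongly three times on this sequence, and each wrong guess still incurs the cost of restarting along the alternative path.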
Multiway branch processing overcomes the performance disadvantages described with respect to speculative branch processing at the cost of duplicating the instruction issue and execution hardware. In other words, multiple sets of instruction issue, dispatch and execution units, and typically multiple instruction buffers, are provided in order to enable multiple instruction sequences following a branch to be executed in parallel during resolution of the condition for the branch. On resolution of the condition, processing of the path not required is simply halted and processing continues along the path determined by the condition. Although multiway branch processing does overcome the performance disadvantages of the speculative branch processing described above, it requires an extensive investment in hardware. Also, where multiple branch instructions occur in close proximity within a code sequence, mere duplication of the necessary hardware may be insufficient to provide a significant performance enhancement, and accordingly more than two sets of instruction issue, dispatch and execution units may be required.
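The growth in hardware demand when branches occur in close proximity can be illustrated with a simple count of the paths executing in parallel. The sketch below assumes, for the purposes of the example only, that every live path encounters each new conditional branch and that resolving a condition halves the number of live paths; the function name and the event encoding are hypothetical.

```python
def peak_live_paths(events):
    """Return the peak number of instruction paths executing in parallel.

    events is a string in which 'B' marks a conditional branch reached on
    every live path (each path forks in two) and 'R' marks the resolution
    of an outstanding condition (half of the live paths are halted).
    """
    live = peak = 1
    for event in events:
        if event == "B":
            live *= 2
        elif event == "R" and live > 1:
            live //= 2
        peak = max(peak, live)
    return peak


print(peak_live_paths("BRBR"))  # 2: each branch resolves before the next
print(peak_live_paths("BBR"))   # 4: a second branch arrives while the first is unresolved
print(peak_live_paths("BBBR"))  # 8
```

Under these assumptions, two sets of issue, dispatch and execution units suffice only when each condition is resolved before the next branch is reached.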
Accordingly, an object of the present invention is to mitigate the disadvantages of speculative branch processing without requiring the additional investment in hardware required by multiway branch processing.