1. Field of the Invention
The present invention generally relates to a branch prediction apparatus for a central processing unit (CPU), and more particularly, to a recovery apparatus for resolving a branch mis-prediction, and to a method and a CPU thereof.
2. Description of Related Art
Along with the development of semiconductor technology, the computer has become one of the most indispensable tools in daily life, and people use computers to accomplish many tasks by executing various computer programs. A central processing unit (CPU) is the most important part of a computer. Most high-performance CPUs have a branch prediction apparatus for processing branch instructions.
Generally, about one in every four or five instructions of a program is a branch instruction. Thus, the performance of a computer can be improved by using a CPU with a branch prediction apparatus. However, a branch prediction apparatus cannot always correctly predict the next instruction to be executed when the CPU processes a branch instruction. Thus, the branch prediction apparatus may produce a branch mis-prediction, which may result in a loss of CPU performance. Such a loss in CPU performance is usually referred to as the branch penalty.
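The cost of the branch penalty can be illustrated with a rough estimate. The following is a minimal sketch only: the function name `effective_cpi`, the simple additive model, and the example mis-prediction rate and penalty length are assumptions made for illustration and are not taken from this disclosure. Only the branch frequency (one in five instructions) follows the text above.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """Estimate average cycles per instruction including the branch penalty.

    Assumed model: every mis-predicted branch adds a fixed number of
    stall cycles on top of the ideal (base) cycles per instruction.
    """
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


if __name__ == "__main__":
    # Assumed values: ideal CPI of 1.0, one branch per five instructions,
    # 10% of branches mis-predicted, 10 cycles lost per mis-prediction.
    print(effective_cpi(1.0, 0.2, 0.1, 10))
```

Under these assumed values the estimate is roughly 1.2 cycles per instruction, that is, about a 20% slowdown relative to the ideal case. A longer pipeline would correspond to a larger `penalty_cycles` value in this model.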
In order to resolve the foregoing problem, many academic institutions and CPU manufacturers have focused on reducing the probability of branch mis-predictions. Accordingly, many prediction algorithms and branch prediction architectures have been provided.
However, the foregoing software algorithms and hardware architectures can only reduce the probability of branch mis-predictions; a branch mis-prediction still inevitably occurs at a conditional branch or at the final iteration of a loop in a program. Thus, a multi-path execution CPU architecture has been provided to resolve the problem of branch mis-prediction.
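The mis-prediction at the final iteration of a loop can be illustrated with a small simulation. The sketch below uses a two-bit saturating counter, a commonly described prediction scheme chosen here purely as an example; it is not stated to be the predictor of this disclosure, and the names `count_mispredictions` and `loop_outcomes` are hypothetical.

```python
def count_mispredictions(outcomes, state=2):
    """Count misses of a two-bit saturating counter predictor.

    outcomes: sequence of booleans, True meaning the branch was taken.
    state: counter value 0..3; the branch is predicted taken when state >= 2.
    """
    misses = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken != taken:
            misses += 1
        # Move toward 3 on a taken branch, toward 0 otherwise (saturating).
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return misses


def loop_outcomes(iterations, runs):
    """Outcomes of a loop-closing branch: taken on every iteration
    except the last, repeated for the given number of loop runs."""
    return ([True] * (iterations - 1) + [False]) * runs


if __name__ == "__main__":
    # A 10-iteration loop executed 3 times.
    print(count_mispredictions(loop_outcomes(10, 3)))
```

In this sketch the predictor is correct on every taken iteration but misses once per loop run, at the exit, so the example reports three misses for three runs of the loop. A better predictor can lower the miss rate but, in this simple model, the loop exit is still mis-predicted.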
However, the multi-path execution CPU can only process one branch mis-prediction. Accordingly, a confidence estimator is required to help the multi-path execution CPU fetch instructions on multiple paths. In addition, the multi-path execution CPU completes the instructions on two paths simultaneously. As a result, a register renaming mechanism is further required for handling data dependencies and register commitment.
Presently, the multi-path execution CPU architecture remains at the theoretical stage and has not been implemented in commercial hardware due to its high complexity. In addition, according to the related documents and research, the multi-path execution CPU architecture improves performance by only about 10%.
Besides using a multi-path execution CPU for processing multi-path executions, a multi-threading CPU, which allows a compiler to process multi-path executions with two threads, has also been adopted.
However, the multi-path execution method or architecture cannot be implemented in a deeply pipelined superscalar CPU due to its high complexity. Moreover, in order to improve performance, the foregoing architecture or method requires a confidence estimator or additional compiler effort to execute multi-path instructions.