1. Field of the Invention
The present invention relates to a microprocessor, e.g., a superscalar microprocessor which includes a BTB (Branch Target Buffer) for branch prediction; and, more particularly, to a branch prediction apparatus and method that efficiently fetch the target address of a branch instruction by constructing a single BTB entry based on the branch instruction and by accessing the BTB before the branch instruction is fetched.
2. Description of the Prior Art
The latest high-performance microprocessors having a superscalar structure adopt an instruction pipe-line to increase performance. As the number of pipe-line stages increases, the execution cycle of the microprocessor is shortened, so that performance can be enhanced. However, because the branch delay cycle incurred during the execution of branch instructions also grows, the branch penalty increases and the overall performance of the processor is degraded. To reduce the branch penalty, static methods employing software and dynamic methods employing hardware have generally been suggested. Although its hardware cost is low, the conventional static method cannot reduce the branch penalty sufficiently and cannot maintain software compatibility. On the other hand, although its implementation cost is high, the dynamic method is employed in current processors, e.g., the Pentium manufactured by Intel Corporation, in that it can maintain software compatibility and reduce the branch penalty cycle sufficiently.
The BTB operates as an independent branch instruction cache which stores the instruction pointer (hereinafter referred to as IP) of an instruction and the IP of the predicted branch target instruction. Therefore, by referring to these IPs, it is possible to fetch the predicted branch target instruction during the branch delay cycle, and the execution of the branch instruction can be completed within a unit pipe-line cycle in the case that the predicted branch path is the same as the actual execution result.
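The BTB behavior described above can be illustrated by the following minimal sketch. It is not taken from any particular processor: the direct-mapped organization, the entry count, the 4-byte instruction size, and the names BTBEntry, BranchTargetBuffer, and next_fetch_ip are all assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional

INSTR_SIZE = 4  # assumed fixed instruction size in bytes (illustrative only)


@dataclass
class BTBEntry:
    tag: int     # IP of the instruction that owns this entry
    target: int  # IP of the predicted branch target instruction


class BranchTargetBuffer:
    """Hypothetical direct-mapped BTB: one entry per index, tagged by IP."""

    def __init__(self, num_entries: int = 16) -> None:
        self.num_entries = num_entries
        self.entries: List[Optional[BTBEntry]] = [None] * num_entries

    def _index(self, ip: int) -> int:
        # Drop the byte offset within an instruction, then wrap to table size.
        return (ip // INSTR_SIZE) % self.num_entries

    def lookup(self, ip: int) -> Optional[int]:
        """Return the predicted target IP on a hit, or None on a miss."""
        entry = self.entries[self._index(ip)]
        if entry is not None and entry.tag == ip:
            return entry.target
        return None

    def update(self, ip: int, target: int) -> None:
        """Record (or overwrite) the predicted target for the given IP."""
        self.entries[self._index(ip)] = BTBEntry(tag=ip, target=target)


def next_fetch_ip(btb: BranchTargetBuffer, ip: int) -> int:
    """On a BTB hit fetch from the predicted target, otherwise fall through."""
    target = btb.lookup(ip)
    return target if target is not None else ip + INSTR_SIZE
```

In this sketch a hit lets the fetch unit redirect to the predicted target without waiting for the branch to execute, whereas a miss simply continues with the sequential IP.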
Also, in the latest high-performance microprocessors, in order to fetch the target instruction more rapidly, the BTB is accessed based on a previous IP instead of the IP of the branch instruction. However, the problem with this conventional method is that, in a superscalar microprocessor which fetches several instructions in a single cycle, the IP used to access the BTB changes whenever the instruction fetch sequence changes, so that the BTB entries become invalid.
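The problem described above can be illustrated by a simplified sketch. The following assumptions are made for illustration only and are not stated in the description: each fetch group starts at the current fetch IP (no alignment), two 4-byte instructions are fetched per cycle, no other taken branch lies between the entry point and the branch, and the BTB is modeled as a plain dictionary keyed by the start IP of the fetch group preceding the one that contains the branch. The function name previous_fetch_ip is hypothetical.

```python
from typing import Dict, Optional


def previous_fetch_ip(entry_ip: int, branch_ip: int,
                      fetch_width: int = 2, instr_size: int = 4) -> Optional[int]:
    """Start IP of the fetch group immediately before the group holding branch_ip.

    Returns None when the branch already lies in the first fetch group.
    Assumes sequential fetch from entry_ip with unaligned fetch groups.
    """
    group_bytes = fetch_width * instr_size
    prev: Optional[int] = None
    ip = entry_ip
    # Advance group by group until the current group contains the branch.
    while ip + group_bytes <= branch_ip:
        prev = ip
        ip += group_bytes
    return prev


BRANCH_IP = 0x110   # IP of the branch instruction (illustrative value)
TARGET_IP = 0x200   # its predicted target (illustrative value)

# Fetch sequence A enters the code at 0x100; the BTB is trained with the
# previous IP seen on this sequence.
key_a = previous_fetch_ip(0x100, BRANCH_IP)
btb: Dict[int, int] = {key_a: TARGET_IP}

# Fetch sequence B reaches the same branch after entering at 0x104, so the
# fetch groups are shifted and the previous IP differs.
key_b = previous_fetch_ip(0x104, BRANCH_IP)

hit_on_a = btb.get(key_a)  # entry found: prediction available
hit_on_b = btb.get(key_b)  # entry not found: same branch, different access IP
```

Under these assumptions the same branch is reached with two different access IPs, so the entry trained on one fetch sequence cannot be found on the other.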