Instruction pre-fetching is a known method of increasing the processing speed of a microprocessor. In an ordinary sequential computer, a group of instructions stored at sequential addresses is executed one after another. In an instruction pre-fetch system, by contrast, an instruction located several instructions ahead, which is expected to be needed, is fetched in parallel with the decoding and execution of the preceding instruction.
In other words, an instruction pre-fetched from a main memory or a cache is stored in a small-capacity instruction pre-fetch buffer (queue buffer) that allows high-speed access, in an attempt to effectively reduce the execution delay caused by memory access at the time of the instruction fetch.
In the conventional technique, when a branch instruction is executed, pre-fetching and the execution of the succeeding instructions are stopped and the queue buffer is flushed (cleared), irrespective of the branch destination address; the instructions stored so far are thereby nullified. Execution is resumed only after a new pre-fetch has been made from the branch destination address and the branch destination instruction has been stored in the queue buffer.
Thus, in the conventional technique, the queue buffer is flushed whenever a branch instruction is executed. As a result, the number of pre-fetches increases, the pipeline process is disturbed, and high-speed operation cannot be achieved.
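The cost of this unconditional flush can be illustrated with a minimal software sketch. It is not a description of any actual circuit: the names `refill` and `branch_with_flush`, the four-entry capacity, and the simplification of one instruction per address are all assumptions made for illustration.

```python
from collections import deque

def refill(queue, memory, next_addr, capacity):
    """Pre-fetch sequential instructions until the queue buffer is full.

    Returns the next address to pre-fetch and the number of memory reads made.
    (Hypothetical helper; one instruction per address is assumed.)
    """
    reads = 0
    while len(queue) < capacity and next_addr < len(memory):
        queue.append((next_addr, memory[next_addr]))
        next_addr += 1
        reads += 1
    return next_addr, reads

def branch_with_flush(queue, memory, target, capacity):
    """Conventional handling: always flush, then pre-fetch from the target."""
    discarded = len(queue)   # entries nullified by the flush
    queue.clear()
    next_addr, reads = refill(queue, memory, target, capacity)
    return next_addr, reads, discarded

memory = [f"insn{addr}" for addr in range(16)]
queue = deque()
next_addr, _ = refill(queue, memory, 0, capacity=4)   # queue holds addresses 0..3
queue.popleft()                                        # the branch at address 0 is executed
next_addr, reads, discarded = branch_with_flush(queue, memory, target=2, capacity=4)
```

In this run the instruction at address 2, which the branch jumps to, was already in the queue buffer, yet three entries are discarded and four new memory reads are issued before execution can resume.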
Japanese Patent Application Laid-Open No. 7-73034 discloses a technique in which the branch destination address at the time of executing a branch instruction is compared with the address range of the instructions held in the queue buffer; when the branch destination address lies within that range, the instructions in the queue buffer are used without flushing the queue buffer, which makes it possible to reduce the number of pre-fetches after the branch.
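The range comparison can be sketched in software as follows. This is only an illustrative model under stated assumptions, not the circuit of the cited publication: the function name `branch_with_range_check`, the four-entry capacity, one instruction per address, and the choice to drop queued entries that precede the target are all hypothetical.

```python
from collections import deque

def branch_with_range_check(queue, memory, target, capacity):
    """Reuse the queue buffer when the branch target lies inside its address range.

    `queue` holds (address, instruction) pairs at consecutive addresses.
    Returns (next address to pre-fetch, memory reads made, whether the queue was reused).
    """
    hit = bool(queue) and queue[0][0] <= target <= queue[-1][0]
    if hit:
        # Assumption: entries ahead of the target are skipped, the rest are kept.
        while queue[0][0] < target:
            queue.popleft()
        next_addr = queue[-1][0] + 1
    else:
        # Target outside the buffered range: flush as in the older technique.
        queue.clear()
        next_addr = target
    reads = 0
    while len(queue) < capacity and next_addr < len(memory):
        queue.append((next_addr, memory[next_addr]))
        next_addr += 1
        reads += 1
    return next_addr, reads, hit

memory = [f"insn{addr}" for addr in range(16)]
queue = deque((addr, memory[addr]) for addr in range(1, 4))   # addresses 1..3 buffered

near = branch_with_range_check(queue, memory, target=2, capacity=4)
head_after_near = queue[0]
far = branch_with_range_check(queue, memory, target=10, capacity=4)
```

For the short branch to address 2, the target is found in the buffer and only two memory reads are needed to top the queue up again; for the branch to address 10, the target is outside the buffered range, so the queue is flushed and four reads are needed.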
This conventional technique does reduce the number of pre-fetches after the branch; however, because the branch instruction is handled as an ordinary unconditional branch instruction, a complex address generation process is required for the branch after the instruction is decoded, and the corresponding circuits become complex and bulky.