1. Field of the Invention
This invention relates to a device for executing branch prediction in an information processing system where a plurality of instructions are read out one after another from a memory and executed in sequence.
2. Description of the Prior Art
In the pipeline processing of a large computer system, branch prediction is adopted to reduce the pipeline disruption that usually occurs when a branch instruction is encountered. In recent years, branch prediction has also come to be used in one-chip microcomputers that employ pipeline processing.
Next, a branch prediction system will be explained.
As shown in FIG. 6a, in pipeline processing, a plurality of stages, such as a fetch stage (fe), a decode stage (de), and an execution stage (ex), are carried out in synchronization with clocks. In this processing, if the instruction being fetched from the 1000th address is a branch instruction, the destination address (for example, the 2000th address) of this branch instruction is calculated at the execution stage (ex), which takes place in the third clock. As a result, the two instructions in the 1001st and 1002nd addresses, that is, the two instructions immediately following the branch instruction, must be treated as delay slot instructions in order to avoid disrupting the pipeline. Accordingly, even though the branch instruction is a conditional one, and it is therefore uncertain whether the branch will be taken, instructions that can be executed in either case, such as no-operation instructions, must be placed after the branch instruction. This makes programming complicated and time-consuming.
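The timing described above can be illustrated with a short sketch. It assumes, as in the FIG. 6a example, one instruction fetched per clock from consecutive addresses and a destination address that becomes known in the ex stage. The function name `branch_timing` and its return layout are hypothetical and serve only this illustration.

```python
# Hypothetical model of the three-stage pipeline in FIG. 6a; names are illustrative only.
STAGES = ("fe", "de", "ex")  # fetch, decode, execute: one stage per clock


def branch_timing(branch_addr, fetch_clock=1):
    """Return (resolve_clock, delay_slot_fetches, target_fetch_clock).

    Assumes one instruction is fetched per clock from consecutive addresses
    and that the destination address becomes known in the ex stage.
    """
    resolve_clock = fetch_clock + STAGES.index("ex")
    # Every instruction fetched after the branch, up to and including the
    # clock in which the destination is calculated, sits in a delay slot.
    delay_slot_fetches = [
        (clock, branch_addr + (clock - fetch_clock))
        for clock in range(fetch_clock + 1, resolve_clock + 1)
    ]
    return resolve_clock, delay_slot_fetches, resolve_clock + 1


resolve_clock, slots, target_fetch_clock = branch_timing(1000)
print(resolve_clock)       # 3
print(slots)               # [(2, 1001), (3, 1002)]
print(target_fetch_clock)  # 4
```

Under these assumptions the branch at the 1000th address leaves two delay slots (the 1001st and 1002nd addresses), and the destination can be fetched no earlier than the fourth clock.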
To overcome the above-mentioned disadvantage, conventional branch prediction devices have the following function. Once a branch instruction is fetched, its own address (the assignment address) and its destination address are registered and stored. Accordingly, if the instruction at said registered address is fetched again, the destination address of this instruction is immediately obtained from the registered and stored values.
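The register-then-retrieve behavior described above can be sketched in software as follows. This is a minimal model, not the hardware of any particular device: a Python dictionary stands in for the storage, and the class name `BranchPredictionBuffer` and the method names `register` and `retrieve` are hypothetical.

```python
class BranchPredictionBuffer:
    """Software stand-in for a buffer that maps a branch instruction's
    own address to its destination address (illustrative only)."""

    def __init__(self):
        self._entries = {}

    def register(self, branch_addr, dest_addr):
        # Store the pair once the destination address has been calculated.
        self._entries[branch_addr] = dest_addr

    def retrieve(self, fetch_addr):
        # Return the stored destination address, or None if the fetched
        # address has not been registered as a branch instruction.
        return self._entries.get(fetch_addr)


buf = BranchPredictionBuffer()
print(buf.retrieve(1000))  # None: first encounter, nothing registered yet
buf.register(1000, 2000)
print(buf.retrieve(1000))  # 2000: destination obtained without recalculation
```

On the second fetch of the 1000th address the destination is available directly from the stored pair, which is the benefit the registration is intended to provide.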
Concretely, in the case of the pipeline processing shown in FIG. 6b, the branch instruction in the 1000th address is fetched at the first clock. The destination address (2000th address) of this instruction is calculated at the execution stage (ex) in the third clock. Then, at the registration stage (ma) in the fourth clock, the 1000th address, which is the assignment address for the branch instruction, and the 2000th address, which is the destination address of this branch instruction, are registered in a branch prediction buffer comprised of a content-addressable memory (CAM).
If the instruction that is fetched at the fourth clock from said calculated destination address is itself a branch instruction, its destination address (1000th address) is calculated at the sixth clock. Then, at the seventh clock, the address (2000th address) at which this branch instruction is located and its destination address (1000th address) are registered.
In parallel with the above-mentioned process, a retrieval of the calculated destination address (1000th address) is also started in the branch prediction buffer at the seventh clock. In this case, because the 1000th address has already been registered in the buffer, its destination address (2000th address) would be expected to be found immediately.
In conventional branch prediction buffers, however, it is not possible to execute a retrieval and a registration simultaneously. Accordingly, in the pipeline process shown in FIG. 6b, only the registration is executed in the seventh clock. In the situation mentioned above, therefore, the destination address cannot be obtained immediately, which lowers the processing speed of the branch prediction.
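The conflict in the seventh clock can be reproduced with a small sketch of a buffer that services only one operation per clock. The class name `SinglePortBuffer` and the method `clock` are hypothetical. Giving the registration priority follows the description above; retrying the retrieval in the next clock is an assumption of this sketch, since the text states only that the destination address is not obtained immediately.

```python
class SinglePortBuffer:
    """Illustrative buffer that can perform either a registration or a
    retrieval in one clock, but not both (names are hypothetical)."""

    def __init__(self):
        self._entries = {}

    def clock(self, register=None, retrieve=None):
        """Run one clock.

        register: optional (branch_addr, dest_addr) pair to store.
        retrieve: optional address to look up.
        Returns (retrieval_serviced, dest_addr).
        """
        if register is not None:
            branch_addr, dest_addr = register
            self._entries[branch_addr] = dest_addr
            # The buffer is busy registering, so any retrieval requested
            # in the same clock is not serviced.
            return False, None
        if retrieve is not None:
            return True, self._entries.get(retrieve)
        return False, None


buf = SinglePortBuffer()
buf.clock(register=(1000, 2000))                          # fourth clock
clock7 = buf.clock(register=(2000, 1000), retrieve=1000)  # seventh clock
clock8 = buf.clock(retrieve=1000)                         # retried one clock later (assumed)
print(clock7)  # (False, None)
print(clock8)  # (True, 2000)
```

Although the 1000th address was registered in the fourth clock, its destination is not returned in the seventh clock because the registration of the 2000th address occupies the buffer.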
The above-mentioned disadvantage becomes more serious in a superscalar processor, in which a plurality of instructions are executed simultaneously and in which a branch instruction is therefore highly likely to be found at a destination address.
As described above, in an information processing system in which instructions are read out one after another and executed in sequence, the prior branch prediction device cannot execute both a registration and a retrieval in the same clock. As a result, branch prediction cannot work efficiently in said system, which lowers the throughput and the processing speed of the system.