1. Field of the Invention
The present invention relates to pipelined processor architectures. More specifically, a pipelined architecture is disclosed that has a pre-fetch stage that is used to perform branch prediction, and which provides results to a separate instruction fetch stage.
2. Description of the Prior Art
Numerous methods have been developed to increase the computing power of central processing units (CPUs). One development that has gained wide use is the concept of instruction pipelines. The use of such pipelines necessarily requires some type of instruction branch prediction so as to prevent pipeline stalls. Various methods may be employed to perform branch prediction. For example, U.S. Pat. No. 6,263,427B1 to Sean P. Cummins et al., included herein by reference, discloses a branch target buffer (BTB) that is used to index possible branch instructions and to obtain corresponding target addresses and history information.
Because of their inherent complexity, the prior art branch prediction mechanisms can themselves lead to pipeline stalls. For example, the typical branch prediction stage includes instruction fetching, BTB access and hit determination, target address acquisition, and prediction mechanisms (potentially based upon history information) to generate a next instruction address. Such a large amount of work cannot always be performed in a single clock cycle, and so pipeline stalls result. These stalls greatly reduce the efficiency of the CPU, especially when executing tight loops. The above-noted invention of Cummins et al. utilizes a new BTB mechanism to avoid such pipeline stalls; however, the new BTB mechanism requires larger BTB table entries to store more information, and introduces greater complexity in the overall branch prediction design.