I. Field of the Disclosure
The technology of the disclosure relates to branch prediction in computer systems, and more particularly to branch target buffers (BTBs) and/or branch target instruction caches (BTICs).
II. Background
Instruction pipelining is a processing technique whereby the throughput of computer instructions being executed by a processor may be increased by splitting the handling of each instruction into a series of steps, and executing the steps in an execution pipeline composed of multiple stages. Optimal processor performance may be achieved if all stages in an execution pipeline are able to process instructions concurrently without incurring a pipeline “bubble” when instruction redirection occurs. Instructions processed within an execution pipeline may include branch instructions, which redirect the flow of a program by transferring program control to a specified branch target instruction. If a branch instruction is conditional (i.e., it is not known whether the branch will be taken until execution), branch prediction hardware may be employed to predict whether the branch will be taken based on the resolution of previously executed conditional branch instructions.
In a conventional execution pipeline, instructions following a branch instruction are fetched into the execution pipeline concurrently with decoding the branch instruction. Accordingly, when a branch is predicted to be taken, the instructions that were fetched sequential to the branch instruction (i.e., the instructions that would be executed if the branch were not taken) are flushed. The correct branch target instructions are then fetched. This process is typically referred to as an instruction fetch redirect. Because the instruction fetch redirect may consume one or more clock cycles, one or more pipeline bubbles may be introduced into the execution pipeline at the point where the decode stage idles while the branch target instructions are fetched. Once introduced, a pipeline bubble propagates through subsequent stages of the execution pipeline.
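The cost of instruction fetch redirects described above can be illustrated with a simplified cycle-count sketch. The model below is an assumption made for illustration only: it supposes an idealized pipeline that otherwise completes one instruction per cycle, and it charges a fixed number of bubble cycles for each taken branch. The function name, the parameter names, and the numeric values are hypothetical and do not come from the disclosure.

```python
def total_cycles(instruction_count: int,
                 pipeline_depth: int,
                 taken_branches: int,
                 redirect_penalty: int) -> int:
    """Estimate cycles for an idealized pipeline (illustrative model only).

    Assumes one instruction completes per cycle once the pipeline is full,
    and that every taken branch inserts `redirect_penalty` bubble cycles
    while the branch target instructions are fetched.
    """
    fill_cycles = pipeline_depth - 1          # cycles before the first instruction completes
    bubble_cycles = taken_branches * redirect_penalty
    return instruction_count + fill_cycles + bubble_cycles


# Hypothetical workload: 100 instructions, 5 stages, 10 taken branches.
with_redirects = total_cycles(100, 5, 10, redirect_penalty=2)     # 124 cycles
without_redirects = total_cycles(100, 5, 10, redirect_penalty=0)  # 104 cycles
```

Under these assumed numbers, the bubbles account for 20 of the 124 cycles, which is the overhead that reducing or eliminating redirect bubbles would recover.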
To reduce the frequency of pipeline bubbles, a branch target instruction cache (BTIC) may be utilized. A BTIC stores copies of one or more branch target instructions (i.e., instruction(s) at a target address to which a branch instruction transfers program control when the branch is taken). Branch target instructions cached in the BTIC may be partially or fully decoded. The BTIC may also cache a next instruction fetch address for fetching one or more next subsequent instructions after a cached branch target instruction. The BTIC is typically consulted during the fetch stage of the execution pipeline, and provides branch target instruction(s) to one or more subsequent stages of an execution pipeline to reduce or eliminate an occurrence of a pipeline bubble introduced as a result of an instruction fetch redirect.
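The BTIC organization described above may be pictured with the following minimal software sketch. It is not a hardware description and is not taken from the disclosure: indexing by branch instruction address, the class and field names, the addresses, and the instruction text are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence, Tuple


@dataclass(frozen=True)
class BticEntry:
    # Copies of the instruction(s) located at the branch target address.
    target_instructions: Tuple[str, ...]
    # Address from which fetching resumes after the cached instructions.
    next_fetch_address: int


class Btic:
    """Illustrative BTIC model, indexed by branch instruction address (an assumption)."""

    def __init__(self) -> None:
        self._entries: Dict[int, BticEntry] = {}

    def lookup(self, branch_address: int) -> Optional[BticEntry]:
        # Returns None on a miss; a real BTIC would be consulted during fetch.
        return self._entries.get(branch_address)

    def fill(self, branch_address: int,
             target_instructions: Sequence[str],
             next_fetch_address: int) -> None:
        # Establish (or overwrite) the entry once the branch has been taken.
        self._entries[branch_address] = BticEntry(
            tuple(target_instructions), next_fetch_address)


# Hypothetical branch at 0x100 whose target is 0x200 (two 4-byte instructions).
btic = Btic()
first_lookup = btic.lookup(0x100)   # miss: no entry has been established yet
btic.fill(0x100, ["add r1, r2, r3", "ldr r4, [r1]"], next_fetch_address=0x208)
second_lookup = btic.lookup(0x100)  # hit: cached target instructions are available
```

On the hit, the cached instructions could be supplied to later pipeline stages while fetching resumes at the stored next fetch address, which is how the redirect bubble is reduced or avoided.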
A BTIC entry is established for a branch instruction when the branch instruction is recognized and the branch is first taken. Consequently, when a branch instruction is encountered for the first time, no BTIC entry exists for the branch instruction, and a BTIC “miss” occurs. In particular, a subroutine return instruction (a specific type of branch instruction) will always experience a BTIC miss when it is first encountered. It is desirable for a BTIC entry corresponding to the subroutine return instruction to provide correct branch target instructions when the subroutine return instruction is first encountered.
Moreover, because a subroutine may be called from multiple branch instructions at different points within a program, a BTIC entry for a subroutine return instruction may frequently contain incorrect branch target instructions. For example, when a subroutine that is called from a first calling location returns, the instructions sequential to the first calling location are executed and are populated in the BTIC entry for the subroutine return instruction as branch target instructions. If the subroutine is subsequently called from a second calling location, the instructions sequential to the second calling location should be executed after the subroutine returns. However, the branch target instructions cached in the BTIC entry for the subroutine return instruction are instructions following the first calling location, not instructions following the second calling location. Thus, the subroutine return instruction's BTIC entry does not contain correct branch target instructions for the second calling location. It is desirable for the subroutine return instruction's BTIC entry to provide correct branch target instructions, even after the subroutine is called from a different calling location.
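The two shortcomings described above, the miss on first encounter and the stale entry after a call from a different location, can be reproduced with a small sketch. The program layout, the addresses, the instruction text, and the policy of repopulating the entry after every return are assumptions made for illustration; they are not taken from the disclosure.

```python
from typing import Dict, List, Tuple

# Hypothetical program layout: address -> instruction text.
PROGRAM: Dict[int, str] = {
    0x100: "call sub",        # first calling location
    0x104: "add r1, r1, #1",  # instruction sequential to the first call
    0x200: "call sub",        # second calling location
    0x204: "sub r2, r2, #1",  # instruction sequential to the second call
    0x300: "ret",             # subroutine return instruction
}
RETURN_INSTRUCTION_ADDRESS = 0x300

# Conventional BTIC model: branch address -> (target address, cached target instruction).
btic: Dict[int, Tuple[int, str]] = {}


def execute_return(actual_return_address: int) -> str:
    """Classify the BTIC lookup for the return as 'miss', 'correct', or 'stale'."""
    entry = btic.get(RETURN_INSTRUCTION_ADDRESS)
    if entry is None:
        outcome = "miss"      # first encounter: no entry exists yet
    elif entry[0] == actual_return_address:
        outcome = "correct"   # cached instructions match this caller
    else:
        outcome = "stale"     # cached instructions belong to another caller
    # Assumed policy: the entry is (re)populated with what actually executed.
    btic[RETURN_INSTRUCTION_ADDRESS] = (
        actual_return_address, PROGRAM[actual_return_address])
    return outcome


outcomes: List[str] = [
    execute_return(0x104),  # subroutine called from the first location
    execute_return(0x104),  # called from the first location again
    execute_return(0x204),  # called from the second location
]
# outcomes == ["miss", "correct", "stale"]
```

In this sketch the entry is correct only when consecutive calls originate from the same calling location, which matches the limitation of a conventional BTIC entry for a subroutine return instruction noted above.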