This invention relates generally to branch prediction in a computer system, and more particularly to detecting and executing an implicit predicted return from a predicted subroutine in a processor.
Computer programs frequently contain subroutines that perform specific tasks within the program. Subroutines enable code reuse and reduce code duplication. When a program executes as a stream of instructions in a microprocessor, a subroutine is typically entered through a branch instruction in the instruction stream that targets the start of the subroutine. The processor then executes the instructions in the subroutine, which concludes with another branch instruction that returns to the next sequential instruction following the calling branch. Because subroutines are used so frequently in computer programs, optimizing this sequence can boost the performance of a program.
Modern high-performance microprocessors contain logic, known as a branch history table (BHT), that maintains a direction history of recently encountered branch instructions. Many processors also contain a branch target buffer (BTB), which stores branch address and target address bits associated with a given branch. These mechanisms can be used to enhance the performance of executing subroutines by predicting in advance when a branch to a subroutine will occur and where the subroutine will return. However, this approach has some limitations. First, it requires two entries in the BHT/BTB: one for the branch instruction to the subroutine and one for the return from the subroutine. Second, since subroutines are often called from many locations in a program, the return address held in the BHT/BTB for a subroutine may frequently be incorrect, because it reflects an earlier execution of the subroutine that was called from a different portion of the program.
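The second limitation can be illustrated with a minimal software sketch. The model below is a simplification for illustration only and is not the logic of any particular processor: the class `ToyBTB`, the function `run_calls`, the addresses, and the 4-byte calling branch length are all hypothetical assumptions. It treats the BTB as a table that remembers only the most recent target of each branch, and shows how the entry for a subroutine's return branch goes stale when the subroutine is called from alternating locations.

```python
class ToyBTB:
    """Hypothetical BTB model: remembers the last target seen for each branch."""

    def __init__(self):
        self.entries = {}  # branch instruction address -> last observed target

    def predict(self, branch_addr):
        # Return the predicted target, or None if the branch has no entry.
        return self.entries.get(branch_addr)

    def update(self, branch_addr, actual_target):
        # Record the resolved target of a taken branch.
        self.entries[branch_addr] = actual_target


def run_calls(call_sites, sub_entry, sub_return_branch, call_len=4):
    """Simulate a sequence of calls to one subroutine from the given call sites.

    call_len is the assumed length in bytes of the calling branch, so the
    correct return address is call_site + call_len.
    """
    btb = ToyBTB()
    results = []
    for site in call_sites:
        # The calling branch occupies its own BTB entry.
        btb.update(site, sub_entry)
        # The return branch at the end of the subroutine is predicted from
        # whatever target was stored the last time the subroutine returned.
        actual_return = site + call_len
        predicted = btb.predict(sub_return_branch)
        results.append((site, predicted, actual_return, predicted == actual_return))
        btb.update(sub_return_branch, actual_return)
    return btb, results


# Hypothetical addresses: two call sites share one subroutine.
btb, results = run_calls(
    [0x1000, 0x2000, 0x1000, 0x1000],
    sub_entry=0x8000,
    sub_return_branch=0x8040,
)
for site, predicted, actual, correct in results:
    shown = "none" if predicted is None else hex(predicted)
    print(f"call from {hex(site)}: predicted return {shown}, "
          f"actual {hex(actual)}, correct={correct}")
```

In this sketch the return prediction is correct only on the final call, when the subroutine is called twice in a row from the same location. The table also ends up holding one entry for each calling branch plus a separate entry for the return branch.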
Therefore, it would be beneficial to improve the handling of subroutines by reducing the number of BHT/BTB entries that they consume and by improving the accuracy of the predicted return address of a subroutine. Accordingly, there is a need in the art for providing an implicit predicted return from a predicted subroutine in a processor.