Programmers often use subroutines to structure their programs and to avoid writing the same code several times. A subroutine is invoked from the main program by a call instruction and ends with a return instruction, which transfers control back to the instruction in the main program immediately following the call instruction. Since the same subroutine is often called from several locations in the program, the return instruction has several possible locations to which it must return, depending on where the subroutine was called from. A return instruction is basically a branch instruction which in this case has several targets. If the branch is to be predicted, what matters is not the taken or not-taken prediction but the determination of the correct target address for time-critical use at execution time. In an instruction set architecture with dedicated call and return instructions, a return stack is used to predict the target address of the return instruction. In other instruction set architectures, which do not use dedicated call and return instructions, additional hardware is needed to predict the return target address.
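The return stack mentioned above can be illustrated with a minimal software sketch. It is not taken from any cited reference; the class name `ReturnStack`, the fixed depth, the overflow policy, and the example addresses are all assumptions made for illustration only.

```python
class ReturnStack:
    """Hypothetical model of a return address stack for dedicated call/return instructions."""

    def __init__(self, depth=8):
        self.depth = depth      # assumed capacity; real designs vary
        self.entries = []

    def on_call(self, call_address, instruction_length):
        # Push the address of the instruction immediately following the call.
        if len(self.entries) == self.depth:
            self.entries.pop(0)  # assumed policy: drop the oldest entry on overflow
        self.entries.append(call_address + instruction_length)

    def predict_return(self):
        # Pop the most recent entry as the predicted return target (None if empty).
        return self.entries.pop() if self.entries else None


rs = ReturnStack()
rs.on_call(0x1000, 4)          # main program calls subroutine f
rs.on_call(0x2010, 4)          # f calls subroutine g
first = rs.predict_return()    # return from g
second = rs.predict_return()   # return from f
```

In this sketch the nested returns are predicted in last-in, first-out order, which matches the nesting of the calls.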
Webb, "Subroutine Call/Return Stack", IBM Technical Disclosure Bulletin, Vol. 30, No. 11, Apr. 1988, pages 221-225, discloses an arrangement of two stacks, one used for the target address of a call instruction and the other for the next sequential instruction address of the same call instruction. Whenever a branch that is a potential call is found in the instruction stream, its target address and next sequential address are written onto the stacks. If a potential return instruction is encountered in the instruction stream, its target address is compared to all addresses stored in the next sequential stack, and if an entry with an equal address is found, the potential return becomes an identified return by writing a return mark bit into the related entry of the branch history table (BHT). In addition, the target address of the initiating call is also stored in this entry of the BHT. The next time a BHT lookup is made to obtain the return address, the target address is not taken directly from the BHT; rather, the address from the BHT is used to search the target part of the call/return stack. If a match is found, the entry from the corresponding next sequential instruction stack is taken as the target address of the return. The operation requires at least two cycles to determine the target address of a return, one of which includes an associative search in the stack.
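The two-stack lookup described above may be easier to follow as a small model. The following is only a sketch based on the description in this section, not the disclosed hardware: the class and method names are hypothetical, the BHT is modeled as a dictionary, the search order (newest entry first) is an assumption, and stack depth and entry removal are omitted.

```python
class CallReturnStack:
    """Hypothetical model of a two-part call/return stack used together with a BHT."""

    def __init__(self):
        self.call_targets = []     # target address of each potential call
        self.next_sequential = []  # next sequential address of the same call
        self.bht = {}              # branch address -> (stored address, return mark)

    def on_potential_call(self, target, next_seq):
        # Both stacks are written together, so equal indices belong to the same call.
        self.call_targets.append(target)
        self.next_sequential.append(next_seq)

    def on_potential_return(self, branch_addr, actual_target):
        # Associative search of the next sequential stack (assumed newest first).
        for i in reversed(range(len(self.next_sequential))):
            if self.next_sequential[i] == actual_target:
                # Identified return: store the initiating call's target and set the mark.
                self.bht[branch_addr] = (self.call_targets[i], True)
                return True
        self.bht[branch_addr] = (actual_target, False)
        return False

    def predict(self, branch_addr):
        entry = self.bht.get(branch_addr)
        if entry is None:
            return None
        stored, is_return = entry
        if not is_return:
            return stored          # ordinary branch: the BHT target is used directly
        # Second step: search the target part, then read the paired next sequential address.
        for i in reversed(range(len(self.call_targets))):
            if self.call_targets[i] == stored:
                return self.next_sequential[i]
        return None


crs = CallReturnStack()
crs.on_potential_call(target=0x5000, next_seq=0x1004)   # first call site
marked = crs.on_potential_return(0x5040, 0x1004)        # return is identified
crs.on_potential_call(target=0x5000, next_seq=0x2004)   # second call site
predicted = crs.predict(0x5040)
```

The two steps in `predict` (BHT lookup, then associative search of the stack) correspond to the two cycles noted above.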
Furthermore, U.S. Pat. No. 5,276,882 (Emma et al.) discloses a subroutine return through a branch history table which includes two additional fields: a call field, which indicates that the branch entry corresponds to a branch that may implement a subroutine call, and a pseudo field, which represents linkage information and creates a link between a subroutine entry and a subroutine return. If the target address of a potential return is sent to the BHT and a call instruction is found immediately preceding this address, a pseudo call/return pair is established by storing a pseudo entry at the target address of the call instruction, which is the subroutine start. The pseudo entry contains the address of the return instruction as its target address. If the subroutine is called, the BHT entry is looked up and the pseudo entry is found. Detecting such an entry invokes an update process which changes the target address of the return instruction to the next sequential address of the call. When a BHT lookup for the return is subsequently performed at execution time, the correct target is found.
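The pseudo entry mechanism described above can be sketched in the same way. This is a simplified model written from the description in this section only; the names `BhtEntry` and `PseudoLinkBht`, the dictionary-based BHT, and the `next_seq` field used to find the call "immediately preceding" the return target are assumptions, not details of the patent.

```python
from dataclasses import dataclass


@dataclass
class BhtEntry:
    target: int
    is_call: bool = False     # call field: branch may implement a subroutine call
    is_pseudo: bool = False   # pseudo field: entry links subroutine start to its return
    next_seq: int = 0         # model-only helper: address following a call


class PseudoLinkBht:
    """Hypothetical model of a BHT extended with call and pseudo fields."""

    def __init__(self):
        self.entries = {}

    def record_call(self, call_addr, target, next_seq):
        self.entries[call_addr] = BhtEntry(target, is_call=True, next_seq=next_seq)
        # Entering the subroutine: a pseudo entry at its start triggers the update.
        pseudo = self.entries.get(target)
        if pseudo is not None and pseudo.is_pseudo:
            ret = self.entries.get(pseudo.target)
            if ret is not None:
                ret.target = next_seq  # return now points behind this call

    def record_return(self, return_addr, actual_target):
        self.entries[return_addr] = BhtEntry(actual_target)
        # Look for a call immediately preceding the return target.
        for entry in list(self.entries.values()):
            if entry.is_call and entry.next_seq == actual_target:
                # Pseudo entry at the subroutine start, targeting the return instruction.
                self.entries[entry.target] = BhtEntry(return_addr, is_pseudo=True)
                break

    def predict(self, branch_addr):
        entry = self.entries.get(branch_addr)
        return entry.target if entry else None


bht = PseudoLinkBht()
bht.record_call(0x1000, 0x5000, 0x1004)    # first call of the subroutine
bht.record_return(0x5040, 0x1004)          # pseudo call/return pair is established
bht.record_call(0x2000, 0x5000, 0x2004)    # second call updates the return entry
prediction = bht.predict(0x5040)
```

In this model only one pseudo entry can exist per subroutine start, so a second return instruction in the same subroutine would overwrite the link created for the first.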
Locating the pseudo entry at the start address of the subroutine limits this subroutine return approach to single-exit subroutines. The pseudo entry also requires additional space in the BHT. Furthermore, because of the block structure of the BHT, it will not always be possible to find the preceding call instruction.