Modern high-frequency microprocessors are typically deeply pipelined devices. For efficient instruction execution in these machines, instructions are fetched and executed speculatively. In other words, a prediction is made as to the future need of a given instruction, and that instruction is then fetched into the instruction pipeline many cycles before its predicted execution. If the prediction is correct, the instruction is already available in the pipeline when it is required and can be executed immediately. Otherwise, the instruction is flushed and the machine retrieves the appropriate instruction from the instruction cache.
Often there are one or more branches (some of which may be subroutine calls and returns) between the instructions that are being fetched and the instructions that are being executed in the processor execution units. Therefore, to handle subroutine calls and returns efficiently, many high-frequency microprocessors employ a link stack. On a subroutine call, the address of the following instruction is “pushed” onto the stack, while on a subroutine return, the contents at the top of the stack (which are expected to contain the address of the instruction following the original subroutine call) are “popped” from the stack. Since pushing onto and popping from a hardware stack can normally be done when the branch is fetched, which occurs several cycles before the corresponding branch is executed in a deeply pipelined processor, such a link stack mechanism helps, to a great extent, to implement the instruction fetching scheme across subroutine calls and returns. Nevertheless, the link stack can become corrupted during the process of speculative instruction fetching and execution.
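The push and pop behavior described above can be illustrated with a minimal software sketch. It is only a toy model and does not represent any particular hardware design: the class name LinkStack, its method names, the depth of eight entries, the four-byte instruction size and the policy of dropping the oldest entry on overflow are all assumptions made for illustration.

```python
INSTRUCTION_BYTES = 4  # assumption: fixed 4-byte instructions


class LinkStack:
    """Toy model of a link stack that is updated when a branch is fetched."""

    def __init__(self, depth=8):
        self.depth = depth   # assumption: small fixed capacity
        self.entries = []    # the last element is the top of the stack

    def on_call_fetched(self, call_address):
        """Push the address of the instruction that follows the call."""
        if len(self.entries) == self.depth:
            self.entries.pop(0)  # assumption: drop the oldest entry on overflow
        self.entries.append(call_address + INSTRUCTION_BYTES)

    def on_return_fetched(self):
        """Pop the predicted return address, or None if the stack is empty."""
        return self.entries.pop() if self.entries else None


stack = LinkStack()
stack.on_call_fetched(0x1000)             # outer subroutine call
stack.on_call_fetched(0x2000)             # nested subroutine call
inner_return = stack.on_return_fetched()  # return from the nested call
outer_return = stack.on_return_fetched()  # return from the outer call
print(hex(inner_return), hex(outer_return))  # prints: 0x2004 0x1004
```

Because the returns are predicted in the reverse order of the calls, a last-in, first-out structure suffices as long as every fetched call and return actually lies on the executed path.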
Consider, for example, the case where a subroutine call is performed using a “branch and link” instruction and a return from the subroutine is achieved using a “branch to link register” or “BrLR” instruction. It may happen that a BrLR instruction, which for example returns to a location “A”, is fetched speculatively, followed by a speculative fetch of a “branch and link” instruction, for example from call-site “B”. The link stack is updated at fetch time, such that after these instructions are fetched, the address location “A” is replaced by the address location “B+4” (each instruction consisting of four bytes, for example) at the top of the link stack. Since both the BrLR and “branch and link” instructions are speculatively fetched, they may not ultimately be in the actual execution path. If these instructions are not in fact in the actual execution path (in which case the instructions are flushed out), the link stack becomes corrupted, since the updates made to it at fetch time remain in place.
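The corruption scenario can be traced with a short sketch. The concrete addresses chosen for “A” and “B”, the extra entry below “A” and the use of a plain list as the stack are hypothetical values picked for illustration. The sketch also assumes that a flush leaves the link stack untouched.

```python
INSTRUCTION_BYTES = 4  # assumption: fixed 4-byte instructions

A = 0x1004  # hypothetical return address already at the top of the stack
B = 0x3000  # hypothetical call-site on the wrongly speculated path

link_stack = [0x0F04, A]     # the last element is the top of the stack
expected = list(link_stack)  # the state a return on the actual path needs

# Speculative fetch of a BrLR: the stack is popped at fetch time.
predicted_target = link_stack.pop()

# Speculative fetch of a "branch and link" at B: B + 4 is pushed.
link_stack.append(B + INSTRUCTION_BYTES)

# The speculation turns out to be wrong and both instructions are flushed.
# Nothing in this sketch restores the stack, so the fetch-time updates remain.
corrupted = link_stack != expected
print(hex(link_stack[-1]), corrupted)  # prints: 0x3004 True
```

The stack depth is unchanged after the pop and push, so a simple pointer comparison would not reveal the problem. The next BrLR on the actual path would be predicted to return to B+4 instead of A.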
Generally, any time one or more BrLR instructions are followed by one or more “branch and link” instructions in the speculated path, the link stack becomes corrupted if the speculation turns out to be wrong. For a commercial programming workload, about 2% of the instructions are BrLR instructions, and it is therefore very important in deeply pipelined machines to be able to predict the target address for these instructions with a good degree of accuracy. Thus, there exists a need for circuits, systems and methods to detect link stack corruption, as well as to recover a link stack from a corrupted condition. Since methods already exist to deal with mis-predictions in speculative instructions, the circuits, systems and methods used to deal with link stack corruption in these cases are not put in place to ensure correct functional behavior, but rather to improve execution speed. Various degrees of link stack corruption may occur on mis-predictions in speculative instruction execution, and the better the recovery, the less system speed will be degraded.