The present invention relates to a data processing apparatus configured to execute call and return instructions. More particularly, this invention relates to the prediction of return addresses used by the data processing apparatus when speculatively executing instructions.
It is known for a data processing apparatus to be configured to execute call instructions which cause the data processing apparatus to depart from a sequence of program instructions to execute a further sequence of program instructions before returning to the original sequence of program instructions to continue sequential program instruction execution. Indeed, such diversions from the sequential program instruction order may be nested within one another such that whilst executing a sequence of instructions resulting from a first call instruction another call instruction may be encountered leading to execution of a further sequence of instructions, and so on. At the conclusion of any sequence of instructions which has been executed as the result of a call instruction, the end of that sequence is indicated by a return instruction, in response to which the data processing apparatus needs to have reference to a return address which indicates the point in the sequence of program instructions to which it should now return (e.g. to the instruction following the call instruction which caused the departure from sequential program instruction execution). In order to manage these return addresses in an efficient manner, in particular when a sequence of nested calls is likely to be encountered, it is known to provide a return stack as a mechanism for storing the required return addresses. This return stack is configured such that when a call instruction is encountered, causing the data processing apparatus to divert from sequential program instruction execution to a further set of instructions, a return address associated with that call instruction (e.g. pointing to the next program instruction following that call instruction) is pushed onto the return stack. Each time a call instruction is encountered, its associated return address is pushed onto the stack. When a return instruction is encountered, a return address is popped off the stack. This enables the return addresses to be easily retrieved in the order required, i.e. in an inverted order with respect to their corresponding call instructions.
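By way of illustration only, the push and pop behaviour of such a return stack may be sketched as follows. This is a minimal software model rather than a description of any particular hardware; the class name `ReturnStack`, the method names and the fixed 4-byte instruction size are assumptions introduced purely for the example.

```python
class ReturnStack:
    """Illustrative model of a return stack (all names are hypothetical)."""

    def __init__(self):
        self._addresses = []

    def on_call(self, call_address, instruction_size=4):
        # Assumption: the return address is the instruction that
        # immediately follows the call instruction.
        self._addresses.append(call_address + instruction_size)

    def on_return(self):
        # Pop the most recently pushed return address, or None if empty.
        return self._addresses.pop() if self._addresses else None


stack = ReturnStack()
stack.on_call(0x1000)      # first call instruction
stack.on_call(0x2000)      # nested call instruction
inner = stack.on_return()  # return from the nested call
outer = stack.on_return()  # return from the first call
```

In this sketch the return addresses come back in the inverse order of the calls that pushed them: the nested call's return address is retrieved first, followed by that of the first call.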
It is also known for a data processing apparatus to be configured to speculatively execute data processing instructions. For example, the data processing apparatus may begin executing instructions which are the target of a call instruction, before it is definitively known whether that call instruction will be executed. In general, the data processing apparatus can speculatively execute instructions which are the target of any branch instruction (i.e. an instruction which causes a change in program flow) before it is known whether that particular branch will be taken or not. The advantages of doing this are well recognised, in that more efficient data processing results, because the instructions which follow each branch instruction can begin their passage through the pipeline without waiting for that branch instruction to be resolved. In the context of speculative instruction execution, a return stack enables the data processing apparatus to predict return addresses for use in that speculative execution and has the advantage that it can efficiently store multiple return addresses, corresponding to a deep history of speculatively executed call instructions. However, a return stack also suffers from the disadvantage that when a misprediction occurs the entire return stack of return addresses is generally discarded and a revised return stack must be created with respect to resolved instructions (i.e. those for which speculative execution is known to have been correct). More targeted mechanisms for recovering the return stack in the event of speculation errors have been proposed, but these are generally relatively complex. Also, in a data processing apparatus configured to perform out-of-order instruction execution, complexities arise from handling out-of-order call/return instruction resolution, which in the prior art has required a large amount of information to be transferred along the pipeline, making these approaches costly in terms of hardware usage.
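The discard-and-rebuild behaviour described above may be sketched, purely as a simplified model, as follows. The sketch assumes that a second copy of the stack is maintained for resolved call and return instructions and that the speculative stack is recreated from that copy on a misprediction; this arrangement, and the names `SpeculativeReturnStack`, `speculative` and `resolved`, are assumptions made for the example and are not taken from the cited documents.

```python
class SpeculativeReturnStack:
    """Hypothetical model: a speculative stack plus a resolved copy."""

    def __init__(self):
        self.speculative = []  # updated as calls/returns are speculatively executed
        self.resolved = []     # updated only once calls/returns are resolved

    def speculate_call(self, return_address):
        self.speculative.append(return_address)

    def speculate_return(self):
        # Predicted return address, or None if nothing is stored.
        return self.speculative.pop() if self.speculative else None

    def resolve_call(self, return_address):
        self.resolved.append(return_address)

    def resolve_return(self):
        if self.resolved:
            self.resolved.pop()

    def on_misprediction(self):
        # Discard every speculative entry and recreate the stack from
        # the entries belonging to resolved instructions only.
        self.speculative = list(self.resolved)


rs = SpeculativeReturnStack()
rs.speculate_call(0x1004)
rs.resolve_call(0x1004)    # this call is now known to have been correct
rs.speculate_call(0x2004)  # call on a mispredicted path, never resolved
rs.on_misprediction()
predicted = rs.speculate_return()
```

After the misprediction only the entry for the resolved call survives, so the next predicted return address is that of the resolved call. The whole speculative stack is replaced in this model, even those entries which were in fact correct, which illustrates the coarse-grained nature of the recovery.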
Some prior art approaches have also lacked accuracy. Two prior art approaches are described in the following documents:
“The effects of mispredicted-path execution on branch prediction structures”, Jourdan, S., Hsing, T.-H., Stark, J. and Patt, Y., Proceedings of Parallel Architectures and Compilation Techniques, 1996; and
“Speculative return address stack management revisited”, Vandierendonck, H. and Seznec, A., ACM Transactions on Architecture and Code Optimization (TACO), November 2008.
It would be desirable to provide an improved technique for storing return addresses for use by a data processing apparatus which is configured to speculatively execute call instructions.