1. Field of the Invention
The present invention relates to the field of computer systems. More specifically, the present invention relates to determining the next instruction pointer in an out-of-order execution computer system.
2. Background
Typical prior computer processors implement in-order instruction execution pipelines. An in-order processor usually fetches an instruction stream from a memory, and issues and executes each instruction in the instruction stream according to a program order. Typically, such an in-order processor determines the program order as the instructions are executed. An instruction pointer that specifies a next instruction in the instruction stream to be executed is continuously updated with the execution of each instruction.
Some prior processors implement a mechanism for limiting the valid range of the instruction pointer. Such an instruction pointer limitation is commonly implemented in conjunction with a mechanism for providing a relocatable base address for the instruction stream. The instruction pointer limitation provides a protected range of memory extending from the relocatable base that prevents erroneous program execution across program modules.
For example, one prior processor provides a code segment base register that defines a base address for the instruction stream, and an instruction pointer limit register that defines the valid range of the instruction pointer from the base address.
An in-order processor usually enforces such an instruction pointer limit by determining whether a new next instruction pointer exceeds the instruction pointer limit as the new next instruction pointer is determined. If the new next instruction pointer exceeds the instruction pointer limit, the processor typically performs a predefined operation to indicate an instruction pointer limit violation.
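As an illustration only, the in-order enforcement described above can be sketched in software. The names next_ip_in_order, linear_address, and LimitViolation are hypothetical, and the sketch assumes that "exceeds" means strictly greater than the limit and that the limit is expressed as an offset from the relocatable base; it does not describe any particular prior processor.

```python
class LimitViolation(Exception):
    """Hypothetical signal for an instruction pointer limit violation."""


def next_ip_in_order(ip, length, limit, branch_target=None):
    """Return the next instruction pointer, checking it as it is determined.

    ip and limit are offsets from the code segment base (an assumption).
    branch_target is None for a fall-through, or the resolved target offset.
    """
    # A taken branch redirects the pointer; otherwise advance sequentially.
    new_ip = branch_target if branch_target is not None else ip + length
    # The check happens at the moment the new pointer is known.
    if new_ip > limit:
        raise LimitViolation(f"next instruction pointer {new_ip:#x} exceeds {limit:#x}")
    return new_ip


def linear_address(code_base, ip):
    """Relocate an instruction pointer by the code segment base."""
    return code_base + ip
```

Because the pointer is checked at the moment it is computed, this sketch only works when every earlier instruction has already completed, which is the property that in-order execution provides.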
An instruction stream typically contains certain instructions that cause discontinuities in the program order. For example, jump instructions, call instructions, and return instructions may cause the processor to redirect the instruction pointer to a discontinuous location in the memory defined by a target address. Such instructions that cause discontinuities in the program order are hereinafter referred to as branch instructions.
A branch instruction may be a conditional branch instruction or an unconditional branch instruction. A conditional branch instruction may or may not cause a discontinuity in the program order. The processor typically tests the execution results of prior instructions to determine whether a conditional branch is taken. In addition, the processor may require the execution results of prior instructions to determine a target address for a conditional or an unconditional branch instruction.
A processor must resolve such discontinuities before determining the next instruction pointer and enforcing an instruction pointer limit. In-order instruction execution ensures that such discontinuities are resolved prior to the determination of the next instruction pointer and the enforcement of the instruction pointer limit.
A processor may implement an out of order instruction execution pipeline to increase instruction execution performance. Such a processor fetches an instruction stream from a memory and issues the instructions in program order, but executes the issued instructions as soon as they are ready, even if instructions issued earlier are not yet ready. A ready instruction is typically an instruction having fully assembled source data. The result data of the executed instructions are subsequently retired or committed to an architectural state in program order in due course.
Such out of order execution improves processor performance because the instruction execution pipeline of the processor does not stall while assembling source data for a non ready instruction. For example, a non ready instruction awaiting source data from an external memory fetch does not stall the execution of later instructions in the instruction stream that are ready to execute.
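The difference between execution order and retirement order can be sketched as follows. This is a simplified model under stated assumptions: each issued instruction is reduced to a hypothetical (name, ready_cycle) pair, execution resources are unlimited, and instructions that become ready in the same cycle execute in program order.

```python
def execution_order(issued):
    """Order in which issued instructions execute.

    issued is a list of (name, ready_cycle) pairs in program order.
    Each instruction executes as soon as its source data is assembled.
    sorted() is stable, so ties keep their program order.
    """
    return [name for name, _ in sorted(issued, key=lambda entry: entry[1])]


def retirement_order(issued):
    """Results are committed to the architectural state in program order."""
    return [name for name, _ in issued]


# A load waiting on external memory does not hold up the later instructions.
issued = [("load", 5), ("add", 1), ("mul", 1)]
print(execution_order(issued))   # ['add', 'mul', 'load']
print(retirement_order(issued))  # ['load', 'add', 'mul']
```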
However, a processor that implements an out of order instruction execution pipeline creates complications for determining the next instruction pointer and enforcing an instruction pointer limit because such a processor generates the result data for the instructions out of order. The result data is out of order because the instructions that cause generation of the result data are executed out of order. As a consequence, such a processor cannot properly determine the next instruction pointer and perform an instruction pointer limit check as such instructions are executed.
An out-of-order execution processor may also implement a speculative instruction execution pipeline to increase instruction execution performance. A processor employing speculative instruction execution typically determines a speculative execution path through a program by predicting the outcome of conditional branch instructions. Such a processor fetches an instruction stream from a memory, predicts whether conditional branch instructions in the instruction stream will result in a branch, and continues fetching and executing the instruction stream according to the prediction. Execution results of mis-predicted branches are purged upon detection of the mis-predictions. Such speculative execution increases processor performance because the instruction execution pipeline does not stall during the resolution of conditional branch instructions.
However, a processor that implements a speculative out of order instruction execution pipeline creates further complications for determining the next instruction pointer and enforcing an instruction pointer limit, because the out-of-order result data generated by such a processor for some of the instructions are speculative. The result data is speculative until the branch prediction that caused speculative execution of the instructions is resolved. In such a processor, prior discontinuities in the instruction stream may not be resolved when an instruction executes. In addition, the execution results of prior instructions that are required to determine a target address or condition of a branch instruction may not be available when the branch instruction executes. As a consequence, such a processor must overcome the speculativeness of the result data as well as their out-of-order generation to properly determine the next instruction pointer and perform an instruction pointer limit check.
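The purging of speculative results can be sketched in a few lines. The sketch assumes a hypothetical window of result entries held in program order, each tagged with a sequence number "seq"; the function name and fields are illustrative and are not taken from any specific processor.

```python
def resolve_branch(window, branch_seq, predicted_taken, actual_taken):
    """Return the result entries that survive resolution of one branch.

    window is a list of dicts in program order, each with a "seq" key.
    Entries younger than a mis-predicted branch were produced on the
    wrong path, so they are purged; the branch itself and older
    entries are kept.
    """
    if predicted_taken == actual_taken:
        return list(window)  # prediction was correct, nothing to purge
    return [entry for entry in window if entry["seq"] <= branch_seq]


window = [{"seq": n} for n in range(5)]
survivors = resolve_branch(window, branch_seq=2, predicted_taken=True, actual_taken=False)
print([entry["seq"] for entry in survivors])  # [0, 1, 2]
```

Any instruction pointer derived from a purged entry would be meaningless, which is why a limit check applied at execution time cannot be trusted in this model.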
An out-of-order execution processor may also implement a concurrent instruction retirement pipeline to increase instruction execution performance. A processor employing concurrent instruction retirement typically can retire a number of executed instructions concurrently, up to but not exceeding a predetermined ceiling. Such a processor examines a predetermined number of dispatched instructions at each retirement operation, determines whether any of the dispatched instructions are executed and their result data are ready for retirement or commitment to the architectural state, and retires or commits the result data accordingly. Such concurrent retirement of multiple executed instructions increases processor performance because executed instructions are retired or committed at a higher rate.
However, a processor that implements a concurrent instruction retirement pipeline also creates further complications for solving the speculativeness and out-of-order generation of the result data for determining the next instruction pointer and enforcing an instruction pointer limit, because the result data of a varying number of executed instructions are retired or committed during each retirement operation. As a consequence, if such a processor is to overcome the speculativeness and out-of-order generation of the result data at retirement to properly determine the next instruction pointer and perform an instruction pointer limit check, the processor must account for the varying number of retiring instructions.
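A minimal sketch of concurrent retirement shows why the number of retiring instructions varies from one retirement operation to the next. It assumes a hypothetical queue of dispatched entries in program order, each with a "done" flag, and a ceiling named width; retirement stops at the first entry that has not finished executing so that program order is preserved.

```python
from collections import deque


def retire(dispatched, width):
    """Retire up to `width` executed entries from the head of the queue.

    dispatched is a deque of dicts in program order, each with a "done" flag.
    Returns the entries retired in this operation; the count ranges from
    zero to `width` depending on what has finished executing.
    """
    retired = []
    while dispatched and len(retired) < width and dispatched[0]["done"]:
        retired.append(dispatched.popleft())
    return retired


queue = deque([
    {"name": "a", "done": True},
    {"name": "b", "done": True},
    {"name": "c", "done": False},  # still executing, blocks younger entries
    {"name": "d", "done": True},
])
print(len(retire(queue, width=3)))  # 2
print(len(retire(queue, width=3)))  # 0
```

Any next instruction pointer logic placed at retirement therefore has to fold in zero, one, or several instructions per operation.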
Some out of order processors may fetch an instruction stream from a memory, and convert each instruction of the incoming stream into a sequence of micro-ops according to the sequential program order. A branch macro instruction is converted into a sequence of micro-ops having a corresponding branch-effecting micro-op. Depending on the implementation, the branch-effecting micro-op might be placed in a fixed or variable location in a micro-op sequence. Such an out of order processor then issues the micro-ops in order, but executes the micro-ops according to the availability of source data and execution resources rather than the program order. The result data of the executed micro-ops are subsequently retired or committed to the architectural state in due course. Such conversion of instructions into micro-ops increases processor compatibility because the micro-ops can take advantage of hardware advancements while maintaining compatibility at the macro instruction level.
However, a processor that implements a micro instruction pipeline also creates further complications for solving the speculativeness and out-of-order generation of the result data for determining the next instruction pointer and enforcing an instruction pointer limit, because the result data are retired or committed at the micro instruction level while the next instruction pointer and the instruction pointer limit are at the macro instruction level. As a consequence, if such a processor is to overcome the speculativeness and out-of-order generation of the result data at retirement to properly determine the next instruction pointer and perform an instruction pointer limit check for the macro instructions, the processor must account for potential duplicate contributions to the next instruction pointer, and depending on the implementation, the placement of the branch-effecting micro-op in a micro-op sequence.
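The mismatch between micro-op retirement and a macro-level instruction pointer can be illustrated with a sketch. This is one possible bookkeeping scheme chosen for illustration, not the method of any particular processor: each retiring micro-op is assumed to carry the hypothetical fields "macro_len" (length of its macro instruction), "end_of_macro" (set on the last micro-op of the sequence), and optionally "branch_target" (set on a taken branch-effecting micro-op).

```python
def retire_micro_ops(ip, limit, micro_ops, pending_target=None):
    """Advance a macro-level instruction pointer over retiring micro-ops.

    The pointer moves once per macro instruction, not once per micro-op,
    which avoids duplicate contributions. A branch target is remembered
    until the end of its macro instruction, so the branch-effecting
    micro-op may sit anywhere in the sequence. pending_target carries
    that state across retirement operations.
    """
    for uop in micro_ops:
        if uop.get("branch_target") is not None:
            pending_target = uop["branch_target"]
        if uop["end_of_macro"]:
            ip = pending_target if pending_target is not None else ip + uop["macro_len"]
            pending_target = None
            if ip > limit:
                raise ValueError("instruction pointer limit violation")
    return ip, pending_target


# First operation: a complete 3-byte instruction (two micro-ops) plus the
# branch-effecting micro-op of a 2-byte branch whose sequence is unfinished.
group = [
    {"macro_len": 3, "end_of_macro": False},
    {"macro_len": 3, "end_of_macro": True},
    {"macro_len": 2, "end_of_macro": False, "branch_target": 40},
]
ip, pending = retire_micro_ops(0, 100, group)
print(ip, pending)  # 3 40
```

Passing the returned pending target into the next call lets the branch take effect when the final micro-op of its macro instruction retires.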
A micro-op is also referred to as a micro instruction in the art. However, for clarity, an instruction will be described herein as being converted into a number of micro-ops. A person skilled in the art will appreciate that to the execution units, a micro-op is an "instruction". Moreover, a processor that does not expand instructions into sequences of micro-ops can be thought of as a special case where instructions are always "expanded" to "sequences" of one micro-op.