1. Field of the Invention
The present invention relates to a processor stack and, more particularly, to a stack and stack operating method for processors that engage in speculative execution of instructions.
2. Description of the Related Art
Processors generally process a single instruction of an instruction set in several steps. Early technology processors performed these steps serially. Advances in technology have led to pipelined-architecture processors, called scalar processors, which perform different steps of many instructions concurrently. A "superscalar" processor further improves performance by supporting concurrent execution of scalar instructions. In a superscalar processor, instruction conflicts and dependency conditions arise in which an issued instruction cannot be executed because data or resources are not available. For example, an issued instruction cannot execute when its input operands are dependent upon data calculated by other instructions that have not yet completed execution.
Superscalar processor performance is improved by continuing to decode instructions regardless of the ability to execute instructions immediately. Decoupling of instruction decoding and instruction execution requires a buffer, called a lookahead buffer, for storing dispatched instruction information used by the circuits, called functional units, which execute the instructions.
The buffer also improves the processor's performance of instruction sequences that include interspersed branch instructions. Branch instructions impair processor performance because instructions following the branch commonly must wait for a condition to become known before execution can proceed. A superscalar processor improves branching performance by "speculatively" executing instructions, which involves predicting the outcome of a branch condition and proceeding with subsequent instructions in accordance with the prediction. The buffer is implemented to maintain the processor's speculative state. When a misprediction occurs, the results of instructions following the mispredicted branch are discarded. A superscalar processor's performance is greatly enhanced by a rapid recovery from a branch misprediction and restart of an appropriate instruction sequence. Recovery methods cancel effects of improperly performed instructions. Restart procedures reestablish a correct instruction sequence.
One recovery and restart method, taught by Mike Johnson in Superscalar Processor Design, Englewood Cliffs, N.J., Prentice Hall, 1991, pp. 92-97, employs a reorder buffer and a register file. The register file holds register values generated by retired operations, that is, operations that are no longer speculative. The reorder buffer holds speculative results of operations, that is, results of operations executed in a sequence following a predicted but unverified branch. The reorder buffer operates as a first-in-first-out queue. When an instruction is decoded, an entry is allocated at the tail of the reorder buffer. The entry holds information concerning the instruction and, when it becomes available, the result of the instruction. When an entry that has received its result value reaches the head of the reorder buffer, the operation is retired by writing its result to the register file. During recovery after a branch misprediction, the processor uses the reorder buffer to discard register values produced by instructions that follow the mispredicted branch. Although a reorder buffer effectively restores registers following a mispredicted branch, other processor state may need to be restored as well. For example, in a processor that employs a stack for managing data, the stack requires restoration. Stack restoration requires recovery of all stack elements, including array elements and pointers.
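By way of illustration only, the reorder buffer behavior described above may be sketched as follows. The class name, method names, and entry layout are hypothetical and are not taken from the cited reference; the sketch assumes a branch occupies an entry with no destination register.

```python
from collections import deque


class ReorderBuffer:
    """Toy FIFO reorder buffer (illustrative assumption, not a real design)."""

    def __init__(self):
        self.entries = deque()      # left end = head (oldest), right end = tail
        self.register_file = {}     # holds only retired, non-speculative values

    def allocate(self, dest):
        """Allocate an entry at the tail when an instruction is decoded."""
        entry = {"dest": dest, "value": None, "done": False}
        self.entries.append(entry)
        return entry

    def complete(self, entry, value):
        """Record the result once the instruction has executed."""
        entry["value"] = value
        entry["done"] = True

    def retire(self):
        """Retire completed entries from the head, in program order."""
        while self.entries and self.entries[0]["done"]:
            entry = self.entries.popleft()
            if entry["dest"] is not None:   # branches write no register here
                self.register_file[entry["dest"]] = entry["value"]

    def flush_after(self, branch_entry):
        """Discard every entry younger than the mispredicted branch."""
        while self.entries and self.entries[-1] is not branch_entry:
            self.entries.pop()
```

In this sketch a speculative result never reaches the register file unless every older entry, including the branch, has completed and retired ahead of it, so discarding the tail entries is sufficient to restore the registers.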
One example of a stack is the floating point unit (FPU) register stack of the Pentium™ microprocessor, available from Intel Corporation of Santa Clara, Calif. The FPU register stack is an array of eight multiple-bit numeric registers that stores extended real data. FPU instructions address the data registers relative to the top of the stack (TOS). A floating point exchange (FXCH) instruction in the Pentium™ microprocessor exchanges the contents of the top of the stack with the contents of a specified stack element or, when no element is specified, with a default element, namely the element adjacent to the TOS. The FXCH instruction is useful because Pentium™ floating point instructions generally require one source operand to be located at the top of the stack and, most frequently, the result of an FPU instruction is left at the TOS. Because most FPU instructions require access to the TOS, it is desirable to manipulate data positions within the stack using the FXCH instruction.
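TOS-relative addressing and the exchange operation may be sketched as follows. This is a simplified model under stated assumptions: the names FpuStack, st, set_st, and fxch are hypothetical, register contents are ordinary Python floats rather than extended real values, and a stack-relative index i is assumed to map to physical register (top + i) modulo 8.

```python
class FpuStack:
    """Simplified model of an eight-entry register stack (illustrative only)."""

    SIZE = 8

    def __init__(self):
        self.regs = [0.0] * self.SIZE   # physical register array
        self.top = 0                    # TOS pointer: physical index of ST(0)

    def _phys(self, i):
        """Map a stack-relative index to a physical register index."""
        return (self.top + i) % self.SIZE

    def st(self, i):
        """Read the element i positions below the top of the stack."""
        return self.regs[self._phys(i)]

    def set_st(self, i, value):
        """Write the element i positions below the top of the stack."""
        self.regs[self._phys(i)] = value

    def fxch(self, i=1):
        """Exchange the TOS with element i; the default is the adjacent element."""
        a, b = self._phys(0), self._phys(i)
        self.regs[a], self.regs[b] = self.regs[b], self.regs[a]
```

Note that in this model the exchange moves register contents while leaving the TOS pointer unchanged.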
The top of the stack is identified by the TOS pointer. Stack entries are pushed and popped by the execution of some floating point instructions and data load and store instructions. Since these instructions depend on programming of the processor, floating point overflows and underflows can occur and must be trapped, generating an exception condition. An exception condition, like a mispredicted branch, requires restoration of the speculative state of the processor.
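One way such conditions might be detected is sketched below, assuming that the overflow and underflow at issue are stack faults, that each register carries a valid bit, and that a push decrements the TOS pointer. The names TaggedStack and StackFault are hypothetical.

```python
class StackFault(Exception):
    """Raised in this sketch where a processor would take an exception."""


class TaggedStack:
    """Eight-entry circular stack with one valid bit per register."""

    SIZE = 8

    def __init__(self):
        self.regs = [None] * self.SIZE
        self.valid = [False] * self.SIZE
        self.top = 0

    def push(self, value):
        """Decrement the TOS pointer and write; fault if the slot is occupied."""
        new_top = (self.top - 1) % self.SIZE
        if self.valid[new_top]:
            raise StackFault("overflow")
        self.top = new_top
        self.regs[new_top] = value
        self.valid[new_top] = True

    def pop(self):
        """Read the TOS and increment the pointer; fault if the slot is empty."""
        if not self.valid[self.top]:
            raise StackFault("underflow")
        value = self.regs[self.top]
        self.valid[self.top] = False
        self.top = (self.top + 1) % self.SIZE
        return value
```

In this sketch a faulting push or pop leaves the pointer and registers unchanged, so the fault itself adds no state to be undone.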
One consequence of the FXCH instruction is that it introduces variability into the order of stack elements, which complicates restoration of the stack following a mispredicted branch or exception.
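The difficulty can be seen in a small example, given here as an assumption-laden sketch rather than a description of any actual processor. Three stack elements are live, a speculative path performs a push followed by an exchange with the element two positions below the TOS, and recovery rewinds only the saved TOS pointer.

```python
SIZE = 8

# State at the predicted branch: ST(0)=10, ST(1)=20, ST(2)=30, rest empty.
regs = [10, 20, 30, 0, 0, 0, 0, 0]
top = 0
original_view = [regs[(top + i) % SIZE] for i in range(3)]
saved_top = top                      # checkpoint of the pointer only

# Speculative path: push 99 (decrement pointer, then write).
top = (top - 1) % SIZE
regs[top] = 99

# Speculative path: exchange the TOS with the element two positions below it.
a, b = top, (top + 2) % SIZE
regs[a], regs[b] = regs[b], regs[a]

# Misprediction recovery that restores the pointer but not the ordering.
top = saved_top
restored_view = [regs[(top + i) % SIZE] for i in range(3)]

print(original_view)   # [10, 20, 30]
print(restored_view)   # [10, 99, 30]
```

Without the exchange, rewinding the pointer would have been enough in this example, because the push wrote only to a register that was empty at the checkpoint. The exchange moved a speculative value into a register that was live before the branch, so the ordering of the elements must be recovered as well.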
In a superscalar processor, mispredicted branches and exceptions occur, making effective recovery and restart procedures desirable. What is sought is a stack, and a method of operating a stack, for simply and rapidly restoring the state of the stack.