1. Field of the Invention
This invention is related to the field of microprocessors, and more particularly, to performing data-speculative execution in a microprocessor.
2. Description of the Related Art
Superscalar microprocessors achieve high performance by executing multiple instructions concurrently and by using the shortest possible clock cycle consistent with their design. However, data and control flow dependencies between instructions may limit how many instructions may be issued at any given time. As a result, some microprocessors support speculative execution in order to achieve additional performance gains.
One type of speculation is control flow speculation. Control flow speculation predicts the direction in which program control will proceed. For example, branch prediction may be used to predict whether a branch will be taken. Many types of branch prediction are available, ranging from methods that simply make the same prediction each time to those that maintain sophisticated histories of the previous branches in the program in order to make a history-based prediction. Branch prediction may be facilitated through hardware optimizations, compiler optimizations, or both. Based on the prediction provided by the branch prediction mechanism, instructions may be speculatively fetched and executed. When the branch instruction is finally evaluated, the branch prediction can be verified. If the prediction was incorrect, any instructions that were speculatively executed based on the incorrect prediction may be quashed.
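As an illustration of a simple history-based scheme, the following sketch models a table of two-bit saturating counters indexed by branch address. It is a simplified software model offered under stated assumptions, not a description of any particular microprocessor: the names `TwoBitPredictor` and `count_mispredictions`, the table size, and the initial counter value are all hypothetical.

```python
class TwoBitPredictor:
    """Hypothetical two-bit saturating-counter branch predictor (illustrative only)."""

    def __init__(self, table_size=16):
        # Assumed starting state: every counter is 1 ("weakly not taken").
        # Counter values 2 and 3 predict "taken"; 0 and 1 predict "not taken".
        self.table_size = table_size
        self.counters = [1] * table_size

    def _index(self, branch_address):
        # Assumed indexing scheme: low-order part of the branch address.
        return branch_address % self.table_size

    def predict(self, branch_address):
        return self.counters[self._index(branch_address)] >= 2

    def update(self, branch_address, taken):
        # Move the counter toward the observed outcome, saturating at 0 and 3.
        i = self._index(branch_address)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def count_mispredictions(predictor, branch_address, outcomes):
    """Replay actual branch outcomes and count incorrect predictions."""
    mispredictions = 0
    for taken in outcomes:
        if predictor.predict(branch_address) != taken:
            # In hardware, speculatively executed instructions would be quashed here.
            mispredictions += 1
        predictor.update(branch_address, taken)
    return mispredictions
```

For a loop branch that is taken nine times and then falls through, this model mispredicts only the first and last iterations, which suggests why even a small amount of history may outperform a fixed prediction on branches whose behavior changes.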
Another type of speculation that has been proposed is data speculation. For example, value prediction, which predicts the value of data items, may involve observing patterns in data and basing the prediction on those patterns (e.g., an index counter variable's value may be predicted by observing how prior values of that variable are incremented or decremented). Address prediction involves predicting the location of data. Yet another type of data speculation is called memory system optimism. In multiprocessor systems, memory system optimism occurs when a processor speculatively executes an instruction using data from that processor's local cache before coherency checking is complete. Similarly, another type of data speculation may allow a load to speculatively execute before a store that has an uncomputed address at the time the load executes, even though the store may store data to the same address that the load accesses. In all of these types of data speculation, the underlying conditions are eventually evaluated, allowing the speculation to be verified or undone. If the speculation ends up being incorrect, the instructions that executed using the speculative data may be re-executed (e.g., with updated and/or non-speculative data).
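The index counter example above may be sketched with a stride-based value predictor, which guesses the next value by adding the most recently observed difference to the last value seen. This is a minimal software model under assumed names (`StrideValuePredictor` and `run` are hypothetical), and it stands in for whatever prediction and recovery hardware an actual design would use.

```python
class StrideValuePredictor:
    """Hypothetical stride predictor: next value = last value + last stride."""

    def __init__(self):
        self.last_value = None
        self.stride = None

    def predict(self):
        # No prediction is made until two values have been observed.
        if self.last_value is None or self.stride is None:
            return None
        return self.last_value + self.stride

    def update(self, actual_value):
        # Learn the stride from the two most recent actual values.
        if self.last_value is not None:
            self.stride = actual_value - self.last_value
        self.last_value = actual_value


def run(values):
    """Classify each actual value as unpredicted, verified, or misspeculated."""
    predictor = StrideValuePredictor()
    results = []
    for actual in values:
        predicted = predictor.predict()
        if predicted is None:
            results.append("no prediction")
        elif predicted == actual:
            results.append("verified")
        else:
            # Dependent instructions would be re-executed with the actual value.
            results.append("re-execute")
        predictor.update(actual)
    return results
```

Calling `run([0, 4, 8, 12, 20])` yields two unpredicted values, two verified predictions, and a final misspeculation when the stride changes from 4 to 8, corresponding to the verify-or-re-execute outcome described above.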
Since speculation allows execution to proceed without waiting for dependency checking to complete, significant performance gains may be achieved if the performance gained from correct speculations exceeds the performance lost to incorrect speculations. Accordingly, it is desirable to be able to perform data speculation in a microprocessor and to provide an efficient recovery mechanism for misspeculations.
Many processors require that a portion of main memory called a “stack” be available during operation. Early x86 microprocessors used the stack to save state information while handling exceptions and interrupts. Memory locations within the stack portion of main memory may be accessed using a stack segment and stack pointer (SS:SP or SS:ESP) register pair. The 16-bit SS (stack segment) register defines the base address of the portion of main memory containing the stack (i.e., the address of the “bottom” of the stack). The 16-bit SP (stack pointer) register may provide the offset of the current “top” of the stack from the base address. More modern x86 processors have a 32-bit ESP (extended stack pointer) register.
The stack is implemented as a last-in, first-out (LIFO) storage mechanism. The top of the stack is the storage location containing the data most recently stored within the stack. Data is “pushed” onto the stack (i.e., stored at the top of the stack) and “popped” from the stack (i.e., removed from the top of the stack). As data is pushed onto the stack, the ESP register is typically decremented. In other words, the x86 stack typically grows in a downward direction from the base address. When the stack is popped, the data removed is the data most recently pushed onto the stack.
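The push and pop behavior described above can be sketched as a toy model in which each push decrements a stack pointer before storing and each pop loads before incrementing. The class name `Stack32`, the 4-byte entry size, and the initial pointer value are assumptions made for illustration; segmentation and the SS register are omitted.

```python
class Stack32:
    """Toy model of a downward-growing stack of 4-byte entries (illustrative only)."""

    ENTRY_SIZE = 4  # assumed 32-bit operand size

    def __init__(self, initial_esp=0x1000):
        self.memory = {}        # sparse model of memory: offset -> stored value
        self.esp = initial_esp  # models the ESP offset of the current top of stack

    def push(self, value):
        # Decrement first, then store at the new top of the stack.
        self.esp -= self.ENTRY_SIZE
        self.memory[self.esp] = value

    def pop(self):
        # Load from the top of the stack, then increment past the removed entry.
        # Popping an empty stack raises KeyError in this model.
        value = self.memory.pop(self.esp)
        self.esp += self.ENTRY_SIZE
        return value
```

Pushing two values onto a `Stack32` created with the default pointer leaves `esp` at `0xFF8`, and two subsequent pops return the values in reverse order and restore `esp` to `0x1000`, which is the LIFO behavior described above.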
The x86 architecture includes a relatively small number of registers which may be used to store data manipulated during software program execution. As a result, data used during software program execution is often stored within the stack. Accessibility of data stored within the stack is thus particularly important in achieving high microprocessor performance. On the other hand, the stack is a portion of the main memory, and accesses to the main memory are relatively slow. It would therefore be desirable to speed access to the stack portion of main memory.