A processor (such as a microprocessor) processes instructions according to an instruction set architecture. The processing comprises fetching, decoding, and executing the instructions. Some instruction set architectures define a programming model where fetching, decoding, executing, and any other functions for processing an instruction are apparently performed in strict order, beginning after the functions for all prior instructions have completed, and completing before any functions of a successor instruction have begun. Such an instruction set architecture provides a programming model where instructions are executed in program order.
Some processors process instructions in various combinations of overlapped (or non-overlapped), parallel (or serial), and speculative (or non-speculative) manners, for example using pipelining in functional units, superscalar issue, and out-of-order execution. Thus some processors are enabled to execute instructions and access memory in an order that differs from the program order of the programming model. Nevertheless, the processors are constrained to produce results consistent with results that would be produced by processing instructions entirely in program order.
In some instruction set architectures, instructions are characterized as being either sequential or non-sequential, the latter specifying a change in control flow (such as a branch). Processing after a sequential instruction implicitly continues with a next instruction that is contiguous with the sequential instruction, while processing after a control flow instruction continues either with the contiguous next instruction or with another next instruction (frequently non-contiguous) as specified by the control flow instruction.
Some instruction set architectures define one or more conditions that are exceptions that alter the normal sequence of instructions, above and beyond sequential and non-sequential instruction control flow. Examples of exceptions comprise an interrupt for a peripheral device, an overflow for an arithmetic calculation, a protection violation for a memory access, and a breakpoint for debugging. An instruction set architecture that requires exceptions to be handled consistently with the program order provides precise exceptions. In addition to exceptions defined by the architecture, in some situations a processor processes similar events that are specific to an implementation, although transparent to the programming model. For example, a processor that predicts branches to execute instructions speculatively also handles incorrect branch predictions. A variety of techniques, such as reorder buffers and history buffers, have been applied to implement precise exceptions for processors that execute instructions in overlapped, parallel, and speculative manners. For example, see “Implementing Precise Interrupts in Pipelined Processors” by J. E. Smith and A. R. Pleszkun in IEEE Transactions on Computers, vol. 37, no. 5 (May 1988), pages 562-573.
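The reorder buffer technique mentioned above can be illustrated with a minimal sketch. The class name, the dictionary-based entries, and the register names below are hypothetical and chosen only for illustration; they do not describe any particular processor. The sketch assumes that results complete in any order but update architectural state only in program order, so that a fault reaching the head of the buffer discards all younger instructions.

```python
from collections import deque

class ReorderBuffer:
    """Illustrative reorder buffer: out-of-order completion, in-order retirement."""

    def __init__(self):
        self.entries = deque()  # program order, oldest entry at the left
        self.regs = {}          # architectural register state

    def allocate(self, dest):
        # Entries are allocated in program order when instructions issue.
        entry = {"dest": dest, "value": None, "done": False, "fault": None}
        self.entries.append(entry)
        return entry

    def complete(self, entry, value=None, fault=None):
        # Execution may finish in any order; the result waits in the buffer.
        entry["value"] = value
        entry["fault"] = fault
        entry["done"] = True

    def retire(self):
        """Commit finished entries from the head, in program order.

        Returns the fault name if a faulting entry reaches the head,
        otherwise None.
        """
        while self.entries and self.entries[0]["done"]:
            head = self.entries.popleft()
            if head["fault"] is not None:
                self.entries.clear()  # discard all younger instructions
                return head["fault"]
            self.regs[head["dest"]] = head["value"]
        return None
```

In this sketch, if the youngest of three instructions completes first and the middle one faults, the architectural state after retirement reflects only the oldest instruction, which is the behavior a precise exception requires.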
Some instruction set architectures comprise flags that monitor conditions associated with some instructions, and the flags also control aspects of execution of some instructions. For example, an instruction performs an add operation, modifying a carry flag to indicate whether there was a carry out from the result. A subsequent instruction performs an add-with-carry operation that uses the carry flag as carry input to the addition calculation. In some instruction set architectures additional flags indicate other conditions, such as whether a calculated result is negative, zero, or positive. Some processors implement mechanisms to provide flags for an X86-compatible instruction set architecture (for example, see U.S. Pat. No. 5,632,023 issued to White et al.).
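As an illustration of the add and add-with-carry example above, the following sketch models a carry flag for 32-bit operands. The function names and the 32-bit width are assumptions made for the example and are not drawn from any specific instruction set architecture.

```python
MASK32 = 0xFFFFFFFF

def add(a, b):
    """32-bit add. Returns (result, carry_flag)."""
    total = (a & MASK32) + (b & MASK32)
    return total & MASK32, total >> 32

def adc(a, b, carry_in):
    """32-bit add-with-carry. Uses carry_in as the carry input."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32

def add64(a, b):
    """64-bit add built from a 32-bit add followed by an add-with-carry."""
    lo, carry = add(a & MASK32, b & MASK32)   # low halves set the carry flag
    hi, carry = adc(a >> 32, b >> 32, carry)  # high halves consume it
    return (hi << 32) | lo, carry
```

The second operation depends on the flag produced by the first, which is why the relative order of flag-modifying and flag-consuming instructions matters.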
Some instruction set architectures (such as an X86-compatible architecture) comprise complex instructions. Some microprocessor implementations comprise translation hardware to convert the instructions (including complex instructions) into sequences of one or more relatively simpler operations referred to as micro-operations. Additionally, certain implementations store sequences of micro-operations that correspond to one or more instructions in a cache, such as a trace cache. For example, Intel's Pentium 4 microprocessor, as described by Hinton et al. (in “The Microarchitecture Of The Pentium 4 Processor”, Intel Technology Journal Q1, 2001), has a trace cache.
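The translation and caching described above can be sketched as follows. The tuple encoding of instructions, the micro-operation names, and the TraceCache class are hypothetical simplifications; in particular, the sketch keys each trace by start address only, which is an assumption and not a description of the Pentium 4 or any other implementation.

```python
def translate(instr):
    """Expand one (op, dst, src) instruction into a list of micro-operations."""
    op, dst, src = instr
    if dst.startswith("["):
        # Memory destination: a complex instruction becomes load, operate, store.
        return [("load", "tmp", dst),
                (op, "tmp", src),
                ("store", dst, "tmp")]
    # Register destination: a simple instruction maps to one micro-operation.
    return [(op, dst, src)]

class TraceCache:
    """Illustrative cache of micro-operation sequences keyed by start address."""

    def __init__(self):
        self.traces = {}       # start address -> list of micro-operations
        self.translations = 0  # number of times translation was performed

    def fetch(self, start, instrs):
        # Translate on a miss; on a hit, reuse the stored micro-operations.
        if start not in self.traces:
            self.translations += 1
            self.traces[start] = [u for i in instrs for u in translate(i)]
        return self.traces[start]
```

A second fetch from the same start address returns the stored sequence without repeating the translation.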
Furthermore, it has been proposed to optimize the micro-operations that correspond to a trace, such as by combining, reordering, or eliminating micro-operations. For example, see “Putting the Fill Unit to Work: Dynamic Optimizations for Trace Cache Microprocessors” by Friendly et al. in Proceedings of the 31st Annual ACM/IEEE International Symposium on Microarchitecture, pages 173-181. Sometimes the micro-operation optimizing blurs sequencing and boundaries of instructions along with associated status flag modifications. For example, status flag modifications may be reordered or eliminated.
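One way such an optimization can eliminate status flag modifications is sketched below. This is an illustrative dead-flag-write pass, not the method of the cited reference; the dictionary fields "writes_flags" and "reads_flags" are hypothetical, and the sketch assumes that flags are needed after the trace ends.

```python
def drop_dead_flag_writes(trace):
    """Return a copy of the trace with dead flag writes removed.

    A flag write is dead when a later micro-operation overwrites the
    flags before any micro-operation reads them.
    """
    flags_live = True  # assumption: flags are observed after the trace
    out = []
    for uop in reversed(trace):
        uop = dict(uop)  # leave the caller's micro-operations unmodified
        if uop.get("writes_flags"):
            if not flags_live:
                uop["writes_flags"] = False  # nobody observes this write
            flags_live = False  # earlier writes are hidden by this one
        if uop.get("reads_flags"):
            flags_live = True   # this micro-operation needs earlier flags
        out.append(uop)
    out.reverse()
    return out
```

After such a pass, a micro-operation may no longer update the flags that its original instruction would have updated, which is one way instruction boundaries and flag modifications become blurred within an optimized trace.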
All of the foregoing patents and references are hereby incorporated by reference for all purposes.