1. Field of the Invention
The present invention relates to decoders for processors that have pipelined decoders and execution units.
2. Description of the Related Art
Driven by demand for faster processors, microprocessor manufacturers are continually developing new designs. Each new processor usually provides better performance (i.e., it must be faster than previous processors), introduces new capabilities or expands upon pre-existing capabilities, and/or has reduced cost. Furthermore, it is preferable that the new processor support all previously-supported instructions for that type of processor. Compatibility is an extremely valuable characteristic, for without existing software a new processor would not be immediately useful.
In order to increase processing speed of computer processors, "pipelined" structures have been developed. A pipelined computer processor includes a plurality of stages that operate independently of each other. When a first instruction has completed execution in one of the pipeline stages, the instruction moves on to the next stage, and a second instruction moves into the stage vacated by the first instruction. Processing speed is increased because multiple instructions are processed simultaneously in different stages of the pipeline. For example, a ten-stage pipeline can simultaneously process ten instructions in ten separate stages. In order to further increase processing speed, processors termed "superscalar" processors have been designed with multiple pipelines that process instructions simultaneously when adjacent instructions have no data dependencies between them. Even greater parallelism and higher performance can be provided by an out-of-order processor that includes multiple parallel units, in which instructions are processed in parallel in any efficient order that takes advantage of whatever opportunities for parallel processing may be provided by the instruction code.
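By way of illustration only, the overlap described above can be sketched in a few lines of Python. The function name simulate_pipeline and its parameters are hypothetical and are not taken from any processor described herein; the sketch assumes a single in-order pipeline in which every stage takes exactly one cycle and no stalls occur.

```python
def simulate_pipeline(instructions, num_stages):
    """Return one snapshot per cycle; each snapshot lists the instruction
    occupying each stage, first stage first (None marks an empty stage).

    Assumption for illustration: every stage takes one cycle, no stalls.
    """
    stages = [None] * num_stages
    pending = list(instructions)
    snapshots = []
    while pending or any(s is not None for s in stages):
        # Every instruction advances one stage; the instruction in the
        # last stage leaves the pipeline, and the next pending
        # instruction (if any) enters the vacated first stage.
        entering = pending.pop(0) if pending else None
        stages = [entering] + stages[:-1]
        if any(s is not None for s in stages):
            snapshots.append(list(stages))
    return snapshots


if __name__ == "__main__":
    for cycle, snapshot in enumerate(simulate_pipeline(["I1", "I2", "I3"], 3), 1):
        print(cycle, snapshot)
```

Under these assumptions, N instructions pass through an S-stage pipeline in N + S - 1 cycles rather than the N x S cycles that would be needed if each instruction had to finish before the next one began; for ten instructions in a ten-stage pipeline, that is 19 cycles rather than 100, and in the tenth cycle all ten stages are occupied at once.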
As pipelines increase in complexity, it has become increasingly difficult for the early stages (sometimes referred to as the "front-end") to communicate with the later stages (sometimes referred to as the "back-end"). The front-end may include stages such as instruction fetch and decode stages, and the back-end may include execution stages and retirement stages. It would be useful to have a mechanism that allows the front-end to communicate with the back-end in an efficient and precise manner.
Precise timing of the communication is of critical importance, particularly for microprocessors that are designed for instruction sets in which precise timing of exceptions and other events is essential to preserve compatibility with previous processors. One example of an instruction set requiring precise fault timing is the well-established software written for the INTEL family of processors beginning with the 8086 and continuing with the 80286, i386™, 80486, and the Pentium™ processors. When introduced, the fault model of each of those processors was compatible with software written for previous models, so that the existing software base was fully usable. The term "INTEL instruction set" refers to software written for those processors.
As the INTEL instruction set developed, its capabilities expanded, but so did its complexity. The INTEL instruction set is now very complex: it includes variable-length instructions that may include prefixes, and a complex, segmented, paged memory management system. Designing a pipelined processor to handle all of these complex instructions, as well as dealing with faults, exceptions, cache misses, self-modifying code, and other problems in a precise manner, is becoming increasingly difficult.
It would be an advantage to provide a mechanism that a microprocessor designer could use to communicate between the front-end and back-end. Such a mechanism could handle a number of difficult problems that arise during decoding and later processing in the pipeline. For example, such a system could be useful to handle illegal opcodes, special instruction prefixes, code breakpoints, TLB faults, and self-modifying code while maintaining compatibility with the precise fault model of the INTEL instruction set. The flexibility afforded by such a tool could provide an efficient mechanism to ensure compatibility with previous processors.