1. Field of the Invention
This invention is related to the field of microprocessors, and more particularly, to microprocessors having trace caches.
2. Description of the Related Art
Instructions processed in a microprocessor are encoded as a sequence of ones and zeros. For some microprocessor architectures, instructions may be encoded with a fixed length, such as a certain number of bytes. For other architectures, the length of instructions may vary. The x86 microprocessor architecture, for example, specifies a variable-length instruction set (i.e., an instruction set in which different instructions are specified by differing numbers of bytes). The 80386 and later versions of x86 microprocessors employ between 1 and 15 bytes to specify a particular instruction. Each instruction has an opcode, which may be 1-2 bytes, and additional bytes may be added to specify addressing modes, operands, and additional details regarding the instruction to be executed.
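The practical difference between fixed-length and variable-length encodings can be illustrated with a minimal sketch. The encoding used here is hypothetical and is not the x86 format: it assumes, purely for illustration, that the low four bits of the first byte of each instruction give that instruction's total length in bytes. The function names `split_fixed` and `split_variable` are likewise illustrative.

```python
def split_fixed(code: bytes, length: int) -> list[bytes]:
    """Split a byte stream into instructions that all have the same length."""
    if length < 1 or len(code) % length != 0:
        raise ValueError("stream is not a whole number of fixed-length instructions")
    # Every boundary is known in advance: instruction i starts at i * length.
    return [code[i:i + length] for i in range(0, len(code), length)]


def split_variable(code: bytes) -> list[bytes]:
    """Split a byte stream under a TOY variable-length encoding (not x86).

    Assumption: the low 4 bits of an instruction's first byte hold its total
    length in bytes, from 1 to 15.
    """
    instructions = []
    pos = 0
    while pos < len(code):
        length = code[pos] & 0x0F
        if length == 0 or pos + length > len(code):
            raise ValueError(f"malformed instruction at offset {pos}")
        instructions.append(code[pos:pos + length])
        # The start of the next instruction is known only after this one
        # has been examined.
        pos += length
    return instructions
```

In the fixed-length case every instruction boundary can be computed independently, whereas in the variable-length case each boundary depends on the instruction before it.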
In some microprocessor architectures, each instruction may be decoded into one or more simpler operations prior to execution. Decoding an instruction may also involve accessing a register renaming map in order to determine the physical register to which each logical register in the instruction maps and/or to allocate a physical register to store the result of the instruction.
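The register renaming step described above can be sketched as follows. This is a simplified model under stated assumptions, not a description of any particular microprocessor: the class name `RenameMap`, the free-list allocation policy, and the register names used in the example are all hypothetical, and the sketch never returns physical registers to the free list.

```python
class RenameMap:
    """Toy register renaming map: logical register name -> physical register index."""

    def __init__(self, logical_regs, num_physical):
        if num_physical < len(logical_regs):
            raise ValueError("need at least one physical register per logical register")
        # Initially, logical register i maps to physical register i.
        self.mapping = {name: i for i, name in enumerate(logical_regs)}
        # The remaining physical registers are available for allocation.
        self.free_list = list(range(len(logical_regs), num_physical))

    def rename(self, dest, sources):
        """Return (physical destination, physical sources) for one instruction."""
        # Sources are looked up first, so an instruction that reads and writes
        # the same logical register still reads the older physical register.
        phys_sources = [self.mapping[s] for s in sources]
        if not self.free_list:
            raise RuntimeError("no free physical register")
        # Allocate a fresh physical register to hold the result.
        phys_dest = self.free_list.pop(0)
        self.mapping[dest] = phys_dest
        return phys_dest, phys_sources
```

For example, with logical registers `eax`, `ebx`, and `ecx` and six physical registers, renaming an instruction that reads `eax` and `ebx` and writes `eax` yields physical sources 0 and 1 and a newly allocated physical destination 3; a later instruction that reads `eax` is then directed to physical register 3.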
In general, the bandwidth of the instruction fetch and decode portions of a microprocessor may determine whether the execution cores are fully utilized during each execution cycle. Accordingly, it is desirable to provide enough bandwidth in the instruction fetch and decode portions of the microprocessor to keep the execution cores as fully supplied with work as possible.
Typically, instructions are fetched from system memory into an instruction cache in contiguous blocks. The instructions included in these blocks are stored in the instruction cache in compiled order. During program execution, instructions are often executed in an order that differs from compiled order. For example, when a branch is taken within the code, instructions that are non-sequential in compiled order may be executed sequentially. In such cases, the instructions following the taken branch generally cannot be fetched from the instruction cache during the same cycle as the branch instruction because they are stored in non-contiguous locations. To attempt to overcome this instruction fetch bandwidth limitation, a superscalar microprocessor may incorporate a trace cache.
A trace cache differs from an instruction cache in that instructions stored in the trace cache are typically stored in execution order as opposed to compiled order. Storing operations in execution order allows a trace containing one or more taken branch operations to be accessed from the trace cache during a single cycle, whereas accessing the same sequence from the instruction cache would require several cycles.
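The fetch bandwidth difference can be illustrated with a simplified cycle-count model. The model rests on several assumptions that are not drawn from any particular design: every instruction occupies one address unit, an instruction cache access delivers only instructions at consecutive addresses, every trace cache access hits, and the function names and widths are hypothetical.

```python
def icache_fetch_cycles(executed_addresses, fetch_width):
    """Cycles needed to fetch a dynamic instruction stream from an instruction cache.

    Assumption: one access delivers at most fetch_width instructions, and only
    instructions at consecutive addresses (compiled order).
    """
    if fetch_width < 1:
        raise ValueError("fetch_width must be at least 1")
    cycles = 0
    i = 0
    n = len(executed_addresses)
    while i < n:
        cycles += 1
        fetched = 1
        # Extend the fetch group while the next executed instruction is the
        # next one in compiled order; a taken branch ends the group.
        while (i + fetched < n
               and fetched < fetch_width
               and executed_addresses[i + fetched] == executed_addresses[i + fetched - 1] + 1):
            fetched += 1
        i += fetched
    return cycles


def trace_cache_fetch_cycles(executed_addresses, trace_width):
    """Cycles needed when the same stream is stored in execution order.

    Assumption: every access hits and delivers up to trace_width instructions,
    regardless of gaps between their original addresses.
    """
    if trace_width < 1:
        raise ValueError("trace_width must be at least 1")
    return -(-len(executed_addresses) // trace_width)  # ceiling division
```

For an executed stream at addresses 0, 1, 2, 10, 11, 20, 21, 22 (two taken branches) and a width of eight, this model requires three instruction cache accesses but only one trace cache access.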