1. Field of the Invention
This invention is related to the field of microprocessors, and more particularly, to microprocessors having trace caches.
2. Description of the Related Art
Instructions processed in a microprocessor are encoded as a sequence of ones and zeros. For some microprocessor architectures, instructions may be encoded with a fixed length, such as a certain number of bytes. For other architectures, the length of instructions may vary. The x86 microprocessor architecture, for example, specifies a variable-length instruction set (i.e., an instruction set in which various instructions are each specified by differing numbers of bytes). The 80386 and later versions of x86 microprocessors employ between 1 and 15 bytes to specify a particular instruction. Each instruction has an opcode, which may be 1-2 bytes, and additional bytes may be added to specify addressing modes, operands, and additional details regarding the instruction to be executed.
In some microprocessor architectures, each instruction may be decoded into one or more simpler operations prior to execution. Decoding an instruction may also involve accessing a register renaming map in order to determine the physical register to which each logical register in the instruction maps and/or to allocate a physical register to store the result of the instruction.
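The register renaming lookup described above can be illustrated with a minimal sketch. The class name RenameMap, the function rename, the free-list allocation policy, and the register names are assumptions made for illustration only; they do not describe any particular microprocessor, and the sketch omits the reclamation of physical registers.

```python
class RenameMap:
    """Hypothetical register renaming map: logical register -> physical register."""

    def __init__(self, num_physical):
        self.mapping = {}                            # current logical-to-physical map
        self.free_list = list(range(num_physical))   # unallocated physical registers

    def lookup(self, logical):
        # Determine the physical register to which a source logical register maps.
        return self.mapping[logical]

    def allocate(self, logical):
        # Allocate a physical register to store the result written to `logical`.
        # Assumption: registers are never freed in this sketch.
        phys = self.free_list.pop(0)
        self.mapping[logical] = phys
        return phys


def rename(rmap, dest, sources):
    """Rename one operation: read source mappings first, then allocate the destination."""
    phys_sources = [rmap.lookup(s) for s in sources]
    phys_dest = rmap.allocate(dest)
    return phys_dest, phys_sources
```

For example, after allocating physical registers 0 and 1 to the logical registers "eax" and "ebx", renaming an operation that reads both and writes "eax" yields sources [0, 1] and a newly allocated destination register 2, so later readers of "eax" see register 2.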
Typically, instructions are fetched from system memory into an instruction cache in contiguous blocks. The instructions included in these blocks are stored in the instruction cache in compiled order. During program execution, instructions are often executed in a different order, such as when a branch is taken within the code. In such cases, the instructions following the taken branch generally cannot be fetched from the instruction cache during the same cycle as the branch instruction because they are stored in non-contiguous locations. To attempt to overcome this instruction fetch bandwidth limitation, many superscalar microprocessors incorporate a trace cache.
A trace cache differs from an instruction cache in that instructions stored in the trace cache are typically stored in execution order as opposed to compiled order. Storing operations in execution order allows an instruction sequence containing a taken branch operation to be accessed during a single cycle from the trace cache, whereas accessing the same sequence from the instruction cache would require several cycles.
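The difference in fetch bandwidth can be illustrated with a simplified cycle-count model. Everything in the sketch is an assumption made for illustration: addresses are counted in units of instructions, an instruction cache fetch delivers only sequential instructions within one line of LINE_SIZE instructions, a trace cache is modeled as a dictionary keyed by the starting address of each trace, and a trace cache miss is charged one cycle per instruction.

```python
LINE_SIZE = 4  # hypothetical number of instructions per instruction cache line


def icache_fetch_cycles(executed_addresses, line_size=LINE_SIZE):
    """Cycles to fetch a dynamic instruction sequence from an instruction cache.

    Assumption: each cycle delivers instructions that are sequential in address
    and lie within a single cache line, so a taken branch ends the fetch.
    """
    cycles = 0
    i = 0
    n = len(executed_addresses)
    while i < n:
        cycles += 1
        line = executed_addresses[i] // line_size
        i += 1
        while (i < n
               and executed_addresses[i] == executed_addresses[i - 1] + 1
               and executed_addresses[i] // line_size == line):
            i += 1
    return cycles


def trace_cache_fetch_cycles(trace_cache, executed_addresses):
    """Cycles to fetch the same sequence when traces are stored in execution order.

    `trace_cache` maps a trace's starting address to the list of instruction
    addresses in that trace. A hit delivers the whole trace in one cycle.
    """
    cycles = 0
    i = 0
    n = len(executed_addresses)
    while i < n:
        cycles += 1
        trace = trace_cache.get(executed_addresses[i])
        if trace is not None and executed_addresses[i:i + len(trace)] == trace:
            i += len(trace)   # hit: entire trace fetched this cycle
        else:
            i += 1            # miss: pessimistic one-instruction fallback
    return cycles
```

Under these assumptions, the sequence 0, 1, 2, 9, 10 (a taken branch from address 2 to address 9) takes two cycles from the instruction cache but one cycle when a trace starting at address 0 holds the same five instructions.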
Superscalar microprocessors typically decode multiple instructions per clock cycle. The amount of hardware needed to match the address of each instruction within a group being decoded against the starting addresses of traces in the trace cache may be prohibitive. This hardware cost may greatly increase the difficulty of determining a hit in the trace cache in some cases.
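The growth in matching hardware can be illustrated with a short sketch that counts address comparisons. The function name find_trace_hits and the flat list of candidate trace starting addresses are assumptions for illustration; in hardware, each counted comparison would correspond to a comparator operating in parallel rather than to a sequential loop iteration.

```python
def find_trace_hits(group_addresses, trace_start_addresses):
    """Compare every instruction address in a decode group with every
    candidate trace starting address.

    Returns the addresses that match a trace start and the number of
    comparisons performed (group size times number of candidates).
    """
    comparisons = 0
    hits = []
    for addr in group_addresses:
        for start in trace_start_addresses:
            comparisons += 1
            if addr == start:
                hits.append(addr)
    return hits, comparisons
```

With a group of four instructions and four candidate trace starting addresses, this sketch performs sixteen comparisons, and the count grows with the product of the two sizes.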