1. Field of the Invention
The present invention relates to the field of computer systems. More specifically, the present invention relates to the art of caching decoded micro-operations with trace segments having multiple entry points.
2. Description of the Related Art
Historically, cached instructions are stored and organized in an instruction cache in accordance with the instructions' memory addresses. Each cache line stores a number of instructions that are spatially adjacent to each other in main memory. This historic approach to caching instructions has at least one disadvantage in that it typically requires multiple cache lines to be accessed whenever execution of a program necessitates branching from the middle of a cache line or branching into the middle of a cache line.
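By way of illustration only, the following minimal sketch shows how a fetch that begins in the middle of an address-organized cache line can touch more than one line. The 32-byte line size, the 16-byte fetch width, and the helper name `lines_touched` are assumptions made for this example and are not taken from any particular processor.

```python
LINE_SIZE = 32  # bytes per cache line (assumed value for illustration)

def lines_touched(start_addr, num_bytes, line_size=LINE_SIZE):
    """Return the base addresses of the cache lines covering a byte range."""
    first = start_addr // line_size
    last = (start_addr + num_bytes - 1) // line_size
    return [n * line_size for n in range(first, last + 1)]

# A 16-byte fetch starting at a line boundary stays within one line.
aligned = lines_touched(0x1000, 16)

# The same fetch after branching into the middle of a line spans two lines.
mid_line = lines_touched(0x1018, 16)

print([hex(a) for a in aligned])
print([hex(a) for a in mid_line])
```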
An alternative approach to organizing cached instructions is known, whereby cached instructions are organized by instruction trace segments. Each cache line stores an instruction trace segment comprising one or more basic blocks of instructions that are predicted to be sequentially executed. For example, in an embodiment where each cache line comprises two basic blocks of instructions, the second basic block of instructions includes instructions to be executed if the branch instruction located at the end of the first basic block is taken. Assuming the branch is predicted taken, the second basic block is placed in the same cache line.
Additionally, each cache line also includes the information necessary to continue fetching instructions upon reaching the end of the last basic block in the cache line, or in the event that the branch instruction at the end of a basic block earlier in the cache line is not taken as predicted. Each instruction trace segment is accessible only through the first instruction of the first basic block in a cache line. Thus, to facilitate direct fetching of any basic block within a cache line, each such basic block is also cached in another cache line as the first basic block of that line. In other words, each basic block is cached twice.
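The single-entry organization described above may be sketched, under stated assumptions, as a table keyed by the first basic block of each line. The class names `TraceLine` and `SingleEntryTraceCache`, the `next_fetch` field, and the block names X, Y, Z, and W are hypothetical and serve only to illustrate why a block in the interior of a line must be cached a second time to be fetched directly.

```python
from dataclasses import dataclass

@dataclass
class TraceLine:
    blocks: list      # basic-block names in predicted execution order
    next_fetch: str   # where to continue fetching after the last block (assumed field)

class SingleEntryTraceCache:
    """Toy model: a line can be found only through its first basic block."""

    def __init__(self):
        self.lines = {}  # head block name -> TraceLine

    def fill(self, blocks, next_fetch):
        self.lines[blocks[0]] = TraceLine(list(blocks), next_fetch)

    def lookup(self, block):
        # Only the head of a line is a valid entry point.
        return self.lines.get(block)

cache = SingleEntryTraceCache()
cache.fill(["X", "Y"], next_fetch="Z")

hit_head = cache.lookup("X")    # found: X heads a line
miss_inner = cache.lookup("Y")  # not found: Y is in the line but is not its head

# To make Y directly fetchable, it is cached again as the head of another line.
cache.fill(["Y", "Z"], next_fetch="W")
copies_of_y = sum(line.blocks.count("Y") for line in cache.lines.values())
```

In this toy model, block Y ends up stored in two lines, which corresponds to the double caching noted above.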
A number of modern microprocessors employ micro-architectures to implement complex instruction set architectures, e.g., the Pentium® Pro processor produced by Intel Corp. of Santa Clara, Calif. Each macro-instruction of a complex instruction set is implemented by way of multiple micro-instructions, or micro-ops. Double caching is undesirable for caching micro-ops. Therefore, the historical spatial organization is typically employed for caching instructions.
For example, assume basic blocks A, B, C, and D represent code segments of an executing program. At one point in time, block A is executed, followed by blocks C and D. Accordingly, a trace segment ACD will be built. At a later time, block B is executed, followed by blocks C and D, which results in trace segment BCD being built. Because trace segment ACD can be accessed only through the first instruction of block A, the trace cache ignores the copies of blocks C and D already cached in ACD and builds the new trace segment BCD, which contains redundant copies of that code.
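The redundancy in this example can be counted with a short sketch. The function name `build_traces` and the representation of a trace segment as a tuple of block names are assumptions made for illustration. The model simply stores one segment per distinct head block, as a single-entry trace cache would.

```python
from collections import Counter

def build_traces(executions):
    """Toy single-entry trace cache: one stored segment per distinct head block."""
    cache = {}
    for path in executions:
        head = path[0]
        if head not in cache:  # a segment is reachable only through its head
            cache[head] = tuple(path)
    return cache

# Block A followed by C and D, then later block B followed by C and D.
traces = build_traces([["A", "C", "D"], ["B", "C", "D"]])

# Count how many stored copies of each block exist across all segments.
copies = Counter(block for segment in traces.values() for block in segment)
redundant = sum(n - 1 for n in copies.values())

print(dict(copies))
print(redundant)
```

Here six blocks are stored for four unique blocks, so two stored blocks (the second copies of C and D) are redundant.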
Thus, it is desirable to have a new approach for caching decoded micro-ops that allows trace segments to be entered from multiple entry points, thus reducing the degree of code redundancy present in the cache.