Modern high-end processors require high-throughput front ends in order to cope with increasing performance demands. Fetching a large number of useful instructions (e.g., instructions residing on the correct path) per cycle is a relatively easy task if those instructions are arranged sequentially. However, on average, there is a branch instruction every few instructions. Therefore, fetching a large number of useful instructions per cycle requires accurate prediction of multiple branch targets and the assembly of instructions from multiple non-contiguous cache blocks, which is a more difficult task.
One way of increasing the effectiveness of conventional fetch is to reduce the number of fetch discontinuities along the execution path of the program. Loop unrolling, code motion, and branch alignment are a few examples of compiler techniques that reduce the number of taken branches and, consequently, the number of instruction stream discontinuities during program execution. Special Instruction Set Architectures (ISAs) have also been proposed to better express such compiler optimizations to the hardware. A trace cache is an instruction memory component that stores instructions in the order in which they are executed rather than in their static order as defined by the program executable. The trace cache is becoming an important building block of modern, wide-issue processors. Trace caches have been shown to effectively increase the number of useful instructions that can be fetched into the machine (i.e., processor), thus increasing the average number of instructions the machine can execute each cycle. Trace cache research has focused on increasing fetch bandwidth for a given die area. Techniques that aim to improve instruction bandwidth significantly increase the number of traces built during execution. This degrades the performance or power efficiency of the system, since the trace cache consumes considerable power when active. Therefore, there is a need for a trace cache organization that reduces the number of traces built during execution without degrading performance or power.
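The contrast between conventional fetch and trace cache fetch described above can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not a description of any particular hardware: each instruction occupies one address, a conventional fetch cycle delivers only a run of consecutive addresses, trace storage is unbounded, a trace is built on every miss, and branch prediction is ignored. All function names and the example execution path are hypothetical.

```python
def sequential_run(path, i, width):
    """Length of the run of consecutive addresses starting at path[i], capped at width."""
    n = 1
    while n < width and i + n < len(path) and path[i + n] == path[i + n - 1] + 1:
        n += 1
    return n


def conventional_fetch_cycles(path, width):
    """Cycles needed when every fetch cycle stops at a discontinuity (taken branch)."""
    cycles = 0
    i = 0
    while i < len(path):
        i += sequential_run(path, i, width)
        cycles += 1
    return cycles


def trace_cache_fetch_cycles(path, width):
    """Return (cycles, traces_built) for a toy trace cache (assumed model).

    A trace holds up to `width` addresses in executed order, keyed by its
    start address. A hit delivers the whole trace in one cycle; a miss
    falls back to one conventional fetch cycle and builds a new trace.
    """
    traces = {}  # start address -> tuple of addresses in executed order
    cycles = 0
    built = 0
    i = 0
    while i < len(path):
        upcoming = tuple(path[i:i + width])
        trace = traces.get(path[i])
        if trace is not None and upcoming[:len(trace)] == trace:
            i += len(trace)              # hit: the trace spans the taken branches
        else:
            traces[path[i]] = upcoming   # miss: build (or rebuild) a trace
            built += 1
            i += sequential_run(path, i, width)
        cycles += 1
    return cycles, built


# Hypothetical loop: 10, 11, 12, taken branch to 20, 21, taken branch back to 10.
path = [10, 11, 12, 20, 21] * 4
print(conventional_fetch_cycles(path, 5))  # 8
print(trace_cache_fetch_cycles(path, 5))   # (5, 2)
```

In this example the conventional fetch needs two cycles per loop iteration because each taken branch ends the fetch group, while the toy trace cache delivers a whole iteration per cycle once a trace exists. The second value returned by `trace_cache_fetch_cycles` counts the traces built, which is the quantity the preceding paragraph identifies as a cost.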