In high-performance superscalar microprocessors, a decoded instruction cache (or trace cache) is used to improve performance. This type of instruction cache improves the bandwidth, throughput, and latency of the “fetch” and “decode” portions of the microprocessor by quickly sending packets of decoded macro-instructions (called micro-operations) into the core of the microprocessor. At the end of the pipeline that fetches and decodes macro-instructions, the micro-operations are typically assembled into packets and written into a trace cache on their way into an allocation pipeline.
For many applications, trace cache performance is strongly correlated with hit rate. Large trace cache arrays provide high hit rates but consume a great deal of power. General-purpose applications differ in how large a trace cache they need to realize their performance benefits. Some applications require only a small trace cache. For others, however, performance continues to improve as the size is increased.
If the trace cache is larger than a given application needs to achieve an acceptable level of performance, the over-allocation of cache resources consumes unnecessary power. If the trace cache is too small, the application may not achieve an acceptable level of performance. Additional cache resources can be added to improve performance; however, the increased power consumption may diminish the performance benefit.
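The dependence of hit rate on cache size can be illustrated with a minimal sketch. It is not a model of an actual trace cache: the LRU replacement policy, the uniformly random access pattern, the working-set sizes, and the names `lru_hit_rate` and `synthetic_trace` are all assumptions chosen for illustration. The number of entries stands in loosely for array size, and hence for power.

```python
import random
from collections import OrderedDict


def lru_hit_rate(trace, capacity):
    """Return the fraction of accesses in `trace` that hit in an
    LRU-managed cache holding at most `capacity` entries."""
    cache = OrderedDict()
    hits = 0
    for line in trace:
        if line in cache:
            hits += 1
            cache.move_to_end(line)        # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict least recently used
            cache[line] = True
    return hits / len(trace)


def synthetic_trace(working_set, length, seed):
    """Hypothetical workload: uniform random accesses over
    `working_set` distinct line identifiers."""
    rng = random.Random(seed)
    return [rng.randrange(working_set) for _ in range(length)]


SIZES = [16, 64, 256, 1024]  # cache capacities in entries (illustrative)
small_app = synthetic_trace(working_set=32, length=20000, seed=1)
large_app = synthetic_trace(working_set=2048, length=20000, seed=2)

for size in SIZES:
    print(size,
          round(lru_hit_rate(small_app, size), 3),
          round(lru_hit_rate(large_app, size), 3))
```

Under these assumptions, the small-working-set workload gains nothing once the cache holds its whole working set, so any capacity beyond that point is over-allocated. The large-working-set workload keeps improving across the range of sizes shown.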