Consumers continue to demand faster computers. Simultaneous multi-threading (SMT) is an effective way to boost throughput with limited impact on processor die area. SMT increases processor throughput by executing a plurality of processing threads in parallel. However, many software applications do not benefit from SMT.
In addition, the gap between processor and memory speed continues to widen. As a result, computer performance is increasingly determined by the effectiveness of the cache hierarchy. Prefetching is a well-known and effective technique for improving the effectiveness of the cache hierarchy. However, even with prefetching, processor workloads typically still incur a significant number of cache misses.