1. Field
The disclosed embodiments generally relate to techniques for improving performance in computer systems. More specifically, the disclosed embodiments relate to the design of a processor that includes a mechanism to filter out redundant software prefetch instructions, which access cache lines that have already been fetched from memory.
2. Related Art
As the gap between processor speed and memory performance continues to grow, prefetching is becoming an increasingly important technique for improving computer system performance. Prefetching involves pulling cache lines from memory and placing them into a cache before the cache lines are actually accessed by an application. This prevents the application from having to wait for a cache line to be retrieved from memory and thereby improves computer system performance.
Computer systems generally make use of two types of prefetching: software-controlled prefetching (referred to as “software prefetching”) and hardware-controlled prefetching (referred to as “hardware prefetching”). To support software prefetching, a compiler analyzes the data access patterns of an application at compile time and inserts software prefetch instructions into the executable code to prefetch cache lines before they are needed. In contrast, a hardware prefetcher operates by analyzing the actual data access patterns of an application at run time to predict which cache lines will be accessed in the near future, and then causes the processor to prefetch these cache lines.
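As an illustration of software prefetching only, the following C sketch shows the kind of code that results when a prefetch is placed ahead of a sequential access. It assumes a GCC- or Clang-compatible compiler, which provides the `__builtin_prefetch` builtin. The function name `sum_with_prefetch` and the constant `PREFETCH_DISTANCE` are hypothetical and do not appear in the disclosed embodiments; a real compiler would select the prefetch distance from its own analysis.

```c
#include <stddef.h>

/* Hypothetical prefetch distance, in array elements (an assumption for
 * illustration; a compiler would tune this to the memory latency). */
#define PREFETCH_DISTANCE 16

/* Sums n doubles, issuing a software prefetch a fixed distance ahead of
 * the element currently being read. */
double sum_with_prefetch(const double *a, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* GCC/Clang builtin: address, rw (0 = read), locality (0..3).
             * It is only a hint and never changes the program's result. */
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 3);
        }
        total += a[i];
    }
    return total;
}
```

The prefetch affects only timing: the function returns the same sum with or without the hint, which is what makes it safe to filter out a prefetch that turns out to be redundant.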
Many software prefetch instructions are redundant because a processor's hardware prefetchers are often able to eliminate the same cache misses. Redundant prefetches can reduce processor performance because they consume processor resources, such as execution pipeline stages and load-store unit bandwidth, without performing useful work. However, either blindly filtering out all software prefetches or disabling all hardware prefetchers degrades performance, because some cache misses can be eliminated only by software prefetches and others only by hardware prefetchers.
Hence, it is desirable to be able to selectively eliminate redundant software prefetches without eliminating valid software prefetches.
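To make the distinction between redundant and useful software prefetches concrete, the following C sketch shows an access pattern for which a software prefetch is less likely to be redundant: an indirect gather whose addresses depend on the contents of an index array. This is an assumption-laden illustration, not part of the disclosed embodiments. Whether a particular hardware prefetcher covers a given pattern depends on the microarchitecture, and the names `gather_sum` and `GATHER_PREFETCH_DISTANCE` are hypothetical. The `__builtin_prefetch` builtin assumes a GCC- or Clang-compatible compiler.

```c
#include <stddef.h>

/* Hypothetical prefetch distance, in loop iterations (an assumption). */
#define GATHER_PREFETCH_DISTANCE 8

/* Sums data[idx[i]] for i in [0, n). The addresses touched in data[]
 * follow the values stored in idx[], so they need not form a regular
 * stride that a hardware prefetcher could predict. Every idx[i] must be
 * a valid index into data[]. */
long gather_sum(const long *data, const size_t *idx, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + GATHER_PREFETCH_DISTANCE < n) {
            /* Software computes the future address from idx[] and hints
             * it to the memory system (0 = read, 1 = low locality). */
            __builtin_prefetch(&data[idx[i + GATHER_PREFETCH_DISTANCE]], 0, 1);
        }
        total += data[idx[i]];
    }
    return total;
}
```

By contrast, a prefetch placed ahead of a simple sequential scan is a typical candidate for redundancy, since that is the kind of regular pattern hardware prefetchers are generally designed to detect.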