1. Technical Field
The present invention relates in general to data processing and, in particular, to data prefetching.
2. Description of the Related Art
As system memory latencies have increased in terms of processor clock cycles, computer architects have applied significant design effort to improvements in data caching (for handling previously used data) and data prefetching (for retrieving data in anticipation of use). Enhancements to data caching and data prefetching tend to be complementary: enhancements to data caching tend to achieve greater latency reductions for applications having significant data reuse, while enhancements to data prefetching tend to achieve greater latency reductions for applications having less data reuse.
In operation, hardware data prefetchers generally detect patterns of memory accesses forming one or more sequential address streams. A sequential address stream is defined as any sequence of memory accesses that reference a set of cache lines with monotonically increasing or decreasing addresses. The address offset between adjacent memory accesses in a particular sequential address stream is often referred to as the “stride”. Upon detecting a sequential address stream, the hardware data prefetcher prefetches up to a predetermined number of cache lines into a low-latency data cache in advance of the current demand memory access.
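By way of illustration only, the stream detection and prefetch behavior described above may be modeled in software as in the following minimal sketch. The sketch is not drawn from any particular hardware design. The class name StreamDetector, the 128-byte cache line size, the requirement that a stride repeat twice before a stream is confirmed, and the prefetch depth of four lines are all assumptions chosen for the example. For simplicity the sketch tracks a single stream with a constant stride and returns the full prefetch window on every confirming access, whereas an actual prefetcher would typically track multiple streams and issue only those lines not already requested.

```python
CACHE_LINE_BYTES = 128  # assumed cache line size
CONFIRMATIONS = 2       # assumed: times a stride must repeat to confirm a stream
PREFETCH_DEPTH = 4      # assumed "predetermined number" of lines to run ahead


class StreamDetector:
    """Hypothetical model tracking one candidate stream at cache-line granularity."""

    def __init__(self):
        self.last_line = None  # cache line index of the previous access
        self.stride = 0        # most recently observed stride, in cache lines
        self.matches = 0       # consecutive accesses that repeated that stride

    def access(self, address):
        """Record a demand access; return byte addresses of lines to prefetch."""
        line = address // CACHE_LINE_BYTES
        if self.last_line is None:
            self.last_line = line
            return []

        stride = line - self.last_line
        if stride == 0:
            # Same cache line as the previous access: nothing new to learn.
            return []

        if stride == self.stride:
            self.matches += 1
        else:
            # Stride changed: start confirming a new candidate stream.
            self.stride = stride
            self.matches = 1
        self.last_line = line

        if self.matches < CONFIRMATIONS:
            return []

        # Stream confirmed: request the next PREFETCH_DEPTH lines in the
        # direction of the stream, dropping any that fall below address zero.
        ahead = (line + k * stride for k in range(1, PREFETCH_DEPTH + 1))
        return [n * CACHE_LINE_BYTES for n in ahead if n >= 0]


detector = StreamDetector()
for addr in (0, 128, 256):
    issued = detector.access(addr)
print(issued)  # [384, 512, 640, 768]
```

In this model the third access confirms an ascending stream with a stride of one cache line, so the four lines following the demand access are requested. A descending stream is handled identically, because the stride is simply negative.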
Unfortunately, in many designs, aggressive data prefetching can exacerbate already lengthy demand memory access latencies by overwhelming memory controllers with a large number of data prefetch requests.