In the event of a cache miss, the time required for a microprocessor to access system memory can be one or two orders of magnitude greater than the time required to access the cache memory, prefetch buffers, or other storage elements within the microprocessor itself. For this reason, microprocessors reduce memory latency by incorporating prefetching techniques that examine recent data access patterns and attempt to predict which data the program will access next.
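By way of illustration only, the following minimal sketch shows one simple way a predictor might examine recent accesses and guess the next one. It assumes a constant-stride heuristic, which is only one of many possible prediction schemes and is not asserted to be the scheme of any particular microprocessor. The class name `StridePredictor`, the method `observe`, and the example addresses are hypothetical.

```python
from typing import Optional


class StridePredictor:
    """Hypothetical predictor: guesses the next address from a repeated stride."""

    def __init__(self) -> None:
        self.last_addr: Optional[int] = None
        self.last_stride: Optional[int] = None

    def observe(self, addr: int) -> Optional[int]:
        """Record one access and return a predicted next address, or None."""
        prediction = None
        if self.last_addr is not None:
            stride = addr - self.last_addr
            # Predict only when the same nonzero stride is seen twice in a row.
            if stride != 0 and stride == self.last_stride:
                prediction = addr + stride
            self.last_stride = stride
        self.last_addr = addr
        return prediction


# Assumed example: four accesses spaced 64 bytes apart.
predictor = StridePredictor()
predictions = [predictor.observe(a) for a in (0x1000, 0x1040, 0x1080, 0x10C0)]
```

In this sketch no prediction is made until the same stride has been observed twice, so the first two accesses yield no prefetch candidate and the later ones each yield the address one stride ahead.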
The benefits of prefetching are well known. However, prefetching can have harmful effects as well. For example, each prefetch request that goes out on the processor bus to memory consumes bandwidth of the bus and of the memory, either of which may already be congested. Additionally, the prefetch request may delay another request for data that is more urgently needed. As another example, if the data is prefetched into a cache memory, the prefetched cache line will typically cause the eviction of another cache line from the cache. If the evicted cache line turns out to be needed sooner and/or more often than the prefetched cache line, then the prefetch was likely detrimental to performance rather than helpful. Therefore, what is needed is an improved data prefetching mechanism.
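The eviction hazard described above may be illustrated with a small simulation. The sketch below assumes a two-line, fully associative cache with least-recently-used replacement; the names `LRUCache` and `count_misses`, the line labels, and the access trace are all hypothetical and chosen only to make the effect visible. It is a toy model, not a description of any actual cache.

```python
from collections import OrderedDict
from typing import Dict, Optional, Sequence


class LRUCache:
    """Hypothetical fully associative cache with LRU replacement."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.lines: "OrderedDict[str, bool]" = OrderedDict()

    def _insert(self, line: str) -> None:
        if line in self.lines:
            self.lines.move_to_end(line)  # mark as most recently used
            return
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line
        self.lines[line] = True

    def access(self, line: str) -> bool:
        """Demand access; returns True on a hit, False on a miss."""
        hit = line in self.lines
        self._insert(line)
        return hit

    def prefetch(self, line: str) -> None:
        """Bring a line in speculatively, possibly evicting a useful line."""
        self._insert(line)


def count_misses(trace: Sequence[str],
                 prefetch_after: Optional[Dict[str, str]] = None) -> int:
    """Count demand misses; optionally prefetch a line after certain accesses."""
    cache = LRUCache(capacity=2)
    misses = 0
    for line in trace:
        if not cache.access(line):
            misses += 1
        if prefetch_after and line in prefetch_after:
            cache.prefetch(prefetch_after[line])
    return misses


# Assumed trace: the program alternates between lines A and B and never uses C.
trace = ["A", "B", "A", "B", "A", "B"]
baseline_misses = count_misses(trace)
polluted_misses = count_misses(trace, prefetch_after={"B": "C"})
```

Under these assumptions the trace incurs only its two initial misses without prefetching, whereas prefetching the never-used line C after each access to B evicts a line that is needed next, so every one of the six demand accesses misses.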