Embodiments presented herein are related to data prefetching in a processor, and more specifically, to identifying data streams that do and do not benefit from prefetching.
Data prefetching is a technique that allows a processor to reduce stall time on data accesses. Rather than waiting for a cache miss to initiate a memory fetch, a prefetcher in the processor observes reference patterns in data streams, e.g., in a cache memory, and predicts future references based on those patterns. The prefetcher then retrieves the predicted reference data from the cache memory before the processor actually references the data. Doing so reduces memory access latency and thus increases performance of the processor.
Generally, data prefetch techniques establish streams based on predicted patterns. An initial access to an established stream is referred to as an allocation. Further, each subsequent access to that stream (i.e., an actual demand for a given cache line) is referred to as a confirmation. The prefetcher may determine whether to issue a request to prefetch data from a given stream based on the depth of the stream, i.e., the number of confirmations observed in the stream. Typically, the prefetcher may drop requests if the depth of the stream is low, e.g., the stream has no confirmations. However, one drawback to such an approach is that, once the prefetcher observes at least one confirmation, the prefetcher may assume that subsequent accesses to the stream will be to sequential cache lines and, as a result, blindly issue prefetch requests, even if the accesses do not correspond to sequential cache lines. Consequently, such superfluous prefetches may evict useful cache lines, causing future misses on those cache lines, and/or may consume more bandwidth than necessary.
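For illustration only, the depth-based approach described above and its drawback can be sketched as a small software model. The sketch is not taken from any particular processor design: the names `Stream`, `DepthGatedPrefetcher`, `min_depth`, and `issued`, as well as the assumption that each stream simply expects the next sequential cache line, are hypothetical and chosen only to make the terms allocation, confirmation, and depth concrete.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    """Hypothetical model of one tracked stream."""
    next_line: int   # cache line the stream expects to be demanded next
    depth: int = 0   # number of confirmations observed so far


class DepthGatedPrefetcher:
    """Illustrative model: issue prefetches only once depth >= min_depth."""

    def __init__(self, min_depth=1):
        self.min_depth = min_depth
        self.streams = []   # streams currently being tracked
        self.issued = []    # cache lines for which a prefetch was issued

    def access(self, line):
        """Process one demand access to cache line number `line`."""
        for stream in self.streams:
            if stream.next_line == line:
                # The demand matches the predicted line: a confirmation.
                stream.depth += 1
                stream.next_line = line + 1
                if stream.depth >= self.min_depth:
                    # Assumes the next access is sequential and prefetches it.
                    self.issued.append(stream.next_line)
                    return "confirmation, prefetch issued"
                return "confirmation, prefetch dropped"
        # No tracked stream predicted this line: allocate a new stream.
        # Depth is zero, so no prefetch request is issued yet.
        self.streams.append(Stream(next_line=line + 1))
        return "allocation"


# Lines 100 and 101 are sequential, but the program then jumps to line 250.
prefetcher = DepthGatedPrefetcher(min_depth=1)
for line in (100, 101, 250):
    print(line, prefetcher.access(line))
print("prefetched lines:", prefetcher.issued)
```

In this model, a single confirmation at line 101 is enough to trigger a prefetch of line 102, which the program never demands. That one wasted request stands in for the superfluous prefetches described above, which may evict useful cache lines or consume bandwidth unnecessarily.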