A primary factor in the utility of a computer system is the speed at which the computer system can execute an application. It is important to have instructions and data available at least as fast as the rate at which they can be executed, to prevent the computer system from idling (stalling) while it waits for the instructions and/or data to be fetched from main memory.
A widely used solution to reduce or prevent stalling is to implement a hierarchy of caches in the computer system. In essence, one or more caches are situated between the main memory and the central processing unit (CPU). The caches store recently used instructions and data on the assumption that the information might be needed again. By storing information in a hierarchical manner, the caches can reduce latency by providing information more rapidly than if the information had to be retrieved from, for example, the main memory.
The closer a cache is to the CPU, the shorter the latency between the cache and the CPU. The cache closest to the CPU is usually referred to as the level one (L1) cache, the next cache is usually referred to as the level two (L2) cache, and so on. Information most likely to be needed by the CPU, or information more recently accessed by the CPU, is stored in the L1 cache, the next tier of information is stored in the L2 cache, and so on.
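As a rough illustration of the hierarchy described above, the following sketch looks up an address level by level, nearest to the CPU first, and reports which level supplies it. The function name, the cached address sets, and the cycle counts are all hypothetical values chosen for illustration; they do not describe any particular processor.

```python
def access_latency(address, levels, memory_latency):
    """Return (source, latency) for the first level that holds the address.

    levels is a list of (name, cached_addresses, latency) tuples ordered
    from the cache closest to the CPU to the cache farthest from it.
    All latencies are illustrative assumptions, not measured values.
    """
    for name, cached_addresses, latency in levels:
        if address in cached_addresses:
            return name, latency
    # No cache holds the address, so it must come from main memory.
    return "memory", memory_latency


# Hypothetical two-level hierarchy: the L1 cache is smaller and faster.
levels = [
    ("L1", {1, 2}, 4),
    ("L2", {1, 2, 3, 4}, 12),
]

print(access_latency(1, levels, memory_latency=200))  # ('L1', 4)
print(access_latency(3, levels, memory_latency=200))  # ('L2', 12)
print(access_latency(9, levels, memory_latency=200))  # ('memory', 200)
```

The model is deliberately simplified: it charges only the latency of the level that hits and ignores the time spent probing the closer levels first.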
Latency can be further reduced by prefetching information into the caches. Prefetching involves, in essence, making a prediction of the information that may be needed by an application, and then prefetching that information from, for example, the main memory into a cache, or from one cache into a cache that is closer to the CPU (e.g., from the L2 cache to the L1 cache).
Hardware-initiated prefetching is typically based on a pattern-matching mechanism. The traffic stream (e.g., the stream of access requests for instructions or data) is monitored to try to find a pattern in the requests. If a pattern can be found, then that pattern can be used to anticipate subsequent requests for information, so that the information can be prefetched. For example, if the prefetcher determines that data has been requested from addresses 2, 4, and 6 in the L2 cache because of cache misses in the L1 cache (i.e., a pattern of every other address, corresponding to every other cache line), then the prefetcher can anticipate that the cache line at address 8 might also be needed and can prefetch that cache line.
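The pattern matching in the example above can be sketched in software as a check for a constant stride between consecutive requests. This is a minimal sketch, assuming the only pattern of interest is a fixed stride; the function names are hypothetical, and a hardware prefetcher would implement the comparison with registers and comparators rather than lists.

```python
def detect_stride(addresses):
    """Return the constant nonzero stride of the sequence, or None.

    At least three addresses are required so that two consecutive
    differences can be compared (an assumption of this sketch).
    """
    if len(addresses) < 3:
        return None
    strides = {later - earlier for earlier, later in zip(addresses, addresses[1:])}
    if len(strides) != 1:
        return None  # the differences disagree, so no single stride fits
    stride = strides.pop()
    return stride if stride != 0 else None


def predict_next(addresses):
    """Return the address expected to follow the pattern, or None."""
    stride = detect_stride(addresses)
    if stride is None:
        return None
    return addresses[-1] + stride


print(predict_next([2, 4, 6]))  # 8
print(predict_next([2, 4, 7]))  # None
```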
There is a basic tradeoff in prefetching. As noted above, prefetching can improve performance by reducing latency. On the other hand, if too much information (e.g., too many cache lines) is prefetched, then the efficiency of the prefetcher may be reduced. Furthermore, if too much information is prefetched, then the cache might become polluted with cache lines that might not actually be needed. If the cache is full, then prefetching new cache lines into the cache can cause useful lines to be prematurely evicted in order to make room for the new lines.
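The eviction risk described above can be illustrated with a small cache model. This sketch assumes a least-recently-used (LRU) replacement policy and a four-line cache, neither of which is specified in the text; the class and method names are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny fully associative cache with LRU replacement (an assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # ordered from oldest to most recently used

    def touch(self, address):
        """Insert or refresh a line; return the evicted address, if any."""
        if address in self.lines:
            self.lines.move_to_end(address)
            return None
        evicted = None
        if len(self.lines) >= self.capacity:
            # The cache is full: drop the least recently used line.
            evicted, _ = self.lines.popitem(last=False)
        self.lines[address] = True
        return evicted


cache = LRUCache(capacity=4)
for useful_line in [100, 101, 102, 103]:
    cache.touch(useful_line)

# Prefetching three speculative lines into the full cache evicts useful lines.
evicted = [cache.touch(prefetched) for prefetched in [8, 10, 12]]
print(evicted)  # [100, 101, 102]
```

If the application then requests line 100 again, the request misses, even though the line was resident before the prefetches were issued.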
The benefits and risks of prefetching can both increase as the prefetch distance is increased. The prefetch distance is a measure of how far to prefetch based on an observed pattern. If, for instance, data is fetched from addresses 2, 4, and 6 (a pattern of every other address), then data can be prefetched from address 8 if the prefetch distance is one, from addresses 8 and 10 if the prefetch distance is two, and so on. In general, the prefetch distance specifies the number of accesses projected along a pattern from a starting point in the pattern (usually, the last demand access that is a part of the pattern).
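The relationship between the prefetch distance and the addresses prefetched can be written out directly. This is a minimal sketch, assuming a constant stride and that projection starts from the last demand access; the function and parameter names are hypothetical.

```python
def prefetch_addresses(last_demand_address, stride, distance):
    """Project `distance` accesses along the pattern past the last demand access."""
    return [last_demand_address + stride * step for step in range(1, distance + 1)]


# Pattern 2, 4, 6: the stride is 2 and the last demand access is address 6.
print(prefetch_addresses(6, 2, 1))  # [8]
print(prefetch_addresses(6, 2, 2))  # [8, 10]
print(prefetch_addresses(6, 2, 0))  # []
```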
The prefetch distance can be managed using a confidence value associated with the pattern. The confidence value, in effect, is a measure of how often the pattern is observed or, equivalently, the number of elements that make up the pattern. The confidence value, and hence the prefetch distance, may initially be zero; that is, prefetching might not begin as soon as an apparent pattern is detected. Instead, prefetching might begin only if the pattern is observed repeatedly; each time the pattern is observed, the confidence value can be incremented, and the prefetch distance can be increased when the confidence value reaches a threshold. In the example above, if the pattern indeed continues as expected and ends up including addresses 8 and 10 in addition to addresses 2, 4, and 6, then the confidence value might be incremented and prefetching can begin. If the pattern continues beyond address 10, then the confidence value and consequently the prefetch distance can again be increased. In other words, if the actual pattern continues to match the predicted pattern, then the confidence value can be increased and, in turn, the prefetch distance can be increased.
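One way to picture confidence-managed prefetching is the sketch below. It assumes a single constant-stride pattern, a confidence counter that is incremented each time the stride repeats, and a prefetch distance that grows by one each time the counter reaches a multiple of a threshold. The class name, the threshold of two, and the maximum distance of four are hypothetical choices, as is resetting the confidence to zero when the pattern breaks.

```python
class StridePrefetcher:
    """Confidence-driven stride prefetcher (illustrative sketch only)."""

    def __init__(self, threshold=2, max_distance=4):
        self.threshold = threshold        # confirmations needed per distance step
        self.max_distance = max_distance  # cap on how far ahead to prefetch
        self.last_address = None
        self.stride = None
        self.confidence = 0
        self.distance = 0                 # zero means no prefetching yet

    def access(self, address):
        """Record a demand access and return the addresses to prefetch."""
        if self.last_address is not None:
            stride = address - self.last_address
            if stride != 0 and stride == self.stride:
                # The pattern was observed again: raise the confidence.
                self.confidence += 1
                if (self.confidence % self.threshold == 0
                        and self.distance < self.max_distance):
                    self.distance += 1
            else:
                # The pattern broke: start over with the new stride.
                self.stride = stride
                self.confidence = 0
                self.distance = 0
        self.last_address = address
        if self.stride is None or self.distance == 0:
            return []
        return [address + self.stride * step
                for step in range(1, self.distance + 1)]


prefetcher = StridePrefetcher(threshold=2, max_distance=4)
for address in [2, 4, 6, 8, 10, 12]:
    print(address, prefetcher.access(address))
```

With these assumed settings, nothing is prefetched for addresses 2, 4, and 6; prefetching begins with a distance of one at address 8, and the distance grows to two at address 12, once the pattern has been confirmed two more times.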