Many processors implement prefetching, which aims to bring data that will be used in the near future into the processor's cache memory system before it is actually needed. This reduces the time the processor's pipeline spends stalled waiting for data to arrive from memory. By bringing the data closer to the processor's core, prefetching can reduce average memory access latency and improve cache hit rates. However, it is difficult to know ahead of time which data will be useful to prefetch. If applied inefficiently, prefetching can waste memory bandwidth and cache resources on data that is never used.
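The trade-off described above can be illustrated with a small simulation. The following is a minimal sketch, not a model of any particular processor: it assumes a tiny fully associative LRU cache, a 64-byte line size, and a simple next-line prefetcher, and all names (`Cache`, `simulate`, `LINE_SIZE`) are hypothetical. It counts demand hits and misses and how many prefetched lines were actually used.

```python
from collections import OrderedDict
import random

LINE_SIZE = 64  # bytes per cache line (assumed for illustration)


class Cache:
    """Tiny fully associative LRU cache tracked at cache-line granularity."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        # Maps line number -> True while the line was prefetched but not yet used.
        self.lines = OrderedDict()

    def access(self, line):
        """Demand access. Returns True on a hit, False on a miss (line is filled)."""
        if line in self.lines:
            self.lines[line] = False  # the line has now been used
            self.lines.move_to_end(line)
            return True
        self._fill(line, prefetched=False)
        return False

    def prefetch(self, line):
        """Prefetch a line. Returns True if a memory fetch had to be issued."""
        if line in self.lines:
            return False
        self._fill(line, prefetched=True)
        return True

    def _fill(self, line, prefetched):
        if len(self.lines) >= self.num_lines:
            self.lines.popitem(last=False)  # evict the least recently used line
        self.lines[line] = prefetched


def simulate(addresses, use_prefetch, num_lines=64):
    """Replay byte addresses, optionally prefetching the next line after each access."""
    cache = Cache(num_lines)
    stats = {"hits": 0, "misses": 0, "prefetches": 0, "useful_prefetches": 0}
    for addr in addresses:
        line = addr // LINE_SIZE
        was_prefetched = cache.lines.get(line, False)
        if cache.access(line):
            stats["hits"] += 1
            if was_prefetched:
                stats["useful_prefetches"] += 1
        else:
            stats["misses"] += 1
        if use_prefetch and cache.prefetch(line + 1):
            stats["prefetches"] += 1
    return stats


def sequential_trace(n):
    """One access per consecutive cache line."""
    return [i * LINE_SIZE for i in range(n)]


def random_trace(n, seed=0):
    """Accesses scattered over a large address space (fixed seed for repeatability)."""
    rng = random.Random(seed)
    return [rng.randrange(1_000_000) * LINE_SIZE for _ in range(n)]


if __name__ == "__main__":
    for name, trace in (("sequential", sequential_trace(1000)),
                        ("random", random_trace(1000))):
        for use_prefetch in (False, True):
            print(name, "prefetch" if use_prefetch else "no prefetch",
                  simulate(trace, use_prefetch))
```

Under these assumptions, the sequential trace shows prefetching turning almost every demand access into a hit, while the random trace shows nearly all prefetched lines going unused, each one costing a memory fetch and a cache slot.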
Conventional prefetch mechanisms typically issue prefetch requests as low-priority demand requests, which are processed by all levels of the cache hierarchy. These requests result in a main memory transaction if the data is not present in the cache system.
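The conventional flow described above can be sketched as a walk through the cache levels. This is an illustrative model only: the class names (`CacheLevel`, `Hierarchy`), the three-level layout, and the fill policy (copying the line into every level above the one that supplied it) are assumptions, and the low-priority arbitration of the request is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class CacheLevel:
    """One level of the hierarchy; capacity and eviction are not modeled."""
    name: str
    lines: set = field(default_factory=set)
    lookups: int = 0

    def lookup(self, line):
        self.lookups += 1
        return line in self.lines


@dataclass
class Hierarchy:
    """Cache levels ordered from closest to the core to farthest."""
    levels: list
    memory_transactions: int = 0

    def prefetch(self, line):
        """Handle a prefetch like a demand request: probe each level in order.

        Returns the name of the level that supplied the line, or "memory"
        if every level missed and a main memory transaction was needed.
        """
        for depth, level in enumerate(self.levels):
            if level.lookup(line):
                self._fill(line, upto=depth)
                return level.name
        self.memory_transactions += 1
        self._fill(line, upto=len(self.levels))
        return "memory"

    def _fill(self, line, upto):
        # Assumed policy: install the line in every level above the supplier.
        for level in self.levels[:upto]:
            level.lines.add(line)


if __name__ == "__main__":
    h = Hierarchy([CacheLevel("L1"), CacheLevel("L2", {7}), CacheLevel("L3")])
    print(h.prefetch(7))   # line 7 is found in L2
    print(h.prefetch(9))   # line 9 misses everywhere and goes to memory
    print(h.memory_transactions)
```

In this sketch every prefetch costs a lookup at each level it passes through, and a prefetch for data that is absent from all levels always ends in a memory transaction, whether or not the data is later used.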