In computer systems, a central processing unit (CPU) and the memory upon which it relies may operate at different speeds, leading to memory latency issues. Latency issues may be mitigated using, for example, cache memories, pre-fetching apparatus, and so on. Latency issues may be exacerbated in multi-core systems (e.g., simultaneous multithreading (SMT) systems, chip-level multiprocessing (CMP) systems).
Data pre-fetching seeks to improve processor performance by predicting the memory location of required data and bringing the data to the CPU ahead of time, avoiding delays caused by waiting for a fetch from memory to complete. The efficiency of a pre-fetching apparatus may depend on attributes including, for example, the precision of a pre-fetch prediction for a next data location, whether data is pre-fetched "just-in-time" for on-demand load/store usage, and the resolution of conflicts in processor resources between pre-fetch requests and regular requests.
An instruction pointer based pre-fetcher (IPP) performs data pre-fetching based on the instruction pointer of the memory-accessing instruction. Some conventional pre-fetchers always pre-fetch (e.g., pre-fetch without history considerations). Other conventional pre-fetchers examine a simple history of locality based patterns to pre-fetch a next cache line and/or calculate history based on constant differences (strides) in a data access pattern.
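The stride based history tracking described above can be illustrated with a minimal sketch. The table layout, field names, and confidence scheme below are illustrative assumptions, not the actual design of any particular pre-fetcher: each entry is indexed by instruction pointer and records the last address and last observed stride, issuing a prediction only once a constant stride repeats.

```python
class StridePrefetcher:
    """Illustrative instruction-pointer-indexed stride pre-fetcher."""

    def __init__(self):
        # Maps instruction pointer -> (last address, last stride, confident?)
        self.table = {}

    def access(self, ip, addr):
        """Record a load at `ip` touching `addr`; return a predicted
        pre-fetch address, or None if no stable stride is seen yet."""
        last_addr, last_stride, confident = self.table.get(ip, (None, None, False))
        prediction = None
        if last_addr is not None:
            stride = addr - last_addr
            # A repeated non-zero stride establishes confidence and
            # yields a prediction one stride ahead of the current access.
            if stride == last_stride and stride != 0:
                prediction = addr + stride
                confident = True
            else:
                confident = False
            self.table[ip] = (addr, stride, confident)
        else:
            # First access from this instruction pointer: no stride yet.
            self.table[ip] = (addr, None, False)
        return prediction


pf = StridePrefetcher()
# A load at IP 0x400 walking an array with a constant 64-byte stride:
hint = None
for a in (0x1000, 0x1040, 0x1080):
    hint = pf.access(0x400, a)
print(hex(hint))  # 0x10c0: one stride ahead, once the pattern repeats
```

A pre-fetcher that "always pre-fetches" would instead issue a request (e.g., for the next cache line) on every access; the history table here is what lets the stride variant stay silent on irregular access patterns.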