1. Field of the Invention
The invention relates to prefetch techniques and, in particular, to automatic prefetch based on detection by a processor of likely pointer values.
2. Description of the Related Art
Computer systems typically include, amongst other things, a memory system and one or more processors and/or execution units. The memory system serves as a repository of information, while a processor reads information from the memory system, operates on it, and stores it back. As processor speeds and sizes of memory systems have increased, the mismatch between the ability of the processor to address arbitrary stored information and the ability of the memory system to quickly provide it has increased. To address this mismatch, memory systems are typically organized as a hierarchy using caching techniques that are well understood in the art.
In general, caches can be used to reduce average latency when accessing (e.g., reading or writing) main memory. A cache is typically a small, specially configured, high-speed memory that holds a copy of a small portion of the information stored in main memory. By placing the cache (small, relatively fast, expensive memory) between main memory (large, relatively slow memory) and the processor, the memory system as a whole is able to satisfy a substantial number of requests from the processor at the speed of the cache, thereby reducing the overall latency of the system. Some systems may define multiple levels of cache.
When the data requested by the processor is in the cache (a “hit”), the request is satisfied at the speed of the cache. However, when the data requested by the processor is not in the cache (a “miss”), the processor must wait until the data is provided from the slower main memory, resulting in greater latency. Typically, useful work is stalled while data is supplied from main memory. As is well known in the art, the frequency of cache misses is much higher in some applications or execution runs than in others. In particular, accesses for some database systems tend to miss in the cache with higher frequency than some scientific or engineering applications. In general, such variation in cache miss frequencies can be traced to differing spatial and temporal locality characteristics of the memory access sequences.
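By way of illustration only, the effect of locality on miss frequency can be sketched with a small simulation. The sketch below assumes a direct-mapped cache of 64 lines with 64-byte lines and compares a sequential scan against uniformly random accesses. The cache geometry, the address ranges, and the function name `miss_rate` are hypothetical choices made for the example and do not describe any particular processor.

```python
import random

def miss_rate(addresses, num_lines=64, line_size=64):
    """Simulate a direct-mapped cache; return the fraction of accesses that miss."""
    tags = [None] * num_lines          # block number currently held by each cache line
    misses = 0
    for addr in addresses:
        block = addr // line_size      # memory block containing the address
        index = block % num_lines      # cache line that the block maps to
        if tags[index] != block:       # miss: the block must come from main memory
            tags[index] = block
            misses += 1
    return misses / len(addresses)

# Sequential scan over 8-byte elements: high spatial locality.
sequential = [i * 8 for i in range(10_000)]

# Uniformly random accesses over a 16 MiB region: little locality.
rng = random.Random(0)
scattered = [rng.randrange(0, 16 * 1024 * 1024) for _ in range(10_000)]

sequential_rate = miss_rate(sequential)
scattered_rate = miss_rate(scattered)
```

In this model the sequential scan misses only on the first access to each 64-byte line, whereas nearly every random access misses, which is the kind of variation attributed above to differing locality characteristics.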
In some applications, particularly those characterized by array accesses, hardware techniques can be employed to predict subsequent accesses. Stride prediction techniques and associated hardware prefetch strategies are one such example. However, in many applications, it is difficult for hardware to discern and predict memory access sequences and software techniques may be alternatively or additionally employed. For example, to increase the likelihood of cache hits and thereby improve apparent memory access latency, some computer systems define instructions for prefetching data from memory to cache. The assumption is that software (e.g., either the programmer or a compiler) may be in a reasonable position to identify prefetch opportunities.
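As a minimal sketch of the stride prediction idea mentioned above, the following model predicts the next address whenever the two most recent address differences are equal. It is a simplified software model under stated assumptions, not a description of any actual hardware prefetcher; the names `stride_predictions` and `coverage` and the sample address sequences are hypothetical.

```python
def stride_predictions(addresses):
    """For each access, return the address a simple stride predictor
    would prefetch next, or None when no stable stride has been seen."""
    last_addr = None
    last_stride = None
    predictions = []
    for addr in addresses:
        prediction = None
        if last_addr is not None:
            stride = addr - last_addr
            # Predict only when the same nonzero stride is seen twice in a row.
            if stride == last_stride and stride != 0:
                prediction = addr + stride
            last_stride = stride
        predictions.append(prediction)
        last_addr = addr
    return predictions

def coverage(addresses):
    """Fraction of accesses (after the first) that were correctly predicted."""
    preds = stride_predictions(addresses)
    correct = sum(1 for p, nxt in zip(preds, addresses[1:]) if p == nxt)
    return correct / (len(addresses) - 1)

# Array traversal: constant 8-byte stride.
array_trace = [1000 + 8 * i for i in range(100)]

# Irregular trace with no repeating stride.
irregular_trace = [4096, 128, 9000, 520, 77000, 64, 31000, 2048]

array_coverage = coverage(array_trace)
irregular_coverage = coverage(irregular_trace)
```

The array trace is predicted almost entirely once the stride has been observed twice, while the irregular trace yields no correct predictions, which is consistent with the observation that such hardware techniques suit array accesses.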
Unfortunately, for certain classes of applications, conventional hardware and software prefetch techniques are not particularly effective. For example, in some applications, performance is dictated by how well a processor can access data represented in data structures that are traversed using pointers. Particularly in complex data structures for which component objects are dynamically allocated and freed throughout execution and for which accesses do not exhibit a high degree of spatial and temporal locality, access patterns may be difficult for conventional techniques to discern. Data structures that are typically employed in relational database systems often present such prefetch challenges.
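The difficulty can be illustrated with a hedged sketch of a pointer-linked traversal. The example assumes a hypothetical heap in which 32-byte list nodes are placed at shuffled locations, as might result from repeated allocation and freeing. Memory is modeled as a mapping from each node address to the address of the next node. The node size, heap size, and helper names are assumptions made for illustration only.

```python
import random

NODE_SIZE = 32  # hypothetical node size in bytes

def build_list(num_nodes, heap_slots, seed=0):
    """Place list nodes at shuffled heap slots. Return (head, memory), where
    memory maps a node address to the next node's address (None at the tail)."""
    rng = random.Random(seed)
    slots = rng.sample(range(heap_slots), num_nodes)   # distinct, unordered slots
    addrs = [slot * NODE_SIZE for slot in slots]
    memory = {a: nxt for a, nxt in zip(addrs, addrs[1:] + [None])}
    return addrs[0], memory

def traverse(head, memory):
    """Follow next pointers from head and return the sequence of addresses visited."""
    trace = []
    addr = head
    while addr is not None:
        trace.append(addr)
        addr = memory[addr]   # the next address is known only after this load
    return trace

head, memory = build_list(1000, 100_000)
trace = traverse(head, memory)

# Differences between successive node addresses, and how often one repeats.
strides = [b - a for a, b in zip(trace, trace[1:])]
repeated = sum(1 for s, t in zip(strides, strides[1:]) if s == t)
```

In this model successive node addresses almost never differ by a repeating amount, so a predictor based on address arithmetic has little to work with. The address of each subsequent node becomes available only as a pointer value returned by the preceding load.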