In an effort to improve and optimize performance of processor systems, many different prefetching techniques (i.e., anticipating the need for data input requests) are used to remove or “hide” latency (i.e., delay) of processor systems. In particular, prefetch algorithms (i.e., pre-execution or pre-computation) are used to prefetch data for cache misses associated with data addresses that are difficult to predict during compile time. That is, a compiler first identifies the instructions needed to generate data addresses of the cache misses, and then speculatively pre-executes those instructions.
In general, linked data structures (LDSs) are collections of software objects that are traversed using pointers found in the preceding object(s). Traversing an LDS may result in a high latency cache miss on each object in the LDS. Because an address of an object in the LDS is loaded from the preceding object before the object itself may be loaded, cache misses may be unavoidable. On the other hand, when accessing a data array structure where the address of subsequent objects may be calculated from the base of the data array structure, loops may be unrolled and techniques such as stride prefetching may be performed to avoid cache misses while iterating through the data array structure. These techniques assume that the address of subsequent objects may be calculated using the base of the data array structure. However, most LDSs do not have layout properties that may be exploited by stride prefetching techniques. Further, the gap between processor and memory speeds continues to increase. As a result, managed runtime environments (MRTEs) may encounter difficulties when attempting to insert prefetch instructions properly to reduce latencies while traversing LDSs.