Shared-memory multiprocessor (SMP) systems are typically built around a number of high-performance out-of-order superscalar processors, each of which employs aggressive branch prediction techniques to achieve a high issue rate. During program execution, these processors speculatively execute the instructions following the target of a predicted branch instruction. When a branch is mispredicted, the processor must restore its state to the state that existed prior to the mispredicted branch before it can start executing instructions down the correct path. However, during speculative execution, i.e., before the branch outcome is known, the processor issues and executes many memory references down the wrong path. Although these wrong-path memory references are not allowed to change the processor's architectural state, they do change the data and instructions that are in the memory system, which can affect the processor's performance.
Previous analyses have studied the effects that speculatively executed memory references have on the performance of out-of-order superscalar processors. Wrong-path memory references may function as indirect prefetches by bringing data into the cache that are needed later by instructions on the correct execution path. Unfortunately, these wrong-path memory references also increase memory traffic (i.e., they consume more bandwidth) and can pollute the cache with cache blocks that are not referenced by instructions on the correct path. Of these two negative effects, cache pollution, particularly in the L2 cache, is the dominant one.
There is a need for a more efficient and economical cache memory system, and in particular, a cache memory system that retains the positive effects of prefetching, but improves the performance of an SMP system without significantly increasing the complexity of the memory subsystem.
In this study, we propose an enhancement that aims to minimize the negative effects of wrong-path memory references, while retaining their positive effects (i.e., prefetching), to improve the performance of an SMP system without significantly increasing the complexity of the memory subsystem. Specifically, we propose and evaluate a cache replacement policy that is wrong-path aware. For this purpose, we add a field to each cache line to indicate whether that cache line was brought into the cache by an instruction on the correct path or on the wrong path. When a cache block must be evicted from a set, the policy evicts the oldest wrong-path cache block. Our results show that this simple mechanism can significantly reduce the negative impact that wrong-path memory accesses have on the performance of SMP systems.
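As a rough illustration of the replacement policy described above, the following Python sketch models a single cache set. It is a minimal sketch under stated assumptions, not the evaluated implementation. The names `CacheLine`, `WrongPathAwareSet`, `access`, and `choose_victim` are hypothetical. The sketch further assumes that "oldest" means least recently used, that the set falls back to plain LRU when it holds no wrong-path block, and that a correct-path hit on a wrong-path block clears its flag; none of these three details is specified in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CacheLine:
    tag: int
    wrong_path: bool  # the added per-line field: filled by a wrong-path reference?
    last_used: int    # logical time of the last access (assumed LRU-style age)


class WrongPathAwareSet:
    """One set of a set-associative cache (hypothetical model)."""

    def __init__(self, ways: int) -> None:
        self.ways = ways
        self.lines: List[CacheLine] = []
        self.clock = 0

    def choose_victim(self) -> CacheLine:
        # Prefer wrong-path blocks; among them, take the oldest one.
        wrong = [line for line in self.lines if line.wrong_path]
        # Assumption: with no wrong-path block present, fall back to plain LRU.
        candidates = wrong if wrong else self.lines
        return min(candidates, key=lambda line: line.last_used)

    def access(self, tag: int, wrong_path: bool) -> bool:
        """Return True on a hit; on a miss, fill the block and return False."""
        self.clock += 1
        for line in self.lines:
            if line.tag == tag:
                line.last_used = self.clock
                # Assumption: a correct-path hit means the block acted as a
                # useful prefetch, so it is no longer treated as wrong-path.
                if not wrong_path:
                    line.wrong_path = False
                return True
        if len(self.lines) >= self.ways:
            self.lines.remove(self.choose_victim())
        self.lines.append(CacheLine(tag, wrong_path, self.clock))
        return False
```

In a 2-way set that holds a correct-path block followed by a newer wrong-path block, the next miss evicts the wrong-path block even though the correct-path block is older. This is the behavior that distinguishes the policy from plain LRU.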