Latency associated with system memory access may be reduced by placing a certain amount of data in a cache memory. Data stored in a cache memory may be accessed by a processor more quickly the next time that data is requested. High-performance processor architectures, however, have been shifting toward designs that feature multiple processing cores, each capable of executing multiple independent threads simultaneously. As a result, shared resources, such as a least recently used (LRU) list for a cache memory, may become less effective at reducing latency due to input/output (I/O) bottlenecks arising from one or more of these increased processing capabilities, for example when many threads access the same shared resource at once.
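The role of a shared LRU list may be illustrated with a minimal sketch. The class name `SharedLRUCache`, the `load_from_memory` callback, and the use of a single lock are assumptions made for illustration only and are not taken from any particular design; the sketch simply shows that every access, including a cache hit, passes through one shared structure.

```python
import threading
from collections import OrderedDict


class SharedLRUCache:
    """Toy cache whose single LRU list is shared by every thread (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()  # key order doubles as the LRU list
        self._lock = threading.Lock()  # hypothetical: one lock guards the whole list

    def get(self, key, load_from_memory):
        # Even a hit must take the shared lock to move the key to the
        # most-recently-used end, so concurrent threads are serialized here.
        with self._lock:
            if key in self._entries:
                self._entries.move_to_end(key)
                return self._entries[key]

        # Miss: fetch from the slower backing store outside the lock.
        value = load_from_memory(key)

        with self._lock:
            self._entries[key] = value
            self._entries.move_to_end(key)
            # Evict from the least-recently-used end when over capacity.
            if len(self._entries) > self.capacity:
                self._entries.popitem(last=False)
        return value
```

In this sketch, a call such as `cache.get(address, read_system_memory)` (with `read_system_memory` a hypothetical loader) returns quickly on a hit, yet each call still acquires the same lock. With many cores and threads, that single point of coordination is one way a shared LRU list may become a bottleneck.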