The present application relates in general to data processing and, more particularly, to data caching in a data processing system.
A conventional symmetric multiprocessor (SMP) computer system, such as a server computer system, includes multiple processing units all coupled to a system interconnect, which typically comprises one or more address, data and control buses. Coupled to the system interconnect is system memory, which represents the lowest level of directly addressable memory in the multiprocessor computer system and generally is accessible for read and write access by all processing units. In order to reduce access latency to instructions and data residing in the system memory, each processing unit is typically further supported by a respective multi-level cache hierarchy, the lower level(s) of which may be shared by one or more processor cores.
Typically, when a congruence class of a set-associative cache becomes full, a victim cache line is selected for removal from the congruence class and the contents of the cache line are evicted to make room for a new cache line. The evicted cache line may then be discarded or written to a lower-level cache or system memory. Because cache accesses tend to exhibit temporal locality, the victim cache line is often selected based on which cache line of the congruence class has been least recently accessed, that is, using a least recently used (LRU) algorithm.
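The LRU victim selection described above can be illustrated with a minimal software sketch. It is only a model for exposition and not a description of any particular hardware implementation: the class name `SetAssociativeCache`, the 128-byte line size, the number of congruence classes, the associativity, and the modulo-based index and tag mapping are all assumptions chosen for the example.

```python
from collections import OrderedDict

# All sizes below are illustrative assumptions, not values from the text.
LINE_SIZE = 128   # bytes per cache line
NUM_SETS = 4      # number of congruence classes
WAYS = 2          # associativity (lines per congruence class)


class SetAssociativeCache:
    """Toy model of a set-associative cache with per-class LRU replacement."""

    def __init__(self, num_sets=NUM_SETS, ways=WAYS, line_size=LINE_SIZE):
        self.num_sets = num_sets
        self.ways = ways
        self.line_size = line_size
        # One OrderedDict per congruence class, ordered from least
        # recently used (front) to most recently used (back).
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, address):
        """Access an address; return (hit, evicted_line_address_or_None)."""
        line = address // self.line_size
        index = line % self.num_sets      # selects the congruence class
        tag = line // self.num_sets       # identifies the line within it
        cclass = self.sets[index]

        if tag in cclass:
            # Hit: mark the line as most recently used.
            cclass.move_to_end(tag)
            return True, None

        victim = None
        if len(cclass) >= self.ways:
            # Congruence class is full: evict the least recently used line.
            victim_tag, _ = cclass.popitem(last=False)
            victim = (victim_tag * self.num_sets + index) * self.line_size

        # Install the new line as most recently used.
        cclass[tag] = True
        return False, victim


if __name__ == "__main__":
    cache = SetAssociativeCache()
    # Addresses 0, 512 and 1024 all map to congruence class 0 here.
    for addr in (0, 512, 0, 1024):
        print(addr, cache.access(addr))
```

In this sketch, addresses 0, 512 and 1024 compete for the same two-way congruence class. Because address 0 is re-accessed before 1024 arrives, the line at 512 is the least recently used and is chosen as the victim. A real cache would then discard the victim or write it to a lower-level cache or system memory, a step the model leaves out.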