1. Technical Field
The present invention relates in general to data processing and, in particular, to data processing in a cache coherent data processing system.
2. Description of the Related Art
A conventional symmetric multiprocessor (SMP) computer system, such as a server computer system, includes multiple processing units all coupled to a system interconnect, which typically comprises one or more address, data and control buses. Coupled to the system interconnect is a system memory, which represents the lowest level of volatile memory in the multiprocessor computer system and which generally is accessible for read and write access by all processing units. In order to reduce access latency to instructions and data residing in the system memory, each processing unit is typically further supported by a respective multi-level cache hierarchy, the lower level(s) of which may be shared by one or more processor cores.
Because multiple processor cores may request write access to a same cache line of data and because modified cache lines are not immediately synchronized with system memory, the cache hierarchies of multiprocessor computer systems typically implement a cache coherency protocol to ensure at least a minimum level of coherence among the various processor core's “views” of the contents of system memory. In particular, cache coherency requires, at a minimum, that after a processing unit accesses a copy of a memory block and subsequently accesses an updated copy of the memory block, the processing unit cannot again access the old copy of the memory block.
A cache coherency protocol typically defines a set of coherency states stored in association with the cache lines of each cache hierarchy, as well as a set of coherency messages utilized to communicate the cache state information between cache hierarchies. In a typical implementation, the coherency state information takes the form of the well-known MESI (Modified, Exclusive, Shared, Invalid) protocol or a variant thereof, and the coherency messages indicate a protocol-defined coherency state transition in the cache hierarchy of the requester and/or the recipients of a memory access request.
In conventional multi-processor data processing systems, all levels of cache memory within a cache memory hierarchy are examined to determine their coherency state in response to a memory access request before an operation requesting a memory block is broadcast to other cache hierarchies in the data processing system. The present invention recognizes that this practice increases the access latency for the subset of memory access requests that miss in all levels of the cache hierarchy.