1. Technical Field
The present invention relates in general to the field of data processing systems. More particularly, the present invention relates to a cache coherency protocol for a data processing system.
2. Description of the Related Art
A conventional symmetric multiprocessor (SMP) computer system, such as a server computer system, includes multiple processing units all coupled to a system interconnect, which typically comprises one or more address, data and control buses. Coupled to the system interconnect is a system memory, which represents the lowest level of volatile memory in the multiprocessor computer system and which generally is accessible for read and write access by all processing units. In order to reduce access latency to instructions and data residing in the system memory, each processing unit is typically further supported by a respective multi-level cache hierarchy, the lower level(s) of which may be shared by one or more processor cores.
Because multiple processor cores may request write access to the same cache line of data and because modified cache lines are not immediately synchronized with system memory, the cache hierarchies of multiprocessor computer systems typically implement a cache coherency protocol to ensure at least a minimum level of coherence among the various processor cores' “views” of the contents of system memory. In particular, cache coherency requires, at a minimum, that after a processing unit accesses a copy of a memory block and subsequently accesses an updated copy of the memory block, the processing unit cannot again access the old copy of the memory block.
A cache coherency protocol typically defines a set of cache states stored in association with the cache lines of each cache hierarchy, as well as a set of coherency messages utilized to communicate the cache state information between cache hierarchies. In a typical implementation, the cache state information takes the form of the well-known MESI (Modified, Exclusive, Shared, Invalid) protocol or a variant thereof, and the coherency messages indicate a protocol-defined coherency state transition in the cache hierarchy of the requestor and/or the recipients of a memory access request.
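By way of illustration only, the state transitions of a basic MESI protocol can be sketched as follows for a single cache line tracked across several caches. This is a minimal, simplified model and not an implementation drawn from any particular system: the names State, read and write are hypothetical, the coherency messages are implied by the function calls rather than modeled explicitly, and write-back of modified data to system memory is noted only in a comment.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def read(states, requester):
    """Return the per-cache states of one line after `requester` reads it."""
    if states[requester] is not State.INVALID:
        return states  # read hit: no state change is needed
    new = list(states)
    others_hold = False
    for i, s in enumerate(states):
        if i != requester and s is not State.INVALID:
            others_hold = True
            # A Modified or Exclusive holder downgrades to Shared
            # (a Modified holder would also supply or write back the data).
            new[i] = State.SHARED
    new[requester] = State.SHARED if others_hold else State.EXCLUSIVE
    return new


def write(states, requester):
    """Return the per-cache states of one line after `requester` writes it."""
    # All other copies are invalidated; the writer holds the only valid copy.
    new = [State.INVALID] * len(states)
    new[requester] = State.MODIFIED
    return new


# Example: three caches, none of which initially holds the line.
line = [State.INVALID] * 3
line = read(line, 0)   # cache 0 becomes Exclusive
line = read(line, 1)   # caches 0 and 1 become Shared
line = write(line, 1)  # cache 1 becomes Modified, cache 0 is invalidated
```

In this sketch, the write by the second cache leaves the first cache holding an Invalid copy, so a later read by the first cache must obtain the line again from elsewhere in the system.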
In a multiprocessor computer system, processing units often update cache lines stored in their local caches. Currently, once a first processing unit updates a cache line in its local cache, other processing units holding a copy of the cache line mark their copies as needing to be updated. Some protocols have been proposed and implemented in which a cache controller watches system data transactions and grabs an updated copy of the cache line if the cache line happens to pass by the cache controller on the interconnect.
In today's hierarchical system structures, however, there is no guarantee that the updated cache line will become visible to all interested caches in time to avoid latency penalties. Other protocols involve a system-wide broadcast of the cache line after an update so that any caches that require a new copy of the cache line are guaranteed to see the updated cache line. However, these protocols are inefficient because the system-wide broadcasts of the updated cache line consume unnecessary bandwidth on the interconnect.
Therefore, because of the aforementioned limitations of the prior art, there is a need for a system and method of efficiently providing updated cache lines to caches that require a cache line update without consuming unnecessary bandwidth within a data processing system.