The present invention relates to data processing, and more specifically, to a coherent proxy for an attached processor.
A conventional distributed shared memory computer system, such as a server computer system, includes multiple processing units all coupled to a system interconnect, which typically comprises one or more address, data and control buses. Coupled to the system interconnect is a system memory, which represents the lowest level of volatile memory in the multiprocessor computer system and generally is accessible for read and write access by all processing units. In order to reduce access latency to instructions and data residing in the system memory, each processing unit is typically further supported by a respective multi-level cache hierarchy, the lower level(s) of which may be shared by one or more processor cores.
Because multiple processor cores may request write access to a same memory block (e.g., cache line or sector) and because cached memory blocks that are modified are not immediately synchronized with system memory, the cache hierarchies of multiprocessor computer systems typically implement a cache coherency protocol to ensure at least a minimum required level of coherence among the various processor core's “views” of the contents of system memory. The minimum required level of coherence is determined by the selected memory consistency model, which defines rules for the apparent ordering and visibility of updates to the distributed shared memory. In all memory consistency models in the continuum between weak consistency models and strong consistency models, cache coherency requires, at a minimum, that after a processing unit accesses a copy of a memory block and subsequently accesses an updated copy of the memory block, the processing unit cannot again access the old (“stale”) copy of the memory block.
A cache coherency protocol typically defines a set of cache states stored in association with cached copies of memory blocks, as well as the events triggering transitions between the cache states and the cache states to which transitions are made. Coherency protocols can generally be classified as directory-based or snoop-based protocols. In directory-based protocols, a common central directory maintains coherence by controlling accesses to memory blocks by the caches and by updating or invalidating copies of the memory blocks held in the various caches. Snoop-based protocols, on the other hand, implement a distributed design paradigm in which each cache maintains a private directory of its contents, monitors (“snoops”) the system interconnect for memory access requests targeting memory blocks held in the cache, and responds to the memory access requests by updating its private directory, and if required, by transmitting coherency message(s) and/or its copy of the memory block.
The cache states of the coherency protocol can include, for example, those of the well-known MESI (Modified, Exclusive, Shared, Invalid) protocol or a variant thereof. The MESI protocol allows a cache line of data to be tagged with one of four states: “M” (Modified), “E” (Exclusive), “S” (Shared), or “I” (Invalid). The Modified state indicates that a memory block is valid only in the cache holding the Modified memory block and that the memory block is not consistent with system memory. The Exclusive state indicates that the associated memory block is consistent with system memory and that the associated cache is the only cache in the data processing system that holds the associated memory block. The Shared state indicates that the associated memory block is resident in the associated cache and possibly one or more other caches and that all of the copies of the memory block are consistent with system memory. Finally, the Invalid state indicates that the data and address tag associated with a coherency granule are both invalid.