1. Field of the Invention
This invention is related to the field of computer systems and, more particularly, to coherence mechanisms in computer systems.
2. Description of the Related Art
Historically, shared memory multiprocessing systems have implemented hardware coherence mechanisms. The hardware coherence mechanisms ensure that updates (stores) to memory locations by one processor (or one process, which may be executed on different processors at different points in time) are consistently observed by all other processors that read (load) the updated memory locations according to a specified ordering model. Implementing coherence may aid the correct and predictable operation of software in a multiprocessing system. While hardware coherence mechanisms simplify the software that executes on the system, the hardware coherence mechanisms may be complex and expensive to implement (especially in terms of design time). Additionally, if errors in the hardware coherence implementation are found, repairing the errors may be costly (if repaired via hardware modification) or limited (if software workarounds are used).
In order to limit the potential for error, computer systems have typically implemented deterministic coherence mechanisms. For example, one coherence mechanism frequently used in a distributed shared memory system (in which nodes are coupled together to form a system, with each node having a local memory that is part of the overall system memory) is directory based. In a directory-based coherence mechanism, each node tracks the coherence state of coherence units in its local memory that are being shared by other nodes (e.g. other nodes may have shared or modified copies of the coherence units). When a request is received for a given coherence unit, the directory entry corresponding to that coherence unit is blocked so that other requests to the same coherence unit cannot be started until the current request completes. This simplifies the mechanism, since coherence activity for more than one request to a given coherence unit does not overlap. However, performance suffers because requests to the same coherence unit are serialized, especially for coherence units that are heavily shared or contended for in the system. Generally, a coherence unit may be any block of data that is treated as a unit for coherence purposes. In many cases, a coherence unit is the same as a cache line, although coherence units may be smaller or larger than a cache line in various embodiments.
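The blocking behavior described above may be illustrated by the following minimal sketch, written in Python for readability. It is not taken from any particular system: the names `Directory`, `DirectoryEntry`, `request`, and `complete`, the three-state model, and the use of a per-entry queue for deferred requests are all assumptions made for illustration. An actual hardware directory would also exchange messages with the sharing nodes, which the sketch omits.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    """Hypothetical coherence states tracked by the home node."""
    INVALID = "invalid"    # no other node holds a copy
    SHARED = "shared"      # one or more nodes hold read-only copies
    MODIFIED = "modified"  # exactly one node holds a writable copy


@dataclass
class DirectoryEntry:
    state: State = State.INVALID
    sharers: set = field(default_factory=set)
    blocked: bool = False                         # a request is in flight
    waiting: deque = field(default_factory=deque)  # deferred requests


class Directory:
    """Directory for the coherence units in one node's local memory."""

    def __init__(self):
        self.entries = {}

    def request(self, unit, node, write):
        """Start a request; return False if it had to be deferred."""
        entry = self.entries.setdefault(unit, DirectoryEntry())
        if entry.blocked:
            # Another request to the same coherence unit is in flight,
            # so this one may not start until that request completes.
            entry.waiting.append((node, write))
            return False
        entry.blocked = True
        self._apply(entry, node, write)
        return True

    def _apply(self, entry, node, write):
        # Simplified state update; invalidations and data transfers
        # between nodes are not modeled.
        if write:
            entry.state = State.MODIFIED
            entry.sharers = {node}
        else:
            entry.state = State.SHARED
            entry.sharers.add(node)

    def complete(self, unit):
        """Finish the in-flight request and start the next one, if any."""
        entry = self.entries[unit]
        if entry.waiting:
            node, write = entry.waiting.popleft()
            self._apply(entry, node, write)  # entry remains blocked
            return (node, write)
        entry.blocked = False
        return None
```

In this sketch, a second request to a blocked coherence unit is simply queued, while a request to a different coherence unit proceeds immediately. The serialization of requests to the same entry is what makes the mechanism simple to reason about and is also the source of the performance cost for heavily contended coherence units.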