1. Field of the Invention
This invention is related to the field of cache coherency.
2. Description of the Related Art
Computer systems often employ one or more caches to provide low latency access to blocks stored therein, as compared to the latency to access main memory. If multiple caches are employed, and the caches may independently store the same cache block for access and potential modification by different local consumers (such as processors coupled to the cache), the issue of cache coherence arises.
Various cache coherency protocols have been devised to maintain cache coherency. Typically, the cache coherency protocol specifies a set of states that a given cache block may have when stored in the cache. Additionally, the cache coherency protocol specifies transitions between the states, as well as any communications with other caches used to change states of other copies of the same cache block. In a given state, a given access (e.g. read or write) either may be permitted without state change or may require a state change. In some cases, the state change may require communication between the caches as mentioned above. In these cases, a state change from one stable state to another cannot be performed atomically since the communication must also be transmitted to other caches, and those caches may also need to make a state change.
When a state change is required to permit an access to a cache, and the state change also requires a communication, transient states are introduced to the cache coherency protocol. Transient states are entered when a cache access occurs that requires a state change in other caches. Transitions from the transient state to the new state required for the access occur when the communication has been ordered with respect to other possible communications at all caches. A different transient state is provided for each possible case of a state change being required in another cache.
The transient states are provided to resolve race conditions in the cache coherency protocol. Since the caches operate independently to respond to accesses from their local consumers (e.g. processors), accesses in different caches to the same cache block may occur at approximately the same time and may require communications (global state changes) to complete. One of the communications from one of the caches will effect a state change before the others. Thus, a given communication to effect a state change may not occur successfully prior to other communications affecting the underlying cache block.
FIG. 1 is a block diagram illustrating an example of the popular Modified, Exclusive, Shared, and Invalid (MESI) cache coherency protocol for a system in which caches are coupled via a bus (on which snooping is performed) and the local consumers are processors. In the MESI protocol, there are four stable states: M, E, S, and I. The M state indicates that the cache block has been modified in the cache. The E state indicates that the cache block has not been modified in the cache, but no other cache has a copy and thus a modification in the cache is permissible without any communication with other caches. The S state indicates that the cache block is (or was at some previous time) shared with at least one other cache. That is, another copy of the cache block may be stored in another cache if the cache block is in the S state. Thus, the copy may be read but not modified without a bus transaction to invalidate other shared copies. The I state indicates that the cache block is not valid in the cache.
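As an illustration of the stable states just described, the following is a minimal Python sketch of which accesses each state permits without a bus transaction. The names `Stable`, `Access`, and `needs_bus_transaction` are hypothetical and are introduced only for this example; they do not appear in FIG. 1.

```python
from enum import Enum


class Stable(Enum):
    """The four stable MESI states described in the text."""
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


class Access(Enum):
    READ = "read"
    WRITE = "write"


def needs_bus_transaction(state: Stable, access: Access) -> bool:
    """Return True if the access cannot complete from local state alone.

    Hypothetical helper: it encodes only the permissions stated in the
    description, not the full table of FIG. 1.
    """
    if state is Stable.I:
        # The block is not valid in the cache, so any access must fetch it.
        return True
    if state is Stable.S and access is Access.WRITE:
        # Other caches may hold copies that must be invalidated first.
        return True
    # M and E permit reads and writes locally; S permits reads locally.
    return False
```

In this sketch, a write to a block in the E state returns False because no other cache has a copy, which is the property that distinguishes the E state from the S state.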
There are also three transient states in FIG. 1: I->S, E; S->M; and I->M. The transient states each indicate the current state from which a transition is occurring for the cache block and the new state to which the transition is to complete. That is, the transient state I->M indicates a transition from the invalid state to the modified state. Similarly, the transient state S->M indicates a transition from the shared state to the modified state. The transient state I->S, E indicates a transition from the invalid state to either the shared state or the exclusive state (dependent on a response to the bus transaction initiated for the state change).
In FIG. 1, the current state of a cache block is shown to the left of heavy line 10. Above another heavy line 12, various events that may affect the current state are shown. Below heavy line 12 and to the right of heavy line 10 is a table of various cache states. The state in the table at the intersection of a current state and an event is the next cache state for a cache block in the current state if the event occurs to that cache block. A dash in the table indicates that no state change occurs for that event/current state combination. The events include a processor read, a processor write, a bus grant for the transaction required to complete a state change, a bus read (a read from the bus initiated by another cache or bus device) and a bus read exclusive/upgrade (initiated to obtain a modified state in the source's cache). In this example, the bus is the point at which transactions from different sources are ordered. Thus, a bus grant to perform a transaction may be enough to know that the state change in other caches in response to that transaction will be committed before any new transactions are transmitted on the bus.
For the stable states in FIG. 1, state transitions to another stable state or to a transient state occur in response to processor reads and writes. Also, for the stable states, transitions occur to other stable states in response to bus read transactions and bus read exclusive/upgrade transactions received from the bus. For the transient states, transitions occur to a stable state when the bus grant for the corresponding transaction occurs. In the case of the transient state I->S, E, the stable state is either shared (if the snoop response (SR) to the transaction on the bus is shared) or exclusive (if the SR is not shared). Additionally, for the S->M transient state, a transition to the I->M transient state occurs if a bus read exclusive/upgrade occurs to the cache block (since the cache block is invalidated in the cache in response to the transaction).
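The transitions described above can be sketched as a lookup table in Python. This is an illustrative sketch, not a reproduction of FIG. 1. The transient-state entries follow the text directly. The stable-state entries are assumed from conventional MESI behavior, since the text does not enumerate each cell. All identifiers (`State`, `Event`, `TABLE`, `next_state`, `race_demo`) are hypothetical.

```python
from enum import Enum


class State(Enum):
    M = "M"
    E = "E"
    S = "S"
    I = "I"
    I_TO_SE = "I->S,E"   # transient: waiting for a bus read
    S_TO_M = "S->M"      # transient: waiting for an upgrade
    I_TO_M = "I->M"      # transient: waiting for a read exclusive


class Event(Enum):
    PROC_READ = "processor read"
    PROC_WRITE = "processor write"
    BUS_GRANT = "bus grant"
    BUS_READ = "bus read"
    BUS_RDX = "bus read exclusive/upgrade"


# A missing (state, event) pair plays the role of a dash: no state change.
TABLE = {
    # Stable states (assumed from conventional MESI, not quoted from FIG. 1).
    (State.I, Event.PROC_READ): State.I_TO_SE,
    (State.I, Event.PROC_WRITE): State.I_TO_M,
    (State.S, Event.PROC_WRITE): State.S_TO_M,
    (State.S, Event.BUS_RDX): State.I,
    (State.E, Event.PROC_WRITE): State.M,
    (State.E, Event.BUS_READ): State.S,
    (State.E, Event.BUS_RDX): State.I,
    (State.M, Event.BUS_READ): State.S,
    (State.M, Event.BUS_RDX): State.I,
    # Transient states (as described in the text).
    (State.S_TO_M, Event.BUS_GRANT): State.M,
    (State.S_TO_M, Event.BUS_RDX): State.I_TO_M,
    (State.I_TO_M, Event.BUS_GRANT): State.M,
}


def next_state(state, event, snoop_shared=False):
    """Return the next state of a cache block for the given event."""
    if state is State.I_TO_SE and event is Event.BUS_GRANT:
        # The snoop response (SR) selects between shared and exclusive.
        return State.S if snoop_shared else State.E
    return TABLE.get((state, event), state)


def race_demo():
    """Two caches hold a block in S and both attempt a write."""
    a = next_state(State.S, Event.PROC_WRITE)   # cache A enters S->M
    b = next_state(State.S, Event.PROC_WRITE)   # cache B enters S->M
    a = next_state(a, Event.BUS_GRANT)          # A is granted the bus first
    b = next_state(b, Event.BUS_RDX)            # B snoops A's upgrade
    after_first = (a, b)
    b = next_state(b, Event.BUS_GRANT)          # B is granted the bus next
    a = next_state(a, Event.BUS_RDX)            # A snoops B's read exclusive
    return after_first, (a, b)
```

In `race_demo`, the cache that loses arbitration moves from S->M to I->M rather than completing its upgrade, which is the race the S->M transient state exists to handle.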
The inclusion of the various transient states complicates the design and verification of devices that implement the cache coherency protocol. In particular, the transient states are often the source of incorrect coherence functionality in a design. As designs and/or cache coherency protocols become more complex, the number of corner cases and/or potential race conditions generally increases, increasing the number of transient states.