FIG. 1 shows a computing system 100 having a plurality of processing units 101_1 to 101_N. Processing units 101_1 to 101_N may correspond to instruction execution pipelines where system 100 corresponds to a processor, or, processing units 101_1 to 101_N may correspond to processors where system 100 corresponds to a multi-processor computing system.
Each of the processing units has its own respective internal one or more caching levels 102_1 to 102_N. The processing units also share one or more levels of common cache 103 and a deeper system memory 104. The collective goal of the cache levels 102_1 to 102_N, 103 is to minimize accesses to the shared memory 104 by keeping in the caches data items and instructions that are apt to be called upon by the processing units 101_1 to 101_N. However, as it is entirely possible that the respective program code running on the different processing units 101_1 to 101_N may wish to concurrently use a same item of data, a “coherency” protocol is implemented to ensure that the item of data remains “consistent” within the computing processor/system 100 as a whole.
A commonly used coherency protocol is the MESI protocol. The MESI protocol assigns one of four different states to any cached data item: 1) Modified (M); 2) Exclusive (E); 3) Shared (S); and 4) Invalid (I). A cache line in the M state corresponds to a “dirty” cache line having recent, updated data that has not yet been written back to a deeper caching level (i.e., toward shared memory 104) or to memory 104 outright. Here, it is worthwhile to point out that in a typical implementation each caching level can support an M state for a particular cache line address. That is, a same cache line address can be in the M state in each of the caching levels. In this case, each higher level (i.e., toward the processing units) holds a more recent change to the cache line's data.
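The four states above, and the point that a same address can be Modified at more than one caching level at once, can be sketched as follows. This is a minimal illustrative model only; the class name, the cache-level labels, and the example address are assumptions and do not come from the description above.

```python
from enum import Enum

class MesiState(Enum):
    # The four MESI states assigned to any cached data item.
    MODIFIED = "M"    # dirty: newer data than the deeper levels / memory
    EXCLUSIVE = "E"   # clean: same data as the corresponding memory entry
    SHARED = "S"      # clean: copies may exist in multiple caches
    INVALID = "I"     # not a usable copy

# A same cache line address (hypothetical address "0x40") held in the
# M state at two caching levels; the level closer to the processing
# unit holds the more recent change to the line's data.
hierarchy = {
    "level_1": {"addr": "0x40", "state": MesiState.MODIFIED},  # most recent
    "level_2": {"addr": "0x40", "state": MesiState.MODIFIED},  # older change
}
```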
A cache line in the E state corresponds to data that is “clean”. That is, its data content is the same as its corresponding entry (i.e., same address) in shared memory 104. When new data is written to a cache line in the E state (e.g., by a processor directly at the highest caching level, or, when an evicted cache line from a next higher level is received at an intermediate caching level), the state of the cache line is changed to the M state.
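The E-to-M transition on a write, described above, can be sketched as a small state function. The function name and tuple convention are illustrative assumptions, not part of the description.

```python
def write_line(state, new_data):
    # Writing new data to a clean (E) line marks it dirty (M); writing
    # to an already-dirty (M) line leaves it in the M state.
    if state in ("E", "M"):
        return ("M", new_data)
    # Other starting states are not modeled in this sketch.
    raise ValueError("write to a line in state %r not modeled" % state)
```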
When a cache line is in the M state and the cache line is evicted, the cache line's data must be written back to a next deeper caching level or to shared memory 104. If written back to a next deeper caching level, the cache line remains in the M state at that level. If written back to shared memory, it can transition to the E state. While a cache line is in the M state, a processing unit is permitted to access the cache line (e.g., by way of a cache snoop) and even update it (write a new value to the cache line). According to one MESI implementation, a snoop of a cache line in the M state for a simple read causes the cache line state to transition from the M state to the S state. A read of the cache line with an intent to write back to it (a “read-for-ownership”) causes the cache line to transition from the M state to the I state.
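The snoop transitions of an M-state line in the implementation described above can be sketched directly. The function name and request labels are assumptions for illustration.

```python
def snoop_modified_line(request):
    # Per the implementation described above: a simple read snoop of an
    # M-state line demotes it to S; a read-for-ownership (an intent to
    # write the line back) invalidates the snooped copy (M -> I).
    if request == "read":
        return "S"
    if request == "read_for_ownership":
        return "I"
    raise ValueError("unknown snoop request %r" % request)
```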
A cache line in the S state typically corresponds to a cache line having multiple copies across the various caches 102_1 to 102_N, 103. In a typical situation, a single instance of a cache line is resident in the E state in the cache of a particular processor. If another processor desires the same cache line, a second copy of the cache line is sent to the requesting processor. The state of the cache line therefore changes from E to S, as there are now two copies of the cache line in the system, each having the same data as resides in shared system memory 104 for the associated address. Other aspects of the MESI protocol exist; however, such features are well known and need not be discussed here.
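The E-to-S transition when a second copy is granted can be sketched as follows. The function name and the (owner, requester) return convention are illustrative assumptions.

```python
def grant_second_copy(owner_state):
    # When another processor requests a line held in the E state, a
    # second copy is sent and both copies become S: two clean copies
    # now exist, each matching shared memory for the address.
    if owner_state == "E":
        return ("S", "S")   # (owner's new state, requester's state)
    raise ValueError("transition from state %r not modeled" % owner_state)
```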
FIG. 2 shows an example of a processing unit 201_1 accessing a modified cache line from the (e.g., highest level of the) shared cache 203. As observed in FIG. 2, in a first cycle, the processing unit issues 1 a request for the cache line to the shared cache 203. In response, during a second cycle, the shared cache 203 reads 2 the cache line. In a third cycle, the cache line is presented 3 to the requesting processing unit 201_1. Notably, the cache line remains in the M state.
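The three-cycle sequence of FIG. 2 can be sketched as a simple trace of the request, read, and present steps. The function name, the cache dictionary layout, and the example address are assumptions made for illustration.

```python
def access_modified_line(address, shared_cache):
    # Models the FIG. 2 sequence: cycle 1 issues the request, cycle 2
    # reads the line from the shared cache, cycle 3 presents the line
    # to the requesting processing unit. The line's state is returned
    # unchanged; in this example it remains M.
    trace = [("cycle 1", "request", address)]
    state, data = shared_cache[address]
    trace.append(("cycle 2", "read", data))
    trace.append(("cycle 3", "present", data))
    return trace, state
```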