Computer systems, from small handheld electronic devices to medium-sized mobile and desktop systems to large servers and workstations, are becoming increasingly pervasive in our society. Computer systems typically include one or more processors. A processor manipulates and controls the flow of data in a computer by executing instructions. Increasing the speed at which instructions are executed by the processor tends to increase the computational power of the computer. Processor designers employ many different techniques to increase processor speed to create more powerful computers for consumers. One such technique is the use of cache memory.
Cache memory is a type of buffer memory that resides between the main memory and each processor of a computer system. Cache memory has a much smaller capacity than main memory and resides closer to the processor. Because of this, the processor can read data from the cache more quickly than from main memory. To exploit this characteristic of cache memory, complex schemes are implemented to predict what data a processor will need to read in the near future, and to transfer that data from main memory to the cache before the processor reads it. In this manner, data access speed and, consequently, processor speed are increased. Typically, each processor in a multiprocessor computer system has its own associated cache.
One problem with implementing caches in a computer system is that a processor not only reads data from its cache but also writes data to its cache. Suppose, for example, that the same data is transferred into a first cache of a first processor and a second cache of a second processor. Initially, both processors read the data from their respective caches. Suppose, further, that the data in the first cache is eventually overwritten with newer, updated data while the original data in the second cache remains unchanged. If the second processor continues to read the original, unmodified data from its cache, a cache coherence problem exists. That is, the unmodified data (also called stale or old data) in the second processor's cache becomes erroneous as soon as the first processor modifies the data in its own cache.
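The scenario above can be illustrated with a minimal sketch. The `ToyCache` class, the address `0x40`, and the stored values are hypothetical names and numbers chosen only for illustration; the model deliberately omits any coherence mechanism so that the stale read becomes visible.

```python
# Toy model: two private caches with no coherence mechanism.
# All names and values here are illustrative assumptions.
main_memory = {0x40: 1}


class ToyCache:
    """A private cache that never learns about writes made elsewhere."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}

    def read(self, addr):
        if addr not in self.lines:
            # Miss: fill the line from main memory.
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        # Only the local copy is updated; no other cache is notified.
        self.lines[addr] = value


first_cache = ToyCache(main_memory)
second_cache = ToyCache(main_memory)

# Both processors initially read the same data into their own caches.
first_cache.read(0x40)
second_cache.read(0x40)

# The first processor overwrites its copy with newer data.
first_cache.write(0x40, 2)

fresh_value = first_cache.read(0x40)
stale_value = second_cache.read(0x40)  # still the original data
print(fresh_value, stale_value)  # prints "2 1"
```

The second cache keeps returning the original value because nothing tells it that its copy is out of date.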
Somehow, all the processors in a multiprocessor system must be able to read only the “freshest” data from their respective caches to keep the overall system coherent. The mechanism by which the system is kept coherent is called the cache coherence protocol.
One type of protocol is known as the MESI cache coherence protocol. The MESI protocol defines four states in which a cache line may be stored: Modified, Exclusive, Shared, and Invalid. Unfortunately, the MESI protocol may lead to inefficient inter-device communications in some applications. These inefficiencies become more taxing on system performance as the bus bandwidth becomes more constrained.
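As a rough illustration of the four states, the sketch below encodes a simplified, textbook-style view of how a single cache line might move between them. The transition rules, the event names, and the `next_state` helper are assumptions made for illustration and are not taken from any particular implementation; a real protocol also involves bus transactions and write-backs that are reduced to comments here.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"   # only copy, and it differs from main memory
    EXCLUSIVE = "E"  # only copy, and it matches main memory
    SHARED = "S"     # other caches may also hold the line
    INVALID = "I"    # the line holds no usable data


def next_state(state, event, others_have_copy=False):
    """Return the next state of one cache line (simplified MESI sketch)."""
    if event == "local_read":
        if state is State.INVALID:
            # Miss: the line is fetched, shared if another cache holds it.
            return State.SHARED if others_have_copy else State.EXCLUSIVE
        return state
    if event == "local_write":
        # From SHARED or INVALID this requires a bus message first
        # so that other copies are invalidated.
        return State.MODIFIED
    if event == "remote_read":
        if state is State.INVALID:
            return State.INVALID
        # A MODIFIED line would be written back before being shared.
        return State.SHARED
    if event == "remote_write":
        return State.INVALID
    raise ValueError(f"unknown event: {event}")


# Replay the two-processor scenario for one cache line.
first = next_state(State.INVALID, "local_read")           # first cache loads
first = next_state(first, "remote_read")                  # second cache loads
second = next_state(State.INVALID, "local_read", others_have_copy=True)
first = next_state(first, "local_write")                  # first cache writes
second = next_state(second, "remote_write")               # second is invalidated
print(first.name, second.name)  # prints "MODIFIED INVALID"
```

Under these assumed rules, the second cache's copy is marked Invalid when the first processor writes, so its next read misses and cannot return stale data.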