A processor is commonly considered to be the "brains" of a computer system. To increase the processing power of a computer system, some systems contain more than one processor. These computer systems are referred to as multiprocessor computer systems. The processors in these systems typically share memory so that two or more processors have access to the same data in a particular memory address space. Even in computer systems that contain only a single processor, the processor may share memory with a peripheral device, such as a bus master, that also has access to the memory. Sharing memory in this manner necessitates a memory coherence protocol to ensure that all of the devices with access to the shared memory have the same view of the data in memory. For example, once one device updates a particular data value, the other devices must be able to access the updated data value for their own use.
Suppose a first processor in a multiprocessor system loads a data value from an address in a shared memory location, such as a shared cache, into the first processor's own dedicated memory, such as a local cache, during a first bus transaction. If a second processor in the system loads the same data value from the address in the shared memory location into its own local cache, each processor will have a copy of the same data value stored in its local cache.
Initially, the data values are brought into the local caches of each of the two processors in a shared state. This means that there is an indicator corresponding to the data value, such as one or more flag bits in the cache line containing the data value, that indicates to the processor that another device in the computer system may contain a cached copy of the same data value.
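As an illustration only, the per-line state indicator described above might be modeled as follows. The names `LineState`, `CacheLine`, and `load_shared`, the address `0x100`, and the value `7` are hypothetical and chosen for the sketch; the text itself says only that one or more flag bits in the cache line record the state.

```python
from dataclasses import dataclass
from enum import Enum


class LineState(Enum):
    """Hypothetical encoding of the state flag bits in a cache line."""
    INVALID = 0    # the local copy must not be used
    SHARED = 1     # another device may hold a cached copy of the same value
    EXCLUSIVE = 2  # no other device holds a valid copy


@dataclass
class CacheLine:
    address: int
    value: int
    state: LineState


def load_shared(shared_memory: dict, address: int) -> CacheLine:
    """Copy a value from shared memory into a local cache line marked SHARED."""
    return CacheLine(address, shared_memory[address], LineState.SHARED)


# Two processors load the same address; each ends up with its own SHARED copy.
shared_memory = {0x100: 7}
line_p1 = load_shared(shared_memory, 0x100)  # first processor's local copy
line_p2 = load_shared(shared_memory, 0x100)  # second processor's local copy
```

Both `line_p1` and `line_p2` hold the same value, and each is flagged as shared, which is the starting condition assumed in the discussion that follows.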
Assume that the second processor uses the data value as an operand in various lines of program code, but does not modify the data value. Meanwhile, the first processor modifies the data value by, for example, performing a mathematical operation on the data value. The first processor then stores the modified data value in the same address as the original data value. Once the data value is modified by the first processor, the second processor must stop using the older, now stale copy stored in the second processor's local cache; otherwise, the results of the second processor's operations may be erroneous.
To prevent the second processor from using the older copy of the data value, the data value in the second processor's local cache is invalidated. The first processor requests this invalidation during a separate bus transaction, before the first processor modifies its own copy of the data value. The request causes the second processor to invalidate its copy of the data value by, for example, setting one or more indicator bits in the cache line containing the data value.
After the first processor requests invalidation of other copies of the data value, the state of the data value stored in the first processor's local cache is changed from a shared state to an exclusive state. This means that the indicator corresponding to the data value, such as one or more flag bits in the cache line containing the data value, is changed to indicate that no other device in the computer system contains a (valid) copy of the same data value.
Once the data value in the first processor transitions from a shared state to an exclusive state, the first processor is free to modify the data value. This new, updated data value is stored in the original address in the shared memory location of the computer system. When the second processor next needs the data value, the second processor will re-access the new, updated data value from the shared memory location and pull this updated data value back into the second processor's local cache in a shared state.
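The sequence described above can be traced with a small simulation. This is a minimal sketch under stated assumptions, not an implementation of any particular processor: the classes `Bus` and `Processor`, the address `0x40`, and the values are hypothetical. Logging the write-back to shared memory as a bus transaction is also an assumption, and the sketch does not model any downgrade of the first processor's exclusive copy when the second processor reloads the value, since the text does not describe one.

```python
from enum import Enum


class State(Enum):
    INVALID = "invalid"
    SHARED = "shared"
    EXCLUSIVE = "exclusive"


class Bus:
    """Hypothetical shared bus that owns shared memory and knows the processors."""

    def __init__(self, memory):
        self.memory = dict(memory)
        self.processors = []
        self.log = []  # one (processor, kind, address) entry per bus transaction

    def read(self, requester, address):
        self.log.append((requester.name, "read", address))
        return self.memory[address]

    def invalidate(self, requester, address):
        self.log.append((requester.name, "invalidate", address))
        for proc in self.processors:
            if proc is not requester and address in proc.cache:
                value, _ = proc.cache[address]
                proc.cache[address] = (value, State.INVALID)

    def write(self, requester, address, value):
        # Assumption: the write-back is logged as its own bus transaction.
        self.log.append((requester.name, "write", address))
        self.memory[address] = value


class Processor:
    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.cache = {}  # address -> (value, State)
        bus.processors.append(self)

    def load(self, address):
        entry = self.cache.get(address)
        if entry is None or entry[1] is State.INVALID:
            # Missing or invalidated copy: fetch from shared memory as SHARED.
            self.cache[address] = (self.bus.read(self, address), State.SHARED)
        return self.cache[address][0]

    def store(self, address, value):
        old = self.load(address)  # make sure a valid local copy exists
        if self.cache[address][1] is State.SHARED:
            self.bus.invalidate(self, address)             # separate bus transaction
            self.cache[address] = (old, State.EXCLUSIVE)   # shared -> exclusive
        self.cache[address] = (value, State.EXCLUSIVE)     # now free to modify
        self.bus.write(self, address, value)


bus = Bus({0x40: 10})
p1, p2 = Processor("P1", bus), Processor("P2", bus)
p1.load(0x40)
p2.load(0x40)                       # both copies are now SHARED
p1.store(0x40, p1.load(0x40) + 5)   # P1 invalidates P2's copy, then updates
p2_state_after_invalidate = p2.cache[0x40][1]
reloaded = p2.load(0x40)            # P2 re-fetches the updated value as SHARED
```

After the run, `bus.log` lists the transactions in order: the two initial reads, the invalidate request from `P1`, the write-back, and the second processor's re-read.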
One problem with the above-described cache coherency protocol is that the first processor incurs a relatively long delay before it can modify or otherwise update the data value and store the updated data value in the original address location. The first processor must first bring the data value into its local cache in the shared state during a first bus transaction, then broadcast a request to invalidate other copies of the data value during a second bus transaction, and only then transition the data value to the exclusive state and update the data value.
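The cost can be made concrete by listing the steps the writing processor performs before its store, depending on the state its local copy starts in. The function names and the string labels `"absent"`, `"shared"`, and `"exclusive"` below are hypothetical; the sketch only counts bus transactions and makes no claim about actual cycle counts.

```python
def steps_before_store(line_state: str):
    """Return (description, uses_bus) steps required before the value may be updated."""
    steps = []
    if line_state == "absent":
        steps.append(("load value into local cache in shared state", True))
        line_state = "shared"
    if line_state == "shared":
        steps.append(("broadcast request to invalidate other copies", True))
        steps.append(("transition line from shared to exclusive", False))
    return steps


def bus_transactions_before_store(line_state: str) -> int:
    """Count how many of the required steps occupy the bus."""
    return sum(1 for _, uses_bus in steps_before_store(line_state) if uses_bus)


for state in ("absent", "shared", "exclusive"):
    print(state, bus_transactions_before_store(state))
```

Under these assumptions, a write to a value not yet cached costs two bus transactions before the update can begin, which is the delay the paragraph above identifies.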