In a traditional symmetrical multiprocessor ("SMP") system, data coherency is maintained by an effective but relatively time-consuming procedure. For example, if a requestor (e.g., a central processing unit ("CPU") or an input/output ("I/O") device) within the system needs a particular portion of data (e.g., a cache line), it first determines whether the requested data is located within its local cache. If the data is not in the cache, the requestor sends a Load Miss (or Read Miss) request to the memory controller, which controls system memory, asking that the data be supplied from system memory. Typically, the memory controller includes a directory that indicates whether the most recent version of the requested data resides in system memory or is currently owned by a particular CPU within the SMP system. If system memory contains the most recent version of the requested data, the memory controller supplies that data to the requesting CPU. However, if the memory controller determines, through its directory, that a second CPU within the SMP system holds the most recent version of the requested data, it sends a cross-interrogation message to the second CPU, requesting that it return ownership of the most recent copy to the memory controller so that the memory controller can then transfer ownership of that data portion to the originally requesting CPU. Upon receipt of the cross-interrogation message, the second CPU writes the data portion back to the memory controller, which then transfers the data to the requesting CPU.
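The traditional sequence described above can be pictured with a minimal sketch. It is illustrative only: the class names `Cpu` and `MemoryController`, the `MEMORY` marker, the address `0x40`, and the data values are all hypothetical and do not come from any particular system. The sketch models one load miss for a line that the directory attributes to a second CPU, and records each message in a log so the number of steps is visible.

```python
from dataclasses import dataclass, field
from typing import Dict, List

MEMORY = "memory"  # hypothetical directory value: system memory holds the latest copy


@dataclass
class Cpu:
    name: str
    cache: Dict[int, int] = field(default_factory=dict)  # address -> data


@dataclass
class MemoryController:
    memory: Dict[int, int]                                  # address -> data
    directory: Dict[int, str] = field(default_factory=dict)  # address -> owner name
    log: List[str] = field(default_factory=list)

    def load_miss(self, requester: Cpu, addr: int, cpus: Dict[str, Cpu]) -> int:
        self.log.append(f"{requester.name} -> MC: load miss")
        owner = self.directory.get(addr, MEMORY)
        if owner != MEMORY:
            # Another CPU owns the latest copy: cross-interrogate it and
            # wait for it to write the line back to the controller.
            self.log.append(f"MC -> {owner}: cross-interrogate")
            self.memory[addr] = cpus[owner].cache.pop(addr)
            self.log.append(f"{owner} -> MC: write back")
        # Ownership now passes to the requester.
        self.directory[addr] = requester.name
        requester.cache[addr] = self.memory[addr]
        self.log.append(f"MC -> {requester.name}: data")
        return requester.cache[addr]


def read(cpu: Cpu, mc: MemoryController, addr: int, cpus: Dict[str, Cpu]) -> int:
    """Check the local cache first; fall back to a load miss."""
    if addr in cpu.cache:
        return cpu.cache[addr]
    return mc.load_miss(cpu, addr, cpus)


# Example: cpu2 holds a newer copy (99) than system memory (7).
cpus = {"cpu1": Cpu("cpu1"), "cpu2": Cpu("cpu2", cache={0x40: 99})}
mc = MemoryController(memory={0x40: 7}, directory={0x40: "cpu2"})
value = read(cpus["cpu1"], mc, 0x40, cpus)
```

In this sketch the data passes through the memory controller on its way from the second CPU to the first, which is why the log contains four messages for a single miss.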
As can be readily seen, such a procedure involves numerous steps, each requiring several system cycles to perform.
In an enhanced directory-based SMP system (i.e., one that includes a facility to handle cache-to-cache transfers of data between CPUs), several of these steps, and their corresponding cycle times, can be eliminated, resulting in a faster transfer of ownership of a requested data portion within an SMP system. In such a system, instead of the second CPU (which holds the exclusive (modified) copy of the requested data) returning the requested data to the memory controller, the data is transferred directly from the second CPU to the first CPU in a cache-to-cache transfer. Upon receipt of the cache-to-cache transfer of the requested data, the first CPU returns an acknowledgment of the receipt to the second CPU.
Two further steps are then performed independently by the two CPUs. First, the second CPU returns an acknowledgment to the memory controller of its cache-to-cache transfer to the first CPU. This acknowledgment may include a copy of the data in the same form as that transferred to the first CPU. Second, the first CPU, which has now acquired ownership of the requested data through the cache-to-cache transfer, may further modify the data and then perform a write-back to system memory.
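The cache-to-cache sequence and its two independent follow-up steps can be sketched as follows. This is a simplified model under stated assumptions, not a description of any actual implementation: the function names, the `"memory"` directory value, and the data values are hypothetical, and it is assumed that the controller stores the data copy carried by the acknowledgment and updates its directory when the acknowledgment arrives. The follow-up steps are shown here in the order in which the acknowledgment reaches the controller first.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cpu:
    name: str
    cache: Dict[int, int] = field(default_factory=dict)  # address -> data


def cache_to_cache_transfer(first: Cpu, second: Cpu, addr: int) -> List[str]:
    """Move the line directly from the second CPU's cache to the first CPU."""
    trace = [
        f"{first.name} -> MC: load miss",
        f"MC -> {second.name}: cross-interrogate",
    ]
    first.cache[addr] = second.cache.pop(addr)
    trace.append(f"{second.name} -> {first.name}: data (cache-to-cache)")
    trace.append(f"{first.name} -> {second.name}: acknowledgment")
    return trace


def ack_to_controller(second: Cpu, first: Cpu, addr: int, data_copy: int,
                      memory: Dict[int, int], directory: Dict[int, str],
                      trace: List[str]) -> None:
    """Follow-up step 1: the second CPU reports the transfer to the controller."""
    memory[addr] = data_copy      # assumption: the controller stores the copy
    directory[addr] = first.name  # assumption: directory now names the first CPU
    trace.append(f"{second.name} -> MC: acknowledgment (with data copy)")


def write_back(first: Cpu, addr: int, memory: Dict[int, int],
               directory: Dict[int, str], trace: List[str]) -> None:
    """Follow-up step 2: the first CPU returns its (possibly modified) line."""
    memory[addr] = first.cache.pop(addr)
    directory[addr] = "memory"
    trace.append(f"{first.name} -> MC: write back")


memory = {0x40: 7}
directory = {0x40: "cpu2"}
cpu1, cpu2 = Cpu("cpu1"), Cpu("cpu2", cache={0x40: 99})

trace = cache_to_cache_transfer(cpu1, cpu2, 0x40)
transferred = cpu1.cache[0x40]
ack_to_controller(cpu2, cpu1, 0x40, transferred, memory, directory, trace)
cpu1.cache[0x40] += 1  # the first CPU modifies the line after acquiring it
write_back(cpu1, 0x40, memory, directory, trace)
```

With this ordering, system memory ends up holding the first CPU's modified value, and the requested data reaches the first CPU without first passing through the memory controller.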
A problem may occur because the first CPU's write-back may return ownership of the data to system memory before the memory directory receives the acknowledgment from the second CPU. In such a scenario, the memory controller cannot determine whether the incoming data is valid, and a stale copy of the data may be written back, destroying a good copy within system memory. As a result, there is a need in the art for a technique that ensures that the most recent valid copy of data that has been transferred from one CPU to another in a cache-to-cache transfer is eventually stored within system memory.
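The ordering hazard can be illustrated with a small sketch. It assumes, purely for illustration, a naive controller that stores the data carried by each message in the order the messages arrive; the message names and data values are hypothetical. The two runs differ only in arrival order.

```python
from typing import List, Tuple

Message = Tuple[str, int]  # (message kind, data carried)


def apply_in_order(memory_value: int, arrivals: List[Message]) -> int:
    """Naive controller: overwrite memory with each message's data as it arrives."""
    for _kind, data in arrivals:
        memory_value = data
    return memory_value


# The second CPU's acknowledgment carries the data as it was transferred (99).
ack = ("ack_from_second_cpu", 99)
# The first CPU modified the line after the transfer and writes back 100.
write_back = ("write_back_from_first_cpu", 100)

in_order = apply_in_order(7, [ack, write_back])  # acknowledgment arrives first
raced = apply_in_order(7, [write_back, ack])     # write-back arrives first
```

When the acknowledgment arrives first, memory ends with the modified value. When the write-back arrives first, the late acknowledgment overwrites it with the older copy, which is the stale-data outcome described above.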