A multiprocessor computer system typically has a large number of interconnected processing nodes. Each node, in turn, can have up to 16 processors. Each processor, moreover, may have one or more memory caches. These caches hold programs and data required by the processors. Significant hardware in the computer system is dedicated to ensuring that each cache holds coherent data, that is, that each cache accurately reflects the contents of main memory.
In some multiprocessor systems, the caches are "strongly ordered." In a strongly ordered system, a processor sees the stores of other processors in the same node in the same order in which the stores are made.
In a strongly ordered system, the stores can be used as semaphores. Consider, for example, the sequence of stores shown below:
TABLE 1
______________________________________
CPU 0          CPU 1
______________________________________
Store A'       Load B
Store B'       Load A
______________________________________
In this sequence, line A' is data while line B' is a semaphore protecting line A' from premature use. CPU 0 modifies B' only after it has finished modifying A'. Since B is a semaphore, CPU 1 does not use A until after it sees that B has been modified. Strong ordering requires that CPU 1 never have the old value of A and the new value of B at the same time. Otherwise, CPU 1 could use stale data for A instead of the newly stored value.
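The guarantee described above can be checked with a small model. The following is a minimal sketch, not part of the original disclosure: it assumes an idealized strongly ordered memory in which every store becomes visible to the other CPU immediately, and it enumerates each interleaving of the four operations in Table 1 that preserves the program order of both CPUs. The function name `outcomes_under_strong_ordering` and the `OLD`/`NEW` labels are hypothetical.

```python
from itertools import combinations

OLD, NEW = "old", "new"

def outcomes_under_strong_ordering():
    """Return every (B seen, A seen) pair CPU 1 can observe for Table 1."""
    cpu0 = ["A", "B"]   # CPU 0: Store A', then Store B'
    cpu1 = ["B", "A"]   # CPU 1: Load B, then Load A
    total = len(cpu0) + len(cpu1)
    results = set()
    # Pick which slots of the merged sequence belong to CPU 0; each CPU's
    # own operations always stay in program order.
    for slots in combinations(range(total), len(cpu0)):
        memory = {"A": OLD, "B": OLD}
        seen = {}
        next0 = next1 = 0
        for step in range(total):
            if step in slots:
                memory[cpu0[next0]] = NEW          # store is visible at once
                next0 += 1
            else:
                line = cpu1[next1]
                seen[line] = memory[line]          # load the current value
                next1 += 1
        results.add((seen["B"], seen["A"]))
    return results

if __name__ == "__main__":
    print(sorted(outcomes_under_strong_ordering()))
```

Under these assumptions the pair (new B, old A) never appears among the outcomes, which is the condition the semaphore relies on.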
In prior art systems, strong ordering was maintained by sending purge commands to other processors and receiving a response back. Thus, in the example of Table 1, CPU 0 would send a command to CPU 1 telling CPU 1 to purge line A. Once CPU 1 had completed the purge, it sent a "purge done" response back to CPU 0. CPU 0 waited until it received "purge done" responses back from all of the other processors before it modified B'.
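The purge handshake described above can be sketched as a single-threaded simulation. This is an illustrative model only, under the assumption that purge commands and "purge done" responses are delivered reliably and in order; the `Cache` class, the `store_and_wait` function, and the message tuples are hypothetical names, not taken from any actual cache controller.

```python
from collections import deque

class Cache:
    """Hypothetical per-CPU cache mapping a line name to its cached value."""
    def __init__(self, lines=None):
        self.lines = dict(lines or {})

    def handle_purge(self, line):
        self.lines.pop(line, None)          # drop any stale copy of the line
        return ("purge done", line)

    def load(self, memory, line):
        if line not in self.lines:          # miss: refill from main memory
            self.lines[line] = memory[line]
        return self.lines[line]

def store_and_wait(memory, others, line, value):
    """Store value, purge the line in every other cache, wait for all replies."""
    memory[line] = value
    responses = deque(cache.handle_purge(line) for cache in others)
    outstanding = len(others)
    while outstanding:                      # the response-counting logic
        kind, done_line = responses.popleft()
        if kind == "purge done" and done_line == line:
            outstanding -= 1
    return len(others)                      # round trips the writer waited for

if __name__ == "__main__":
    memory = {"A": 0, "B": 0}
    cpu1 = Cache({"A": 0, "B": 0})          # CPU 1 starts with cached copies
    store_and_wait(memory, [cpu1], "A", 42) # Store A', wait for purge done
    store_and_wait(memory, [cpu1], "B", 1)  # only then Store B'
    print(cpu1.load(memory, "B"), cpu1.load(memory, "A"))
```

In this model the writer cannot begin the store to B until every other cache has acknowledged the purge of A, so a later load of A by CPU 1 misses and refetches the new value. The loop over outstanding responses corresponds to the waiting that the handshake imposes on every store.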
A problem with waiting for "purge done" responses is the delay caused by waiting for the purge commands and responses to make the round trip. This delay becomes considerable when it is multiplied by the number of stores performed by each processor during normal operation of the computer system. Moreover, extra logic must be placed on each cache controller to send, receive, and count the purge commands and responses.
Therefore, there is a need in the art for a method and system of maintaining strong ordering that does not require sending purge commands and waiting for responses thereto.
In addition, there is a need in the art for a method and system of maintaining strong ordering that does not require additional counting logic on the cache controller.