1. Field of the Invention
The present invention generally relates to a computer system with multiple processors. More particularly, the present invention relates to the sharing of data among processors in a Distributed Shared Memory (“DSM”) computer system. Still more particularly, the invention relates to a scalable, high performance, directory based cache coherence protocol that allows data sharing among processors in a DSM computer system.
2. Background of the Invention
Distributed computer systems typically comprise multiple computers connected to each other by a communications network. In some distributed computer systems, the networked computers can access shared data. Such systems are sometimes known as parallel computers. If a large number of computers are networked, the distributed system is considered to be “massively” parallel. One advantage of a massively parallel computer is that it can solve complex computational problems in a reasonable amount of time.
In such systems, the memories of the computers are collectively known as a Distributed Shared Memory (“DSM”). A problem in such systems is ensuring that the data stored in a DSM is accessed in a coherent manner. Coherency, in part, means that only one processor can modify any part of the data at any one time; otherwise, the state of the system would be nondeterministic.
Recently, DSM systems have been built as a cluster of Symmetric Multiprocessors (“SMP”). In SMP systems, shared memory can be implemented efficiently in hardware since the processors are symmetric (i.e., identical in construction and in operation) and operate on a single, shared processor bus. Symmetric Multiprocessor systems have good price/performance ratios with four or eight processors. However, because the specially designed bus makes message passing between the processors a bottleneck, it is difficult to scale an SMP system beyond twelve or sixteen processors.
It is desirable to construct large-scale DSM systems using processors connected by a network. The goal is to allow processors to share the memories efficiently, so that data fetched by a program executing on a first processor from memory attached to a second processor is immediately available to all processors.
Caches connected to each processor of the computer system permit faster access to data from the main memory of each computer system. Caches are useful because they reduce memory latencies on cache hits. In DSM multiprocessing computer systems, however, the copies of memory locations stored in each computer system cache can become inconsistent unless the computer system implements a coherency protocol that enforces cache consistency. This coherency protocol must typically be designed so that it scales to very large processor configurations with maximum memory system performance. Prior art systems suffered from performance bottlenecks due to the bus based cache coherence protocols prevalent in such systems. Bus based coherence protocols limit the number of processors that can be incorporated into such a high performance system. Directory based solutions to the problem of cache and memory coherence scale much better to larger systems because they can be efficiently adapted to larger and more arbitrary processor interconnects.
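By way of illustration only, the general idea of a directory based scheme can be sketched as follows. This is a minimal sketch under stated assumptions, not the protocol of the present invention: the names DirectoryEntry, Directory, sharers, and owner are hypothetical, and the sketch assumes a simple invalidate-on-write policy in which the directory records which processors hold a copy of each memory block and permits at most one writer at a time.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class DirectoryEntry:
    """Hypothetical per-block record; field names are assumptions."""
    sharers: Set[int] = field(default_factory=set)  # processors with read-only copies
    owner: Optional[int] = None                      # processor with the writable copy


class Directory:
    """Tracks, per memory block, which processors cache it."""

    def __init__(self) -> None:
        self.entries: Dict[int, DirectoryEntry] = {}

    def _entry(self, block: int) -> DirectoryEntry:
        return self.entries.setdefault(block, DirectoryEntry())

    def read(self, proc: int, block: int) -> Set[int]:
        """Record a read by proc; return the processors that must be downgraded."""
        e = self._entry(block)
        notified: Set[int] = set()
        if e.owner is not None and e.owner != proc:
            # Another processor holds the writable copy: downgrade it to a sharer.
            notified.add(e.owner)
            e.sharers.add(e.owner)
            e.owner = None
        if e.owner != proc:
            e.sharers.add(proc)
        return notified

    def write(self, proc: int, block: int) -> Set[int]:
        """Record a write by proc; return the processors whose copies are invalidated."""
        e = self._entry(block)
        invalidated = set(e.sharers) - {proc}
        if e.owner is not None and e.owner != proc:
            invalidated.add(e.owner)
        # Single-writer rule: after a write, only proc holds the block.
        e.sharers.clear()
        e.owner = proc
        return invalidated


if __name__ == "__main__":
    d = Directory()
    d.read(0, 0x40)
    d.read(1, 0x40)
    print(sorted(d.write(2, 0x40)))  # processors 0 and 1 lose their copies
```

In this sketch a write sends messages only to the processors recorded as holding the block, rather than broadcasting to every processor over a shared bus, which is the property that lets directory based schemes scale to larger systems.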