1. Field of the Invention
This invention relates to a multiprocessor system having a plurality of processors connected to a shared bus and a shared memory through respective private caches. More particularly, this invention relates to such a multiprocessor system adopting a so-called snoopy cache architecture wherein each private cache is provided with a controller which monitors signals on the shared bus and manipulates data in the private cache, for example, for maintaining data consistency among the private caches.
2. Related Art
While there are a number of types of conventional multiprocessor systems, tightly coupled multiprocessor systems are increasingly coming into practical use.
In one such type of tightly coupled multiprocessor system, a plurality of processors read from or write to a shared memory connected to a shared bus. Without private caches, each processor must perform every read/write access to the shared memory through the shared bus, so the shared bus and the shared memory are frequently occupied. In such an environment, an increase in the number of processors cannot improve the performance of the system beyond a certain limit.
An approach has been proposed wherein each processor has a private cache in which it keeps a partial copy of data stored in the shared memory. Each processor performs read/write access to the data within its private cache, and thereby the shared bus and memory are not used as frequently. This approach is commonly referred to as a multi-cache system. It causes a problem, however, in that when each processor modifies shared data in its cache without regard to the other sharing processors, the sharing processors may, at any instant in time, have different data at a given address. Means for maintaining consistency of data at a given address in different caches are accordingly needed. Hereinafter, "the consistency of data" means that every processor sees the same data at a given address.
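The inconsistency described above can be illustrated with a minimal sketch. The model below is hypothetical and is not taken from any cited system: `PrivateCache`, the address `0x100`, and the stored values are assumptions chosen for illustration, and the caches deliberately have no consistency mechanism.

```python
# Hypothetical model: one shared memory and two private caches
# with no mechanism for keeping their copies consistent.
shared_memory = {0x100: 1}


class PrivateCache:
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}  # address -> locally cached value

    def read(self, addr):
        if addr not in self.lines:
            # Cache miss: fetch the data from shared memory over the bus.
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]  # cache hit: served locally

    def write(self, addr, value):
        self.read(addr)           # make sure a local copy exists
        self.lines[addr] = value  # modify only the local copy


cache_a = PrivateCache(shared_memory)
cache_b = PrivateCache(shared_memory)

cache_a.read(0x100)       # both processors cache the same address
cache_b.read(0x100)
cache_a.write(0x100, 42)  # processor A modifies its private copy

value_a = cache_a.read(0x100)
value_b = cache_b.read(0x100)
print(value_a, value_b)   # the two caches now disagree
```

In this sketch the second cache keeps returning the old value because nothing informs it of the modification, which is the situation a consistency mechanism has to prevent.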
One method for ensuring the consistency of data is the snoopy cache technique. The snoopy cache technique maintains the consistency of data among caches by having each processor's cache controller monitor the shared bus. That is, when a processor modifies shared data in its cache (i.e., data that is also held by one or more other processors), it broadcasts on the shared bus the address of the modified data and information about how the data was modified. The cache controllers of the other processors observe that information and update or invalidate the data in their caches to maintain the consistency of data.
Conventional snoopy cache techniques typically adopt one of two conventional protocols to handle modification of shared data. According to the first conventional protocol (the invalidate type), upon modification of shared data at a cache, the copies at the other caches are invalidated. According to the second conventional protocol (the update type), upon modification of shared data at a cache, the copies at the other caches are updated. For example, Dragon ("The Dragon Processor", Proceedings of the Second International Conference on ASPLOS, 1987, pp. 65-69, R. R. Atkinson et al.) of the Xerox Corporation (USA), and FireFly ("Firefly: a Multiprocessor Workstation", Proceedings of the Second International Conference on ASPLOS, 1987, pp. 164-172, C. P. Thacker et al.) of Digital Equipment Corporation (USA), use the update type. On the other hand, SPUR ("Design Decisions in SPUR", IEEE Computer, pp. 8-22, November 1986, M. Hill et al.) of the University of California uses the invalidate type.
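The two protocol types can be contrasted with a short sketch. It is an illustrative model under simplifying assumptions, not the implementation of Dragon, FireFly, or SPUR: the names `SharedBus`, `SnoopyCache`, and `Protocol` are hypothetical, every write is written through to memory, and every write is broadcast regardless of whether the data is actually shared.

```python
from enum import Enum


class Protocol(Enum):
    INVALIDATE = "invalidate"
    UPDATE = "update"


class SharedBus:
    """Hypothetical bus that lets every attached controller see each write."""

    def __init__(self):
        self.controllers = []

    def attach(self, controller):
        self.controllers.append(controller)

    def broadcast_write(self, sender, addr, value):
        for controller in self.controllers:
            if controller is not sender:
                controller.snoop_write(addr, value)


class SnoopyCache:
    def __init__(self, bus, memory, protocol):
        self.bus = bus
        self.memory = memory
        self.protocol = protocol
        self.lines = {}  # address -> cached value
        bus.attach(self)

    def read(self, addr):
        if addr not in self.lines:  # miss: fetch from shared memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory[addr] = value  # write-through (simplifying assumption)
        self.bus.broadcast_write(self, addr, value)

    def snoop_write(self, addr, value):
        """Called when another cache's write is observed on the bus."""
        if addr not in self.lines:
            return  # no local copy, nothing to do
        if self.protocol is Protocol.INVALIDATE:
            del self.lines[addr]      # drop the stale copy
        else:
            self.lines[addr] = value  # refresh the copy in place


def run(protocol):
    memory = {0x100: 1}
    bus = SharedBus()
    cache_a = SnoopyCache(bus, memory, protocol)
    cache_b = SnoopyCache(bus, memory, protocol)
    cache_a.read(0x100)
    cache_b.read(0x100)
    cache_a.write(0x100, 42)
    still_cached = 0x100 in cache_b.lines
    return still_cached, cache_b.read(0x100)


invalidate_result = run(Protocol.INVALIDATE)
update_result = run(Protocol.UPDATE)
print(invalidate_result, update_result)
```

In both runs the second cache ends up reading the new value. The difference is whether its copy survives the write: under the invalidate type the copy is discarded and the next read misses, while under the update type the copy is refreshed and the next read hits.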
The above two types maintain the consistency of data among a plurality of caches equally well. That is, updating and invalidating data have the same effect with respect to consistency. However, each type has its own merits and demerits.
The update type is suitable for cases where the data is tightly shared by the processors (that is, where the processors access the shared data almost equally). The invalidate type is not suitable for those cases, because each time a processor modifies a shared data area, the copies in the caches of the other sharing processors are invalidated. Thus, a subsequent read/write access to that area by the other sharing processors inevitably causes a cache miss and requires access to the shared bus. In this regard, the update type is advantageous since the copies in the sharing caches are updated, thereby enabling the processors to read the data area without accessing the shared bus. Generally speaking, the update type works well for buffers in a parallel program of the producer and consumer model, and for semaphores or similar data used to synchronize processors.
The invalidate type, on the other hand, is preferably applied to shared data which is exclusively used by one processor or to shared data which is not accessed frequently by the other processors. Paging or process migration may cause data exclusively held by one processor to be considered shared while it should be kept as exclusive. This situation places unnecessary shared data in the system and degrades performance. The invalidate type is effective in that situation.
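The trade-off between the two types can be made concrete by counting bus transactions for a single shared address. The following is a rough sketch under stated assumptions, not a measurement of any real system: the function `bus_transactions`, the two access traces, and the cost model (one transaction per cache miss, one per invalidate or update broadcast, and no cache replacement) are all hypothetical.

```python
def bus_transactions(protocol, accesses):
    """Count bus transactions for one shared address.

    accesses is a sequence of (processor_id, op) pairs, op being 'r' or 'w'.
    Assumed cost model: a miss costs one transaction, and a write to data
    that other caches also hold costs one more (invalidate or update).
    """
    holders = set()  # processors whose cache holds a valid copy
    count = 0
    for proc, op in accesses:
        if proc not in holders:
            count += 1  # miss: fetch the data over the shared bus
            holders.add(proc)
        if op == "w" and len(holders) > 1:
            count += 1  # other copies exist: notify them over the bus
            if protocol == "invalidate":
                holders = {proc}  # other copies become invalid
            # Under "update" the other copies stay valid and current.
    return count


# Tightly shared data: processor 0 produces, processor 1 consumes, 100 times.
producer_consumer = [(0, "w"), (1, "r")] * 100

# Effectively private data: processor 1 touched it once (for example before
# a process migrated), then processor 0 alone writes it 100 times.
migrated = [(1, "r")] + [(0, "w")] * 100

for name, trace in (("producer/consumer", producer_consumer),
                    ("migrated", migrated)):
    for protocol in ("invalidate", "update"):
        print(name, protocol, bus_transactions(protocol, trace))
```

Under these assumptions the update type needs roughly half the bus transactions of the invalidate type for the producer/consumer trace, while for the migrated trace the invalidate type needs only a handful and the update type pays for every write. Neither type is better on both traces.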
In light of the strengths and weaknesses of each, neither of the above types of protocols can be preferred in a straightforward manner. This is because the performance of the system under a given protocol depends on the characteristics of the program to be executed and the operational status of the individual processors. Thus far, the use of the above-described conventional protocols has not enabled efficient operation in every data access situation.
The above mentioned Dragon, FireFly and SPUR each provide only one type of protocol, and consequently achieve only degraded performance in some situations. A prototype machine, TOP-1, of the International Business Machines Corporation, can selectively switch between the above mentioned types of protocols by means of software. This, however, still does not resolve the problem of how and when to switch. The resolution of that problem is a key factor in achieving full performance enhancement.