1. Field of the Invention
The present invention relates to cache coherence control in a tightly coupled multi-processor system in which a plurality of processors and memories are coupled by data and address buses.
2. Description of the Background Art
A tightly coupled multi-processor system, in which a plurality of processors and memories are coupled by data and address buses, is known to have a superior expansion capability. However, such a system has the problem that, as the number of processors increases, improving the system throughput becomes difficult because of the increasing bus traffic.
In order to improve the system throughput by reducing the bus traffic, it is therefore necessary to reduce the memory access frequency and to improve the bus throughput.
As a method of reducing the memory access frequency, it has been proposed to provide a cache memory for each processor and to control the cache memories by copy-back type control.
An example of such a tightly coupled multi-processor system is shown in FIG. 1, where the system comprises processors 1 and 2 including CPUs 3 and 4, respectively, and cache memories 5 and 6, respectively; a shared memory 7; and a bus 8 for connecting the processors 1 and 2 and the shared memory 7.
In this type of system, which incorporates the cache memories 5 and 6, there is a need to maintain data coherence among the cache memories 5 and 6 and the shared memory 7, and this is furnished in the form of a cache coherence protocol.
Conventionally, various proposals have been made for such a cache coherence protocol using copy-back type cache memories. The conventional cache coherence protocols are summarized by James Archibald and Jean-Loup Baer in "Cache Coherence Protocols: Evaluation Using a Multiprocessor Simulation Model", ACM Transactions on Computer Systems, Vol. 4, No. 4, November 1986, pp. 273-298.
The Write-Once protocol is one example of a conventional cache coherence protocol. A state transition diagram for the Write-Once protocol is shown in FIG. 2, where updating of the cache memory data takes place in units of a data block, and I, S, P, and P' represent the four states of a cache memory data block described in detail below. In FIG. 2, a solid line indicates a processor-based transition, i.e., a transition triggered by an access made by the respective CPU and the state of the respective cache memory data block, while a dashed line indicates a bus-induced transition, i.e., a transition triggered by an access through the bus due to an access made by another CPU and the state of another cache memory data block. In this Write-Once protocol, data coherence among the cache memories and the shared memory is maintained by managing the cache memory data using the following four block states:
(1) I: invalid;
(2) P: the data of the block do not exist in the cache memory of the other processor, and the data coincide with those in the shared memory;
(3) P': the data of the block do not exist in the cache memory of the other processor, and the data do not coincide with those in the shared memory; and
(4) S: the data of the block exist in the cache memory of the other processor, and the data coincide with those in the shared memory.
On the other hand, there are two types of bus control method:
(1) split transfer method: a method in which the bus is released between the data read command (address transfer) cycle and the data transfer cycle; and
(2) interlock transfer method: a method in which the bus is locked between the data read command (address transfer) cycle and the data transfer cycle.
Compared with the interlock transfer method, the split transfer method requires more complicated control but can achieve higher throughput, so that it is more suitable for the system bus in the tightly coupled multi-processor system.
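The throughput difference between the two bus control methods can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the function names `interlock_trace` and `split_trace`, the fixed memory latency in bus cycles, the rule that a ready data transfer takes priority over a new address transfer, and the premise that the shared memory can service overlapping read commands.

```python
def interlock_trace(requesters, latency):
    """Interlock transfer: the bus stays locked from each address
    cycle until the matching data cycle (hypothetical model)."""
    trace = []
    for r in requesters:
        trace.append(f"{r}:addr")
        trace.extend(["locked"] * latency)  # nobody else may use the bus
        trace.append(f"{r}:data")
    return trace


def split_trace(requesters, latency):
    """Split transfer: the bus is released after the address cycle,
    so other address cycles may use it while memory is busy."""
    trace = []
    waiting = list(requesters)  # processors that have not yet sent an address
    in_flight = []              # (cycle when data is ready, requester), FIFO
    cycle = 0
    while waiting or in_flight:
        if in_flight and in_flight[0][0] <= cycle:
            # Assumed priority: a ready data transfer goes first.
            _, r = in_flight.pop(0)
            trace.append(f"{r}:data")
        elif waiting:
            r = waiting.pop(0)
            trace.append(f"{r}:addr")
            in_flight.append((cycle + latency + 1, r))
        else:
            trace.append("idle")
        cycle += 1
    return trace
```

Under these assumptions, two read commands with a latency of two cycles occupy eight bus cycles with the interlock transfer method and five with the split transfer method, because the second address cycle is overlapped with the memory latency of the first.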
The conventional data coherence protocols such as the Write-Once protocol described above can operate perfectly well when the interlock transfer bus is used. However, when the split transfer bus is used, inconsistency arises in the conventional data coherence protocols.
Namely, when the split transfer bus is used along with the conventional data coherence protocols, the situation depicted in FIG. 3 may arise as follows. First, at (1), the data block required by the CPU of processor #i does not exist in the cache memory of processor #i, so that processor #i issues a block read command to the shared memory, i.e., an address transfer from processor #i to the system bus takes place. Then, at (2), processor #i changes the state of its cache memory from I to S, assuming that the block exists in the cache memory of another processor #j. After that, at (3), processor #j makes a write access to the same address. Here, because the state of the cache memory of processor #j is S, processor #j transfers the write address and data to the bus. Next, at (4), processor #i, which monitors the bus, recognizes the write address from processor #j, so that the state of the cache memory of processor #i is changed from S back to I, according to the cache coherence protocol. Finally, at (5), the data block from the shared memory is transferred to processor #i. Because the state of the cache memory of processor #i is I, however, the transferred data block cannot be registered into the cache memory of processor #i, and an inconsistency in the cache management occurs.
In this manner, the conventional data coherence protocol generates an inconsistency when access to a certain address from another processor is generated during a period between the address transfer cycle and the data transfer cycle related to that certain address.
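The sequence of FIG. 3 described above can be traced with a minimal sketch. It models only the transitions named in steps (1) to (5); it is not a full implementation of the Write-Once protocol. The names `State`, `fig3_scenario`, and `read_pending` are hypothetical, and the state of processor #j after its write is deliberately left unmodeled because the description above does not state it.

```python
from enum import Enum


class State(Enum):
    """The four block states listed above."""
    I = "invalid"
    P = "not in the other cache, coincides with shared memory"
    P_PRIME = "not in the other cache, differs from shared memory"
    S = "in the other cache, coincides with shared memory"


def fig3_scenario():
    """Replay steps (1)-(5) on a split transfer bus.

    Returns the state history of the block in processor #i and
    whether the arriving data block could be registered.
    """
    state_i = State.I   # block is absent from the cache of processor #i
    state_j = State.S   # processor #j holds the block in state S
    history = [state_i]

    # (1) Processor #i sends the block read command (address transfer);
    #     the split transfer bus is then released.
    read_pending = True

    # (2) Processor #i changes its block state from I to S.
    state_i = State.S
    history.append(state_i)

    # (3) Processor #j writes to the same address; because its state is S,
    #     the write address and data are transferred to the bus.
    j_write_on_bus = state_j is State.S

    # (4) Processor #i monitors the bus, sees the write, and goes S -> I.
    if j_write_on_bus and state_i is State.S:
        state_i = State.I
    history.append(state_i)

    # (5) The data block arrives, but a block in state I cannot be registered.
    registered = read_pending and state_i is not State.I
    return history, registered
```

Calling `fig3_scenario()` yields the state history I, S, I for processor #i and reports that the block was not registered, which is the inconsistency described above: the read command is still outstanding when the bus-induced invalidation arrives.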