1. Field of the Invention
The present invention relates to a multi-processor information processing system constituted by a plurality of main memories and a plurality of processors (central processing units (CPUs), microprocessor units (MPUs), direct memory access (DMA) controllers, and the like) which are connected to the plurality of main memories through a crossbar switching network so as to share them, and, more particularly, to a cache control system for maintaining data consistency (or coherence) between the caches of the processors and the main memories.
2. Description of the Related Art
In an information processing system adopting a shared main memory type multi-processor scheme, the processors are normally connected to a main memory through a single system bus.
The processing speed of each processor is normally higher than the read/write access speed of the main memory. For this reason, in order to compensate for the speed difference and to increase the speed for accessing desired data, each processor often incorporates a cache (cache memory).
Although the cache is smaller than the main memory, it can operate at a speed equal to or only slightly lower than that of the processor. Utilizing the cache can therefore prevent the deterioration of system performance that would otherwise be caused by the overhead occurring when the processor accesses the low-speed main memory.
When the number of processors is increased, or when the performance of each processor is improved and the frequency of accessing the main memory is increased, the transfer capability of the bus becomes a bottleneck for the system performance. On the other hand, if the low access speed of the main memories poses a bottleneck more serious than that caused by the bus, the main memories may be distributed and connected to the system bus, thus eliminating the bottleneck caused by the main memories. Even in this case, however, the bottleneck caused by the bus itself remains.
As a method of solving this problem, the processors and the main memories may be connected through a switching network, such as a crossbar, instead of a single bus.
In this arrangement, communications between the plurality of main memories and the plurality of processors can be processed in parallel unless requested addresses of the main memories overlap each other, and the performance of the processors can be sufficiently utilized. For this reason, the bus bottleneck can be eliminated, and a high system throughput can be realized.
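The parallelism described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a conflict arises when two requests target the same main memory module and that conflicting requests are served in successive cycles; the function name `schedule_crossbar` and the parameters `num_modules` and `module_size` are hypothetical and do not appear in the arrangement described here.

```python
from collections import defaultdict


def schedule_crossbar(requests, num_modules, module_size):
    """Group (processor_id, address) requests into switching cycles.

    Assumption: each main memory module serves at most one request per
    cycle, so requests to different modules proceed in parallel while
    requests to the same module are serialized.
    """
    queues = defaultdict(list)
    for proc_id, address in requests:
        module = address // module_size  # hypothetical address-to-module mapping
        if not 0 <= module < num_modules:
            raise ValueError(f"address {address} is outside the main memories")
        queues[module].append((proc_id, address))

    # The longest per-module queue determines how many cycles are needed.
    depth = max((len(q) for q in queues.values()), default=0)
    cycles = []
    for i in range(depth):
        # In cycle i, every module with a pending request serves exactly one.
        cycles.append([q[i] for _, q in sorted(queues.items()) if i < len(q)])
    return cycles


# Three processors addressing three different modules: one parallel cycle.
parallel = schedule_crossbar([(0, 0), (1, 1024), (2, 2048)], 4, 1024)

# Two processors addressing the same module: two serialized cycles.
conflicting = schedule_crossbar([(0, 0), (1, 8)], 4, 1024)
```

In this sketch the number of cycles equals the length of the longest per-module queue, which is why throughput stays high as long as the requested memories do not overlap.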
However, in order to realize an arrangement for performing the above-mentioned parallel processing, a difficult problem still remains unsolved, namely, how to maintain consistency (coherence) between the contents of the caches and the contents of the corresponding main memories. In order to attain cache consistency, each processor must monitor the main memory updating operations of the other processors. However, it is not easy to check all the main memory updating operations when they are performed in parallel, as in the above-mentioned arrangement.
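The consistency problem can be made concrete with a short sketch. It assumes a write-through cache and models the shared main memory as a dictionary; the class `CachedProcessor` and its methods are hypothetical names introduced only for this illustration. Because one processor never observes the other's update, it keeps returning an outdated value.

```python
class CachedProcessor:
    """A processor with a private cache and no view of other processors' writes."""

    def __init__(self, memory):
        self.memory = memory  # shared main memory: address -> value
        self.cache = {}       # private cache: address -> value

    def read(self, address):
        # On a miss, fetch from main memory; on a hit, trust the cached copy.
        if address not in self.cache:
            self.cache[address] = self.memory[address]
        return self.cache[address]

    def write(self, address, value):
        # Write-through (an assumption): update own cache and main memory.
        self.cache[address] = value
        self.memory[address] = value


memory = {0x10: 1}
p0 = CachedProcessor(memory)
p1 = CachedProcessor(memory)

first = p1.read(0x10)   # P1 caches the value 1
p0.write(0x10, 2)       # P0 updates main memory; P1 is not informed
stale = p1.read(0x10)   # P1 still reads its cached copy
```

After these steps main memory holds 2 while `stale` is still 1, which is exactly the mismatch that monitoring of main memory updating operations is meant to prevent.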
One method of monitoring main memory updating operations is to connect all the processors to one another, by means separate from the crossbar connection, so that they exchange main memory updating information with each other. With this arrangement, the processors can update their caches. However, this method considerably increases the connection hardware resources, and is difficult to apply.
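A minimal sketch of this monitoring method follows. It assumes one dedicated point-to-point link per pair of processors and a write-through cache; the class `LinkedProcessor` and the helper `connect_all` are hypothetical names used only for illustration. The sketch shows both that the caches stay consistent and that the number of additional links grows quickly with the number of processors.

```python
class LinkedProcessor:
    """A processor that forwards every main memory update to all its peers."""

    def __init__(self, memory):
        self.memory = memory  # shared main memory: address -> value
        self.cache = {}       # private cache: address -> value
        self.peers = []       # processors reachable over dedicated links

    def read(self, address):
        if address not in self.cache:
            self.cache[address] = self.memory[address]
        return self.cache[address]

    def write(self, address, value):
        # Write-through (an assumption), then notify every peer.
        self.cache[address] = value
        self.memory[address] = value
        for peer in self.peers:
            peer.on_remote_update(address, value)

    def on_remote_update(self, address, value):
        # Only a copy that is already cached needs to be refreshed.
        if address in self.cache:
            self.cache[address] = value


def connect_all(processors):
    """Link every pair of processors and return the number of links added."""
    links = 0
    for i, p in enumerate(processors):
        for q in processors[i + 1:]:
            p.peers.append(q)
            q.peers.append(p)
            links += 1
    return links


memory = {0x10: 1}
procs = [LinkedProcessor(memory) for _ in range(4)]
links = connect_all(procs)      # 4 processors need 6 pairwise links

procs[1].read(0x10)             # processor 1 caches the value 1
procs[0].write(0x10, 2)         # the update is forwarded to all peers
value = procs[1].read(0x10)     # processor 1 now sees the new value
```

Under the pairwise-link assumption, n processors require n(n-1)/2 links in addition to the crossbar, which illustrates why this approach considerably increases the connection hardware.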
As described above, in a multi-processor information processing system constituted by a plurality of main memories and a plurality of processors, which are connected to each other through a crossbar switching network, in order to attain cache consistency, all the processors must be connected, by means in addition to crossbar connection, to exchange main memory updating information with each other, so that the processors can update their caches. This results in a considerable increase in the number of connection hardware resources.
For this reason, although a crossbar connection system basically has the advantage of sufficiently utilizing the performance of a multi-processor system, it is difficult to put into practical use since it cannot easily solve the problem of cache consistency.