1. Field of the Invention
The present invention relates to a hierarchical cache memory apparatus suitably assembled in a multiprocessor system including a plurality of processors and memory devices.
2. Description of the Related Art
The following references for further details of the related arts are available:
(1) UCB/CSD TR #84/199, pp. 1-89, "Design and Implementation of An Integrated Snooping Data Cache," Gaetano Borriello et al., September 1984.
(2) The 14th I.S.C.A., pp. 244-252, "Hierarchical Cache/Bus Architecture for Shared Memory Multiprocessors," Andrew W. Wilson Jr., Jun. 2, 1987.
(3) U.S. Pat. No. 4,755,930, Jul. 5, 1988, Andrew W. Wilson Jr., Steven J. Frank.
Recently, a variety of tightly coupled multiprocessor systems, in each of which a main memory device is shared by a plurality of processors, have been developed. A multiprocessor system of this type is normally constituted by connecting the plurality of processors and the main memory device through a common bus. When a common bus is used, however, all communication among the processors and the main memory device is performed through the common bus. Therefore, competition for use of the bus caused by accesses to the main memory device poses a serious problem, and a further improvement of the system performance cannot be expected.
In view of the above situation, it has been proposed to arrange a cache memory device between the plurality of processors and the main memory device. The cache memory device absorbs the difference between the processing speeds of the processors and the main memory device, so that the apparent access speed to the main memory device becomes high. When a cache memory device of this type is used, low-speed access with a wide data width is executed between the cache memory device and the main memory device, while high-speed access with a data width matched to the processors is performed between the cache memory device and the plurality of processors. As a result, the multiprocessor system of this type can be operated without impairing the original performance of the processors.
Furthermore, data (commands, operands, and the like) accessed by a processor are temporarily stored in the cache memory device. When the same data is accessed again, it can be read at high speed from the cache memory device rather than from the main memory device. More specifically, with this system, the cache memory device can consequently decrease the number of accesses to the main memory device.
When a cache memory device of this type is adopted in a tightly coupled multiprocessor system, and assuming that a single cache memory device is shared by the plurality of processors, the amount of data transfer between the cache memory device and the main memory device can be decreased. However, the number of accesses produced when the plurality of processors access the cache memory device is the same as in a system without the cache memory device. Therefore, the system performance cannot be further improved unless the memory size of the cache memory device is increased, the bus cycle time between the processors and the cache memory device is shortened, or the bus width is expanded.
When a cache memory device is arranged for each processor, the amount of data transfer between the plurality of cache memory devices and the main memory device can be decreased. In addition, since access between the processors and the cache memory devices is executed in a one-to-one correspondence, the competition for use of the bus that arises when one cache memory device is shared by a plurality of processors never occurs. However, since the cache memory devices are arranged in correspondence with the processors, a control system must be introduced in order to assure consistency among these cache memory devices.
The "problem of consistency" among the cache memory devices arises because the cache memory devices corresponding to the processors hold copies of the content at specific memory addresses of the main memory device. When the cache memory devices are used, it is very important that these contents copied from the main memory device remain the same.
For example, when a certain processor updates the copied content of a specific memory address of the main memory device held in its corresponding cache memory device, the copied content of the same memory address held in the other cache memory devices must be similarly updated. Therefore, when the copied content held in a certain cache memory device is updated, processing for updating the content at the corresponding memory address of the main memory device and the corresponding copied contents held in all the other cache memory devices must be executed. Alternatively, the copied content held in a certain cache memory device and the content at the corresponding memory address of the main memory device are updated, while the corresponding copied content in all the other cache memory devices is invalidated (erased or deleted).
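The two alternatives above (updating every other copy, or invalidating every other copy) can be illustrated with a small sketch. All class and function names below are purely illustrative and are not part of the invention:

```python
# Illustrative sketch of the two alternatives: update every copy,
# or invalidate the copies held in all other cache memory devices.

class Cache:
    def __init__(self):
        self.lines = {}  # memory address -> copied content

def write_with_update(caches, memory, writer, addr, value):
    # Update main memory and refresh every other cache holding a copy.
    writer.lines[addr] = value
    memory[addr] = value
    for cache in caches:
        if cache is not writer and addr in cache.lines:
            cache.lines[addr] = value      # refresh the stale copy

def write_with_invalidate(caches, memory, writer, addr, value):
    # Update main memory, then erase (invalidate) the other copies.
    writer.lines[addr] = value
    memory[addr] = value
    for cache in caches:
        if cache is not writer:
            cache.lines.pop(addr, None)    # invalidate the stale copy

a, b = Cache(), Cache()
memory = {}
a.lines[5] = b.lines[5] = "old"
write_with_update([a, b], memory, a, 5, "new")
assert b.lines[5] == "new"                 # b's copy was updated
write_with_invalidate([a, b], memory, a, 5, "newer")
assert 5 not in b.lines                    # b's copy was invalidated
```

Either alternative keeps the main memory device and the writing cache consistent; they differ only in what happens to the copies held elsewhere.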
Actual control systems of cache memory devices are roughly classified into two types. One is called a store-through or write-through system. In this system, simultaneously with write access to a cache memory device, the same data is written in the main memory device. The other is called a store-in, write-in, write-back, or copy-back system. In this system, data is written only in the cache memory device, and write access to the main memory device is executed when the corresponding cache block is to be replaced.
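The difference between the two write policies can be sketched as follows; the classes and the capacity parameter are hypothetical illustrations, not part of the invention:

```python
# Minimal sketch of the store-through (write-through) and store-in
# (write-back) policies described above. All names are illustrative.

class Memory:
    def __init__(self):
        self.data = {}
        self.writes = 0  # counts bus traffic to main memory

    def write(self, addr, value):
        self.data[addr] = value
        self.writes += 1

class WriteThroughCache:
    """Store-through: every write also goes to main memory."""
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory.write(addr, value)   # simultaneous main-memory write

class WriteBackCache:
    """Store-in: writes stay in the cache; main memory is updated
    only when a dirty cache block is replaced."""
    def __init__(self, memory, capacity=2):
        self.memory = memory
        self.capacity = capacity
        self.lines = {}                  # addr -> (value, dirty flag)

    def write(self, addr, value):
        if addr not in self.lines and len(self.lines) >= self.capacity:
            # Replace a block; write it back only if it is dirty.
            victim, (v, dirty) = next(iter(self.lines.items()))
            if dirty:
                self.memory.write(victim, v)
            del self.lines[victim]
        self.lines[addr] = (value, True)

mem_wt, mem_wb = Memory(), Memory()
wt, wb = WriteThroughCache(mem_wt), WriteBackCache(mem_wb)
for addr in (0, 0, 0, 1):
    wt.write(addr, addr)
    wb.write(addr, addr)
assert mem_wt.writes == 4   # store-through: one memory write per access
assert mem_wb.writes == 0   # store-in: no replacement has occurred yet
```

The sketch shows why the store-in system generates less main-memory traffic: repeated writes to the same block reach main memory at most once, at replacement time.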
A single processor system preferably employs the store-in system to decrease the amount of activity on the common bus due to main memory access, since the following fact is theoretically apparent: as the memory size of a cache memory device approaches infinity, the number of cache blocks required to be replaced decreases, and therefore the amount of activity on the common bus due to the main memory device approaches zero.
In contrast to this, when a multiprocessor system in which a main memory device is shared by a plurality of cache memory devices employs the store-through system, every time a processor rewrites a copied content of the main memory device stored in a certain cache memory device, the same data must be written in the main memory device through the common bus connecting the cache memory devices and the main memory device. Furthermore, all the cache memory devices monitor activity on the common bus, and when data on the common bus concerns a specific memory address held in a cache memory device, that copied content must be invalidated. For the above-mentioned reasons, when a processor tries to read a copied content invalidated as a result of monitoring of the shared bus, the cache memory device must copy the same content from the main memory device again.
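The bus-monitoring (snooping) behavior described above can be sketched as follows, assuming a hypothetical bus and cache interface not defined in the patent:

```python
# Sketch of bus snooping with invalidation under a store-through policy.
# All names are illustrative; the patent does not define this interface.

class Bus:
    def __init__(self):
        self.snoopers = []

    def broadcast_write(self, writer, addr):
        # Every cache monitors activity on the common bus ...
        for cache in self.snoopers:
            if cache is not writer:
                cache.snoop(addr)

class SnoopingCache:
    def __init__(self, bus, memory):
        self.bus = bus
        self.memory = memory
        self.lines = {}
        bus.snoopers.append(self)

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory[addr] = value              # store-through write
        self.bus.broadcast_write(self, addr)

    def snoop(self, addr):
        # ... and invalidates its copy of an address another cache wrote.
        self.lines.pop(addr, None)

    def read(self, addr):
        if addr not in self.lines:             # invalidated or never cached
            self.lines[addr] = self.memory[addr]  # re-copy from main memory
        return self.lines[addr]

bus, memory = Bus(), {}
c1, c2 = SnoopingCache(bus, memory), SnoopingCache(bus, memory)
c1.write(10, "A")
assert c2.read(10) == "A"   # c2 copies the content from main memory
c1.write(10, "B")           # c2's copy is invalidated by snooping
assert c2.read(10) == "B"   # c2 must re-copy from main memory
```

The final read illustrates the extra main-memory traffic the passage describes: every write by one cache forces the other caches to discard and later re-fetch their copies.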
When a multiprocessor system in which a main memory device is shared by a plurality of cache memory devices similarly employs the store-in system, the number of times of access to the main memory device can be smaller than that in the store-through system, as described in related references. However, it is impossible to maintain consistency of storage contents among a plurality of cache memory devices by the same control system as that employed by a single processor system.
Recently, in order to efficiently connect a larger number of processors, to decrease the traffic volume on a shared bus, and to minimize the speed difference between processors and a main memory device, a plurality of cache memory devices have been hierarchically arranged to improve the system performance.
In consideration of the above situation, even when a plurality of cache memory devices are hierarchically arranged, the store-through system must be selected instead of the more efficient store-in system in order to keep consistency of storage contents among the cache memory devices. In other words, when the store-through system is employed, consistency among the cache memory devices can be maintained. However, in this case, every time write access to one of the plurality of cache memory devices is executed, the same write access is executed for the main memory device, and extra read access to the main memory device caused by invalidation frequently occurs. As a result, the information processing efficiency of the system is inevitably impaired.
As described above, in the conventional system, when a multiprocessor system is constituted by using hierarchical cache memory devices, there is no choice but to select the store-through system in order to maintain consistency of storage contents among the cache memory devices. When the store-through system is employed, the amount of activity on the shared bus is increased, and it is difficult to sufficiently utilize the original performance of the processors.