1. Field of the Invention
The present invention relates to a cache memory system for improving a processing speed in a computer processing system.
2. Description of the Background Art
A conventional cache memory system typically has a configuration shown in FIG. 1, which has only one cache memory unit including a cache tag memory unit 125 and a cache data memory unit 126 with respect to a plurality (two in this case) of processing units including an integer unit (IU) 121 having an associated register 123 and a floating point unit (FPU) 122 having an associated register 124. This configuration of FIG. 1 is implemented on a single chip 127, so that it is quite simple and efficient for ordinary processing operations. However, when the processing units are capable of executing a plurality of memory access instructions simultaneously, the cache memory unit must have a plurality of cache memory ports, and this in turn increases an area of the cache memory unit considerably.
On the other hand, in a case of using multiple processing units in which the cache memory unit cannot be accommodated on a single chip as shown in FIG. 2, it is necessary to realize the cache memory unit in a form of a cache tag memory unit 130 and a cache data memory unit 131 which are separately implemented on different chips (chip 3 and chip 4) provided along chip 1 and chip 2 on which an integer unit (IU) 128 and a floating point unit (FPU) 129 are separately implemented. However, such a manner of implementation of multiple processing units in turn gives rise to a problem of a lowering of a processing speed in the processing units.
This problem concerning the lowering of the processing speed itself can be resolved by using a configuration as shown in FIG. 3, where an integer unit 141 has its own cache memory unit comprising an IU cache tag memory unit 143 and an IU cache data memory unit 144 which are implemented on an IU chip 148 together with the IU 141, and a floating point unit 142 has its own cache memory unit comprising an FPU cache tag memory unit 145 and an FPU cache data memory unit 146 which are implemented on an FPU chip 149 together with the FPU 142, while a main memory 147 is commonly provided with respect to the IU chip 148 and the FPU chip 149. However, in such a configuration, it is quite complicated to maintain the consistency between the two cache memory units. In addition, in such a configuration, it becomes necessary to hand over the full address of the memory access instruction between these two cache memory units, and this in turn requires a large number of connection wirings between the IU chip 148 and the FPU chip 149.
Thus, a conventional single chip cache memory system has been associated with a problem of an increase of the cache memory ports to be provided on the cache memory system, especially when the cache memory unit is associated with a relatively large scale processing system capable of simultaneous execution of a plurality of instructions such as a super scalar system, whereas a conventional multi-chip cache memory system has been associated with either a problem of a lowering of a processing speed or a problem of complicated consistency control and large inter-chip connections.
Now, on the other hand, in a conventional cache memory system having only one processing unit, as shown in the timing chart of FIG. 4, the operation of the processing unit (CPU) is interrupted whenever a cache miss occurs, in order to carry out a refilling operation for refilling the missing data into the cache data memory unit, and the operation of the CPU is not resumed until such a refilling operation is completed. Here, because there are many cases which subsequently require not just the missing data alone but also the other data at addresses close to the address of the missing data, the refilling operation is usually executed over a plurality of clock cycles in order to store a plurality of consecutive data including the missing data into the cache data memory unit.
A conventional cache memory system using such a refilling operation typically has a configuration shown in FIG. 5, which comprises: a CPU 151, a first cache tag memory unit 152, a first cache data memory unit 153, a refilling control unit 154, and a second cache memory unit 155. In this configuration of FIG. 5, in reading the data from the cache memory system, the CPU 151 makes accesses to the first cache tag memory unit 152 and the first cache data memory unit 153 by specifying the desired address of the data to be read out. In a case of a cache hit at the first cache tag memory unit 152 and the first cache data memory unit 153, the desired data at the specified address can be read out from the first cache data memory unit 153 to the CPU 151. On the other hand, in a case of a cache miss at the first cache tag memory unit 152 and the first cache data memory unit 153, the refilling control unit 154 makes an access to the second cache memory unit 155 in order to execute the refilling of a series of consecutive data starting from the missing data into the first cache data memory unit 153.
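The read path described above can be illustrated with a minimal sketch. It is not the circuit of FIG. 5: the direct-mapped organization, the sizes `LINE_WORDS` and `NUM_LINES`, the class name `FirstCache`, and the use of a Python list as the second cache memory unit are all assumptions made for illustration. The sketch also refills an aligned block containing the missing data, whereas the text describes a series starting from the missing data.

```python
LINE_WORDS = 4   # assumed number of consecutive data refilled per miss
NUM_LINES = 8    # assumed number of entries in the first cache


class FirstCache:
    """Hypothetical direct-mapped first cache with separate tag and data memories."""

    def __init__(self, second_level):
        self.tags = [None] * NUM_LINES                             # cache tag memory
        self.data = [[0] * LINE_WORDS for _ in range(NUM_LINES)]   # cache data memory
        self.second_level = second_level                           # stands in for unit 155
        self.hits = 0
        self.misses = 0

    def read(self, address):
        """Return the data at `address`, refilling the entry on a cache miss."""
        offset = address % LINE_WORDS
        line_no = address // LINE_WORDS
        index = line_no % NUM_LINES
        tag = line_no // NUM_LINES
        if self.tags[index] == tag:
            self.hits += 1                      # cache hit: data is read directly
        else:
            self.misses += 1                    # cache miss: refill, then read
            self._refill(index, tag, line_no)
        return self.data[index][offset]

    def _refill(self, index, tag, line_no):
        """Copy LINE_WORDS consecutive data from the second level (role of unit 154)."""
        base = line_no * LINE_WORDS
        self.data[index] = list(self.second_level[base:base + LINE_WORDS])
        self.tags[index] = tag


# Usage: the second level holds 64 data, with value = address + 100.
second = list(range(100, 164))
cache = FirstCache(second)
values = [cache.read(a) for a in (0, 1, 2, 3, 4, 0)]
print(values, cache.hits, cache.misses)
```

In this run only addresses 0 and 4 miss. The reads of addresses 1 to 3 hit because the refill for address 0 already brought in the neighboring data, which is the locality argument made for multi-cycle refilling.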
This procedure for the refilling operation works effectively when the cache miss rate is relatively low. Namely, the effective memory access time Ne in such a conventional cache memory system, which represents the performance of the cache memory system, can be expressed as follows.

$$Ne = (1 - Km) \cdot Ca + Km \cdot Cc$$
where Km is the cache miss rate of the cache memory unit, which depends on the size of the cache memory unit and other factors, Ca is the memory access time in a case of a cache hit, and Cc is the time required for refilling the data into the cache memory unit. Expressing time as a number of clock cycles, Ca usually takes a value of about 1, Cc takes a value of about 10, and Km takes a value of about 1% to 5%, so that the resultant Ne takes a value of about 1.1 to 1.5. This value for Ne is sufficiently small compared with the value of about 2.0 to 3.0 for a case of not using a cache memory system, and this fact indicates the effectiveness of the cache memory system.
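As a numerical check of the figures quoted above, the following sketch evaluates the expression Ne = (1 - Km) Ca + Km Cc for the stated typical values. The function and parameter names are hypothetical. The defaults Ca = 1 and Cc = 10 clock cycles are the approximate values given in the text.

```python
def effective_access_time(miss_rate, hit_time=1.0, refill_time=10.0):
    """Ne = (1 - Km) * Ca + Km * Cc, with all times in clock cycles."""
    return (1.0 - miss_rate) * hit_time + miss_rate * refill_time


for km in (0.01, 0.05):
    print(f"Km = {km:.0%}: Ne = {effective_access_time(km):.2f}")
# Km = 1%: Ne = 1.09
# Km = 5%: Ne = 1.45
```

These results round to the range of about 1.1 to 1.5 stated above.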
However, there are cases in which the cache miss rate becomes quite large. In such a case, the effective memory access time can become so large that the cache memory unit is not just ineffective but can even be harmful to the overall operation of the cache memory system. The number of such cases associated with an excessively large effective memory access time has been increasing in recent years, as there is a tendency to limit the size of the cache memory unit in order to realize a micro-processor containing the cache memory unit within itself, and as there is an increasing number of cases in which the amount of data to be dealt with by the cache memory unit is very large.
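To give a rough sense of how large the miss rate must become before the cache is harmful, the sketch below solves (1 - Km) Ca + Km Cc = Cn for Km, where Cn is the access time without a cache. The symbol Cn and the function name are introduced here for illustration only. The values Ca = 1, Cc = 10, and Cn = 2.0 to 3.0 clock cycles are the approximate figures given earlier in the text, not measured data.

```python
def break_even_miss_rate(uncached_time, hit_time=1.0, refill_time=10.0):
    """Miss rate Km at which Ne equals the access time without a cache.

    Derived from (1 - Km) * Ca + Km * Cc = Cn, i.e. Km = (Cn - Ca) / (Cc - Ca).
    """
    if refill_time <= hit_time:
        raise ValueError("refill_time must exceed hit_time")
    return (uncached_time - hit_time) / (refill_time - hit_time)


for cn in (2.0, 3.0):
    print(f"Cn = {cn}: break-even Km = {break_even_miss_rate(cn):.1%}")
# Cn = 2.0: break-even Km = 11.1%
# Cn = 3.0: break-even Km = 22.2%
```

Under these assumed figures, a miss rate above roughly 11% to 22% would make the effective memory access time worse than that of a system without a cache.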
Thus, in a conventional cache memory system, the increase of the cache miss rate is a direct cause of the increase of the effective memory access time realizable in the cache memory system, such that the use of the cache memory system has conventionally been ineffective for a huge scale task, if not harmful, and this problem has been hampering the further improvement of a performance of a single chip micro-processor incorporating a cache memory system within itself.
Moreover, in a case of using the cache memory system for an instruction cache, even though it is possible to alleviate the increase of the cache miss rate to some extent by refilling a prescribed number of consecutive data collectively at a single refilling operation, there still remains the problem concerning the interruption of the operation of the processing unit during the execution of the refilling operation because the operation of the processing unit cannot be continued until the next instruction is fetched, and this problem inevitably limits the level of a performance of the cache memory system.