The present invention relates to a cache memory and, more particularly, to a cache memory suitable for incorporation in a microprocessor.
The cache memory is smaller than the main memory in storage capacity but faster in access. Therefore, the cache memory is located very close to the central processing unit (CPU) for the purpose of supplying data held in the main memory to the CPU. A variety of problems about the cache memory are discussed in the ACM, Computing Surveys, Vol. 14, No. 3, 1982, pp. 473-530 and "Computer Organization and Design: The Hardware/Software Interface," Morgan Kaufmann Publishers, pp. 454-527, 1994, for example. The main problems of the cache memory are access time and power consumption.
An example of a conventional cache memory of relatively small power consumption is shown in the NIKKEI Electronics, Feb. 14, 1994, pp. 79-92 (this cache memory is hereinafter referred to as the first prior-art technology). FIG. 2 shows a block diagram of the first prior-art technology.
As shown, the cache memory according to the first prior-art technology is a four-way set-associative cache memory. The set-associative memory is organized as follows. A plurality of areas in the cache memory, each of which can hold one block of data, are arranged in a plurality of rows and a plurality of columns. The areas in main memory (not shown) that can each hold a data block are divided into a plurality of columns corresponding to the above-mentioned plurality of columns. Block storage areas in the same column in main memory are associated with the block storage areas in the cache memory column corresponding to that same column.
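As a rough illustration of the column association described above, the following sketch lists the cache areas that may hold a given main-memory block. It assumes, as is common but not stated in the text, that a block's column is its block number modulo the number of columns. The function name and the sizes are hypothetical.

```python
def candidate_slots(block_number, num_columns, num_rows):
    """Return the (row, column) cache areas that may hold a main-memory block.

    Assumption (not from the text): the column is block_number % num_columns.
    """
    column = block_number % num_columns
    # Any row of the matching cache column may hold the block.
    return [(row, column) for row in range(num_rows)]


# Hypothetical sizes: 4 rows by 8 columns.
slots = candidate_slots(block_number=19, num_columns=8, num_rows=4)
print(slots)
```

Blocks whose numbers differ by a multiple of the column count compete for the same cache column under this assumption.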
To be more specific, as shown in FIG. 2, in the prior-art cache memory, an address array 200 is composed of four memory mats (also called ways) 206 (namely, way 0, way 1, way 2, and way 3), a decoder 205 commonly provided for these ways, and a precharge and equalize circuit 207, a sense amplifier 208, and a comparator 209 provided for each of the ways. Likewise, a data array 201 is composed of four memory mats 218 (namely, way 0, way 1, way 2, and way 3) and an address decoder 217, a precharge and equalize circuit 219, a sense amplifier 220, and an output buffer 221 provided for each of the ways.
The above-mentioned prior-art cache memory operates as follows. First, access to the four ways 206 is started according to a middle address Am entered from a line 204. Addresses registered in the way 0, the way 1, the way 2, and the way 3 are read and are outputted from the sense amplifiers 208 provided for respective ways (these addresses are also referred to as tags). In the comparator 209 provided for each way, an upper address Au entered from a line 210 is compared with the address read from each way. If a match is found, namely if the cache memory has hit, the comparator 209 asserts a corresponding hit line 211, 212, 213 or 214. Conversely, if a mismatch is found, namely if the cache memory has not hit, the comparator 209 leaves the corresponding hit line negated.
Of the four ways of the data array 201, only the one way for which the address array 200 has hit is activated by the corresponding hit line.
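The operation described above can be sketched as a small behavioral model. It is only an illustration: the list-based layout, the function name, and the sample contents are hypothetical, and the model ignores the precharge, sense amplifier, and decoder circuits.

```python
NUM_WAYS = 4  # four-way set-associative, as in FIG. 2


def serial_lookup(tag_ways, data_ways, upper_addr, middle_addr):
    """Model an access in which the data array waits for the hit check.

    tag_ways[w][Am] is the tag registered in way w at middle address Am;
    data_ways[w][Am] is the corresponding data block (hypothetical layout).
    """
    # Address array: every way is read at Am and compared with Au.
    hit_lines = [tag_ways[w][middle_addr] == upper_addr for w in range(NUM_WAYS)]
    # Data array: only a way whose hit line is asserted is activated.
    activated = [w for w in range(NUM_WAYS) if hit_lines[w]]
    data = data_ways[activated[0]][middle_addr] if activated else None
    return hit_lines, activated, data


# Hypothetical contents: 2 rows per way; None marks an empty entry.
tags = [[0x1A, None], [0x2B, 0x1A], [None, None], [0x3C, 0x4D]]
data = [["a0", "a1"], ["b0", "b1"], ["c0", "c1"], ["d0", "d1"]]
hit_lines, activated, block = serial_lookup(tags, data, upper_addr=0x1A, middle_addr=1)
print(hit_lines, activated, block)
```

On a miss every hit line stays negated, so no data way is activated at all in this model.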
Consequently, the above-mentioned prior-art technology is advantageous in power saving. However, the access time of the entire cache memory is a sum of the access time of the address array 200, the time required for the comparison operation in the comparator 209, and the access time of the data array 201, resulting in a relatively large value. This makes it difficult to enhance the operating frequency of the cache memory.
To overcome such a problem, the present inventors considered a method in which the address array is activated at the same time the data array is activated. FIG. 3 shows a block diagram of a four-way set-associative cache memory 3000 that operates in this method (this cache memory is called a reference technology hereinafter). In FIG. 3, the structures of an address array 300 and a data array 301 are generally the same as those of FIG. 2. The difference between the prior-art technology of FIG. 2 and the reference technology of FIG. 3 lies in that, when the address array 300 is activated, the data array 301 is activated at the same time. Of the four ways of the data array 301, only the way corresponding to the way in which a hit occurred in the address array 300 outputs the data held in its output buffer 321 to a data line 322. In this method, the address array 300 and the data array 301 are accessed simultaneously, so that the access time of the entire cache memory 3000 is approximately equal to the access time of the data array 301. Thus, the access time of the entire cache memory is relatively short. In this method, however, the ways in the data array corresponding to ways in which no hit occurred in the address array are also accessed, so that the power consumption of the data array increases significantly. Further, even if the operating frequency of the cache memory is lowered, the data array operates in the same manner as mentioned above, and therefore, the power consumption is not reduced.
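The trade-off between the two approaches can be made concrete with a simple model. The sketch below is an assumption-laden illustration: the delay values are hypothetical, the number of activated data ways stands in for data array power, and the parallel access time is modeled as the slower of the two paths, which reduces to the data array access time when that path dominates.

```python
def serial_access(t_addr, t_compare, t_data):
    """First prior-art technology: the data array waits for the hit check."""
    return {"access_time": t_addr + t_compare + t_data, "data_ways_active": 1}


def parallel_access(t_addr, t_compare, t_data, num_ways=4):
    """Reference technology: both arrays start together, all data ways run."""
    # The hit result only gates the output buffer, so the slower path dominates.
    access_time = max(t_addr + t_compare, t_data)
    return {"access_time": access_time, "data_ways_active": num_ways}


# Hypothetical delays in nanoseconds.
serial = serial_access(t_addr=5, t_compare=2, t_data=8)
parallel = parallel_access(t_addr=5, t_compare=2, t_data=8)
print(serial, parallel)
```

With these sample numbers the parallel access is nearly twice as fast but activates four data ways instead of one, regardless of the clock frequency.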
The NIKKEI Electronics, Mar. 27, 1995, pp. 13-20 introduces a new RISC (Reduced Instruction Set Computer) processor (a second prior-art technology hereinafter) developed by the assignee hereof and others. In particular, page 16 of the same publication describes a technology for suppressing cache power consumption as follows. Namely, SH7708 employed three methods of suppressing cache power consumption. In the first method, only a way in which a hit occurred in the address array is driven. This method was also employed in SH7604, but in SH7708 it is impossible to drive the data array after the address array hit determination in high-speed operation, because of the limitation of circuit speed. Hence, a circuit constitution for dynamically determining the drive timing of the data array was provided and, if the hit determination cannot be made in time, all four ways of the data array are driven. The limit of the frequency for selectively driving one way of the data array is about 40 MHz.
As mentioned above, the cache memory according to the first prior-art technology can operate with somewhat small power consumption, but it is difficult to enhance the access speed of this cache memory. The second prior-art technology does not describe concretely how power consumption was reduced.
It is therefore an object of the present invention to provide a cache memory that can operate at a relatively high speed and consumes a somewhat small amount of power at least in a low-speed operation.
It is another object of the present invention to provide a cache memory that can reduce power consumption at a high-speed operation and further reduce power consumption at a low-speed operation.
It is still another object of the present invention to provide a cache memory that can operate at a considerably high frequency, reduce power consumption in an operation at a relatively low frequency, and also reduce power consumption in an operation at a relatively high frequency located between the above-mentioned considerably high and low frequencies.
In attaining the above-mentioned objects, a cache memory according to the present invention has, in addition to a first start circuit for activating an address array in response to a read request which requests readout of data from another memory, a second start circuit for activating a data array after activating the address array. The second start circuit has a start execution circuit for dynamically selecting and executing one of a first start operation for activating the data array before completion of a hit check operation after the start of the address array and a second start operation for activating the data array after the hit check operation completes and it is determined that the address array has hit. The first start operation realizes a high-speed operation because it does not wait for completion of the hit check operation, and the second start operation realizes a low power consumption operation because it activates only the hit way in the data array.
To be more specific, the above-mentioned start execution circuit has a circuit that selectively executes the first and second start operations depending on a clock frequency of a clock signal for controlling the operations of the above-mentioned cache memory. This circuit allows automatic switching between a high-speed operation and a low-speed but low power consumption operation depending on the operating frequency.
To be further specific, the first and second start circuits respectively activate the address array and the data array in response to a first clock signal and a second clock signal that has the same frequency as the first clock signal and lags the first clock signal by a predetermined phase. The phase difference is maintained at a substantially constant level even when the frequencies of these clock signals are changed. This allows automatic selection between the above-mentioned first and second start operations, such that the first start operation is performed when the frequencies of the clock signals are high and the second start operation is performed when they are low.
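One way to see why a constant phase difference yields automatic selection is the sketch below. It is a simplified timing model, not the circuit itself: it assumes the hit check takes a fixed time independent of frequency, and the function name, the phase fraction of 0.5, and the 10 ns hit check time are all hypothetical.

```python
def select_start_operation(freq_mhz, phase_fraction, hit_check_ns):
    """Pick the start operation implied by a constant-phase second clock.

    phase_fraction: lag of the second clock as a fraction of one period.
    hit_check_ns: time from address array start to hit check completion.
    Both parameters and their values are hypothetical.
    """
    period_ns = 1000.0 / freq_mhz
    # Constant phase means the lag in time grows as the frequency drops.
    delay_ns = phase_fraction * period_ns
    if delay_ns >= hit_check_ns:
        return "second"  # hit check already done: activate only the hit way
    return "first"  # hit check still running: activate the data array now


for freq in (20, 100):
    print(freq, "MHz ->", select_start_operation(freq, 0.5, 10.0))
```

Under these assumptions the crossover lies where the lag equals the hit check time, that is, at a frequency of phase_fraction / hit_check_ns (50 MHz for the sample values).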
In another mode of a cache memory according to the present invention, an address array and a data array are each divided into a plurality of ways. In addition to a first start circuit for activating the address information holding ways in parallel, a second start circuit is provided for activating the plurality of data holding ways in parallel after reading of the address information has started and before the hit check operation for the address information completes. The cache memory also has an output control circuit that, if the hit check operation in the address array finds that one of the address information holding ways has hit, instructs the data holding way corresponding to that way to output the data it has read. As the plural ways of the data array are activated before completion of the hit check operation, the data held in the way that has hit can be read quickly after the completion of the hit check operation.
In another mode of the present invention, the above-mentioned second start circuit has a circuit for activating the plurality of data holding ways at a timing such that data are read therefrom after completion of a hit check operation, and the cache memory further has a circuit that instructs those data holding ways which have not hit to stop the data read operation under execution. This novel constitution can immediately stop the operations of the data holding ways that have not hit. Consequently, power saving is realized in the data holding ways that have not hit.
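The power saving from stopping the ways that have not hit can be estimated with a rough model. The sketch below assumes, purely for illustration, that a stopped way consumes a fixed fraction of the energy of a full read. The function name, the energy unit, and the fraction are hypothetical.

```python
def data_array_energy(hit_lines, full_read_energy, fraction_done_at_stop):
    """Estimate data array energy when non-hit ways are stopped mid-read.

    hit_lines: one boolean per way, True for the way that has hit.
    fraction_done_at_stop: share of a full read already spent when the
    stop instruction arrives (hypothetical parameter, 0.0 to 1.0).
    """
    energy = 0.0
    for hit in hit_lines:
        if hit:
            energy += full_read_energy  # the hit way completes its read
        else:
            energy += full_read_energy * fraction_done_at_stop  # stopped early
    return energy


hits = [False, True, False, False]
with_stop = data_array_energy(hits, full_read_energy=1.0, fraction_done_at_stop=0.25)
without_stop = data_array_energy(hits, full_read_energy=1.0, fraction_done_at_stop=1.0)
print(with_stop, without_stop)
```

Setting the fraction to 1.0 reproduces the case in which all ways run to completion, so the two results can be compared directly.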
The above and other objects, features and advantages of the present invention will become more apparent from the accompanying drawings, in which like reference numerals are used to identify the same or similar parts in several views.