1. Field of the Invention
The present invention relates to a cache memory organized as a set-associative cache.
2. Description of the Related Art
Main memory access times are often the limiting factor in attempts to increase processing speed in a microprocessor. Generally speaking, a main memory has a large capacity, but a relatively low speed. The memory access speed is often increased through the use of a device known as a cache memory.
FIG. 1 is a block diagram showing a general-purpose microprocessor which uses a cache memory. Reference numeral 11 denotes a data processor (CPU), 12 a main memory, and 13 a cache memory.
The cache memory is a buffer memory which operates at high speed and prestores data from part of main memory 12. Where a data access is in response to a request from the data processor and the data to be accessed is stored in cache memory 13, the data access is carried out not using main memory 12, but using cache memory 13. Accordingly, data accesses are relatively quicker when cache memory 13 contains the desired data.
Furthermore, the data which can be stored in cache memory 13 is not limited to a fixed region; data from an arbitrary area of main memory 12 can be stored in response to a request from data processor 11.
There are various configurations for a cache memory, one of which is the four-way, set-associative cache memory.
A description, referring to FIG. 2, will now be given for a four-way, set-associative cache memory system described in a publication entitled "An Outline and Way to Put to Practical Use of One Chip Cache Memory μPD43608R", (Interface, No. 123, August 1987, pp. 241-57).
In FIG. 2, reference numeral 1 shows an address (requested address) in main memory 12 of data to be accessed which is requested by data processor 11. The address comprises, from high to low order bits, an address tag 1a, a congruence class select field 1b, and a word select field 1c.
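The division of requested address 1 into its three fields can be sketched as follows. This is only an illustration: the source gives the order of the fields but not their widths, so the widths below are assumptions (two word select bits for four words per congruence class, and a hypothetical M of 128 entries), and the address is assumed to be a word address. The name split_address is likewise hypothetical.

```python
from typing import Tuple

# Assumed field widths; the text specifies only the field order.
WORD_SELECT_BITS = 2    # four words per congruence class -> 2 bits
CLASS_SELECT_BITS = 7   # hypothetical M = 128 entries -> 7 bits


def split_address(word_address: int) -> Tuple[int, int, int]:
    """Return (address tag 1a, congruence class select 1b, word select 1c)."""
    # Lowest-order bits: word select field 1c.
    word_select = word_address & ((1 << WORD_SELECT_BITS) - 1)
    # Middle bits: congruence class select field 1b.
    class_select = (word_address >> WORD_SELECT_BITS) & ((1 << CLASS_SELECT_BITS) - 1)
    # Remaining high-order bits: address tag 1a.
    tag = word_address >> (WORD_SELECT_BITS + CLASS_SELECT_BITS)
    return tag, class_select, word_select
```

With these assumed widths, an address built from tag 5, congruence class 3, and word 2 splits back into the same three values.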
Reference numeral 2 denotes an address tag memory comprised of M entries. Address tag memory 2 stores an address tag 1a of a requested address 1 in each entry. Four address tag memories 2 are provided, corresponding to the four "ways", or congruence sets, of the cache.
Reference numeral 3 denotes a valid bit field memory comprising M entries, each valid bit field corresponding to an address tag memory entry. A valid bit field contains data indicating the validity or invalidity of data stored in a corresponding address tag memory 2 entry. Four valid bit field memories 3 are provided, corresponding to the four congruence sets of the cache.
Reference numeral 4 denotes a data memory comprising M congruence classes, each congruence class containing data from an area of main memory 12 indicated by address tag 1a stored in a corresponding address tag memory 2. Four data memories 4 are provided, corresponding to each of the four congruence sets of the cache.
The area of main memory 12 indicated by address tag 1a usually comprises a plurality of addresses, and a plurality of data are stored in each congruence class memory of each data memory 4. In this example, four words of data are stored for each congruence class.
Reference numeral 5 denotes a word selector, which selects, according to word select field 1c of requested address 1, one word from the four words output from the selected entry of data memory 4. Word selector 5 outputs the selected word to a congruence set selector 9. Four word selectors 5 are provided, corresponding to each of the four congruence sets of the cache.
Congruence set selector 9 selects one word from the words output from the four word selectors 5, and outputs the selected word to a data bus 10.
An address tag comparator 8 compares an address tag 1a of the requested address 1 with the data (address tag) stored in each congruence set entry of the address tag memory 2 for the selected congruence class. Where the comparison finds a matching address tag, address tag comparator 8 outputs a HIT signal. Four address tag comparators 8 are provided, corresponding to the four address tag memories 2. A congruence set selection signal, indicating which comparison produced the match, is output from address tag comparator 8 to congruence set selector 9.
The operation of the cache memory of such a conventional four-way, set-associative system is as follows.
In the requested address 1, address tag 1a is input to address tag comparator 8, congruence class select field 1b is input to address tag memory 2 and to data memory 4, and word select field 1c is input to word selector 5.
An address tag of an address of data requested by data processor 11 is contained in each entry of address tag memory 2 and a data block of four words read from that address in main memory 12 is stored in data memory 4.
An access of the address tag memory 2 and the data memory 4 is carried out by specifying the address of the entry in the congruence class select field 1b. In other words, the address tag stored in the address tag memory 2 entry selected by congruence class select field 1b is passed to address tag comparator 8, and the data memory 4 entry selected by congruence class select field 1b is passed to word selector 5.
Word selector 5 selects a word from the four words stored in the selected data memory 4 entry according to the word select field 1c, and outputs the word to congruence set selector 9. Address tag comparators 8 compare the address tag 1a of the requested address 1 from data processor 11 with the address tags read from the selected entries of address tag memory 2, and judge whether they match or not. If the tags match, address tag comparators 8 output a HIT signal and a congruence set selection signal to congruence set selector 9.
As aforementioned, since cache memory 13 is a four-way, set-associative cache system, it includes four address tag memories 2, four data memories 4, four address tag comparators 8, and four word selectors 5, corresponding to the four congruence sets, together with congruence set selector 9. Accordingly, address tag memory 2 and data memory 4 are respectively capable of storing a maximum of four address tags and four data blocks for each congruence class. Moreover, as each congruence set operates similarly in parallel, when address tag comparators 8 check for a match (coincidence of address tags), four address tags are addressed at the same time by one address tag entry address. In other words, address tag comparators 8 simultaneously compare the address tag of each congruence set read from address tag memory 2 with address tag 1a of the requested address 1 from data processor 11, to detect a cache hit or miss.
On the other hand, one word is selected by word selector 5 from an entry of each congruence set of data memory 4, and congruence set selector 9 outputs one word to data bus 10 according to a congruence set selection signal provided by the address tag comparator 8.
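The read operation described above can be modeled in software roughly as follows. This is a behavioral sketch rather than a description of the circuit: the class CongruenceSet, the function lookup, and the value of M are hypothetical, and gating the HIT signal on the valid bit is an assumption, since the text describes valid bit field memory 3 but not how it enters the comparison.

```python
from typing import List, Optional

NUM_SETS = 4           # four congruence sets ("ways"), as in FIG. 2
WORDS_PER_CLASS = 4    # four words per congruence class, per the text
M = 8                  # hypothetical number of entries per memory


class CongruenceSet:
    """One way: address tag memory 2, valid bit field memory 3, data memory 4."""

    def __init__(self) -> None:
        self.tags = [0] * M
        self.valid = [False] * M
        self.data = [[0] * WORDS_PER_CLASS for _ in range(M)]


def lookup(sets: List[CongruenceSet], tag: int,
           class_select: int, word_select: int) -> Optional[int]:
    """Return the selected word on a HIT, or None on a miss."""
    # Word selectors 5: every way reads its entry and picks one word.
    candidates = [s.data[class_select][word_select] for s in sets]
    # Address tag comparators 8: every way compares its stored tag
    # (assumed here to be qualified by the valid bit).
    hits = [s.valid[class_select] and s.tags[class_select] == tag for s in sets]
    # Congruence set selector 9: forward the word from the matching way.
    for way, hit in enumerate(hits):
        if hit:
            return candidates[way]
    return None
```

Note that the model reads a candidate word from all four ways before the comparison result is known, mirroring the parallel operation of the four congruence sets.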
FIG. 3 is a circuit diagram showing a bit cell of data memory 4. In FIG. 3, numeral 14 denotes a word line, 15 a memory cell, and 6 a sense amplifier.
One memory cell 15 is connected to one sense amplifier 6 for each entry in data memories 4. For the example of FIG. 2, M memory cells 15 are connected to one sense amplifier 6. The required number of the circuits shown in FIG. 3, each of which has M memory cells 15 connected to one sense amplifier 6, is the number of bits in one word multiplied by the number of words in a congruence class of data memory 4 and by the number of congruence sets in data memory 4.
Accordingly, assuming 32-bit words, and four words per congruence class of data memory 4, for cache memory 13 shown in FIG. 2, the number of sense amplifiers required is 32 (bits per word) × 4 (words per congruence class) × 4 (congruence sets), or 512.
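The same count can be expressed as a short calculation. The function name sense_amplifier_count is hypothetical and is used only to restate the product given above.

```python
def sense_amplifier_count(bits_per_word: int,
                          words_per_class: int,
                          congruence_sets: int) -> int:
    """One sense amplifier per bit, per word, per congruence set."""
    return bits_per_word * words_per_class * congruence_sets


# Values from the example of FIG. 2: 32-bit words, four words, four sets.
print(sense_amplifier_count(32, 4, 4))  # 512
```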
In the conventional cache memory, as each bit of each word of each congruence set is provided with a sense amplifier as mentioned above, a large area is required on an integrated circuit chip to implement the sense amplifiers. Therefore, as the capacity of the cache memory increases, so does chip area. Conversely, if available chip area is limited, the capacity of the cache memory must decrease. Also, in a conventional cache memory, data is read out from data memories 4 corresponding to each of the four congruence sets and only data from one of the ways is selected by congruence set selector 9, thus using more current than necessary.