1. Field of the Invention
The present invention generally relates to computers, and more particularly to a method of controlling a cache memory, and a computer using the cache memory control method.
2. Description of the Related Art
Generally, control methods that control a cache memory so as to increase the speed at which a computer accesses the main memory are known. The fundamental principle that makes a cache memory effective is essentially the same principle that underlies virtual memory, namely “locality of reference”. In such a control method, the cache memory is used to exploit the referential locality.
Referential locality is the foundation of cache (and virtual memory) design. If cache blocks or lines are very close to each other, they can be accessed very quickly. Also, if there are a small number of cache blocks, it is easy to recognize and access the next cache block needed. If there are a large number of cache blocks, both the access time and the addressing time (i.e., the time to find the right block) become much longer, making the task less efficient.
The principle of referential locality concerning a cache memory is ordinarily divided, according to the basis of the locality, into two major categories: spatial locality (locality in space) and temporal locality (locality in time). The former means that, if cache blocks are very close to each other, one of the cache blocks is accessed with a high probability after an adjacent cache block has been accessed. The latter means that, if a certain cache block is accessed once, the cache block will be accessed two or more times with a high probability.
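By way of illustration only, the two categories can be made concrete by labeling the accesses in a short trace of cache block numbers. The function name `classify_accesses` and the rule that "adjacent" means a block number differing by one are assumptions made for this sketch; they are not taken from any of the conventional designs.

```python
def classify_accesses(trace):
    """Label each block access as 'temporal', 'spatial', or 'other'."""
    seen = set()
    labels = []
    for block in trace:
        if block in seen:
            # The same block was accessed before: temporal locality.
            labels.append("temporal")
        elif block - 1 in seen or block + 1 in seen:
            # A neighboring block was accessed before: spatial locality.
            labels.append("spatial")
        else:
            labels.append("other")
        seen.add(block)
    return labels


print(classify_accesses([10, 11, 10, 12, 40]))
# prints ['other', 'spatial', 'temporal', 'spatial', 'other']
```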
A description will now be given of several conventional designs of cache computers.
FIG. 1 shows a configuration of a conventional direct-mapped cache computer. In the direct-mapped cache computer, there are two types of cache computer: write-through type and write-back type. In the write-through type, the data of the cache memory is rewritten at the same time as the data of the main memory is renewed. In the write-back type, the data of the cache memory is first rewritten, and subsequently the data contained in the cache blocks is written back to the main memory so that the data of the main memory is renewed.
As shown in FIG. 1, the conventional direct-mapped cache computer generally comprises a CPU 1, a cache memory 3 connected to the CPU 1, and a main memory 5 connected to the cache memory 3. The cache memory 3 includes an address register 7, a comparator 9, a control unit 11, a data register 13, and a data storage portion 10 having cache blocks #0 through #n.
In the cache computer of FIG. 1, the address register 7 is connected to the CPU 1, and this address register 7 holds an address signal supplied by the CPU 1. The data register 13 is connected to both the control unit 11 and the CPU 1, and this data register 13 holds read data sent to the CPU 1, writing data sent from the CPU 1, read data sent from the control unit 11, and writing data sent to the data storage portion 10.
The data storage portion 10 is constituted by a plurality of cache blocks #0 through #n. Each cache block includes a tag (TAG), a validity flag (V), and a cache block data (DATA). In a case of the write-back type cache computer, each cache block further includes a modification flag (M) as shown in FIG. 1.
The tag contains a subset of the main memory address that identifies the cached data of a cache block in the cache memory. The validity flag V is reset to zero (V=0) when the cache block is invalid, and set to one (V=1) when the cache block is valid. The modification flag M, in the case of the write-back type, is reset to zero (M=0) when the cache block is not to be written back to the main memory (non-replacement or non-modification), and set to one (M=1) when the cache block is to be written back to the main memory (replacement or modification).
The cache blocks #0 through #n of the data storage portion 10 retain respective data blocks of the cache block data supplied from the main memory 5. The comparator 9 compares the address signal, supplied by the CPU 1 via the address register 7, with the tag (TAG) of the data storage portion 10. When the validity flag V of a cache block is equal to zero (V=0), it is always determined that the address signal does not match the tag of that cache block.
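The cache block layout and the comparison rule described above may be sketched as follows. This is a minimal illustration rather than the circuit of FIG. 1: the names `CacheBlock` and `tag_matches` are hypothetical, and the tag and the block data are assumed to be plain integers.

```python
from dataclasses import dataclass


@dataclass
class CacheBlock:
    tag: int = 0            # TAG: subset of the main memory address
    valid: bool = False     # V flag: True corresponds to V=1
    modified: bool = False  # M flag: used by the write-back type only
    data: int = 0           # DATA: one word per block (an assumption)


def tag_matches(block: CacheBlock, address_tag: int) -> bool:
    """Model of the comparator: an invalid block (V=0) never matches."""
    return block.valid and block.tag == address_tag
```

For example, `tag_matches(CacheBlock(tag=5, valid=False), 5)` evaluates to false even though the tags are equal, because the validity flag is zero.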
In the cache computer of FIG. 1, the control unit 11 controls an operation of the cache memory 3 to read an instruction sent by the CPU 1.
FIG. 2 shows a configuration of the control unit 11 in the conventional direct-mapped cache computer of FIG. 1.
As shown in FIG. 2, the control unit 11 generally includes a decoder 30, a read control unit (RD CNTL) 31, and a write control unit (WR CNTL) 32. The decoder 30 decodes an instruction signal supplied by the CPU 1. The write control unit 32 is connected to the decoder 30 and controls the writing of data to the data storage portion 10 and the main memory 5. The read control unit 31 is connected to the decoder 30 and controls the reading of data from the data storage portion 10 and the main memory 5.
In the control unit 11 of FIG. 2, when the reading of data is performed by the CPU 1, the comparator 9 compares the address signal, supplied by the CPU 1, with the tag TAG of each of the cache blocks of the cache memory 3, the validity flag (V) of which is set to one. When a match between the address and the tag of the data storage portion 10 (or a cache hit) takes place, the comparator 9 outputs a signal indicating the match, to the read control unit 31 and the write control unit 32. In accordance with the block address supplied by the address register 7, the read control unit 31 reads out the cache block data (DATA) from the data storage portion 10. The read control unit 31 sends the data (DATA), read from the data storage portion 10, to the CPU 1 via the data register 13.
When a cache miss between the address and the tag of the data storage portion 10 takes place (or there is no match), the comparator 9 outputs a signal indicating the non-match, to the read control unit 31 and the write control unit 32. A location of the main memory 5 for replacement is determined by the read control unit 31 and the write control unit 32 by using the block address supplied by the address register 7. The read control unit 31 reads out the data (DATA) from the location of the main memory 5. The write control unit 32 writes the read data (DATA) to the cache block of the data storage portion 10, and sets the validity flag V of the cache block.
In the case of the write-back type cache computer, when the modification flag M of the cache block at that time is set to one, the read control unit 31 writes the cache block data (DATA) of that cache block of the data storage portion 10 back to the main memory 5. Thereafter, the modification flag M is reset to zero, and a new data is written to that cache block of the data storage portion 10. Further, the new data that is written to the cache block of the cache memory 3 is supplied to the CPU 1.
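The read operation described above, including the write-back of a modified block on a cache miss, may be sketched as follows. The sketch assumes a cache of four blocks holding one word each, and assumes that the block address is the main memory address modulo the number of blocks while the tag is the quotient. None of these details is given for the conventional design, and the names `NUM_BLOCKS` and `read` are hypothetical.

```python
from dataclasses import dataclass

NUM_BLOCKS = 4  # assumed size; the description only names blocks #0 through #n


@dataclass
class CacheBlock:
    tag: int = 0
    valid: bool = False
    modified: bool = False
    data: int = 0


def read(cache, memory, address):
    """Return (data, hit) for a CPU read in a write-back direct-mapped cache."""
    index, tag = address % NUM_BLOCKS, address // NUM_BLOCKS  # assumed split
    block = cache[index]
    if block.valid and block.tag == tag:
        # Cache hit: the block data goes to the CPU.
        return block.data, True
    if block.modified:
        # M=1: write the old block back to main memory, then reset M.
        memory[block.tag * NUM_BLOCKS + index] = block.data
        block.modified = False
    # Fetch the new data from main memory and set the validity flag.
    block.tag, block.data, block.valid = tag, memory.get(address, 0), True
    return block.data, False
```

Under the assumed mapping, addresses 1 and 5 share cache block #1, so reading one after the other forces a replacement of that block.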
Further, in the control unit 11 of FIG. 2, when the writing of data is performed by the CPU 1, the comparator 9 compares the address signal, supplied by the CPU 1, with the tag TAG of each of the cache blocks of the cache memory 3, the validity flag (V) of which is set to one. When a match between the address and the tag of the data storage portion 10 (or a cache hit) takes place, the write control unit 32 writes the writing data, supplied by the CPU 1, to the cache block of the data storage portion 10 in accordance with the block address supplied by the address register 7. In the case of the write-back type cache computer, the modification flag M of the cache block is set to one at that time.
On the other hand, when a cache miss between the address and the tag of the data storage portion 10 takes place (or there is no match), the write control unit 32 writes the writing data, supplied by the CPU 1, to the main memory 5.
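The write operation described above may be sketched in the same manner. The sketch assumes the same hypothetical address split (block address is the address modulo the number of blocks, tag is the quotient) and a `write_back` parameter that selects between the two types; on a hit in the write-through type, the main memory is assumed to be renewed together with the cache block, in keeping with the description of that type.

```python
from dataclasses import dataclass

NUM_BLOCKS = 4  # assumed size


@dataclass
class CacheBlock:
    tag: int = 0
    valid: bool = False
    modified: bool = False
    data: int = 0


def write(cache, memory, address, value, write_back=True):
    """Return True on a cache hit, False on a cache miss."""
    index, tag = address % NUM_BLOCKS, address // NUM_BLOCKS  # assumed split
    block = cache[index]
    if block.valid and block.tag == tag:
        block.data = value             # hit: write to the cache block
        if write_back:
            block.modified = True      # write-back type: set M=1
        else:
            memory[address] = value    # write-through type: renew main memory
        return True
    memory[address] = value            # miss: write to the main memory only
    return False
```

Note that, as described, a write miss does not place the data in the cache memory; only the main memory is written.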
Next, FIG. 3 shows a configuration of a conventional fully associative cache computer. Similar to the direct-mapped cache computer, in the fully associative cache computer, there are also two types of computer design: the write-through design and the write-back design.
As shown in FIG. 3, the conventional fully associative cache computer generally comprises a CPU 1, a cache memory 3A connected to the CPU 1, and a main memory 5 connected to the cache memory 3A. The cache memory 3A includes an address register 14, a plurality of comparators 15, 17, 19 and 20, a control unit 23, a data register 21, and a plurality of cache blocks #0 through #n.
FIG. 4 shows a configuration of the control unit 23 in the conventional fully associative cache computer of FIG. 3.
As shown in FIG. 4, the control unit 23 generally includes a block address calculating unit (BL ADDR CALC) 33A, an OR gate 34A, a decoder 30A, a read control unit (RD CNTL) 31A, and a write control unit (WR CNTL) 32A. The block address calculating unit 33A is connected to the respective comparators 15, 17, 19 and 20, and the values of the validity flags (V) of the cache blocks #0 through #n are supplied from the respective comparators 15, 17, 19 and 20 to the block address calculating unit 33A. Further, the values of the validity flags (V) of the cache blocks #0 through #n are supplied to the OR gate 34A.
The decoder 30A is connected to the CPU 1 and decodes an instruction signal supplied by the CPU 1. The write control unit 32A is connected to the decoder 30A, the block address calculating unit 33A and the OR gate 34A, and controls the writing of data to the cache blocks #0 through #n and to the main memory 5. The read control unit 31A is connected to the decoder 30A, the block address calculating unit 33A and the OR gate 34A, and controls the reading of data from the cache blocks #0 through #n and from the main memory 5.
The read/write operations of the control unit 23 of FIG. 4 are similar to those of the control unit 11 of FIG. 2. In the control unit 23 of FIG. 4, the read control unit 31A and the write control unit 32A are configured to select any of the cache blocks #0 through #n that are subjected to the reading or the writing, in accordance with both the cache block address signal (CBA) supplied by the block address calculating unit 33A and the signal supplied by the OR gate 34A.
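The block selection described above may be sketched as follows, on the assumption that each comparator produces a one-bit match signal for its cache block, that the OR gate combines these signals into a single hit signal, and that the block address calculating unit outputs the number of the matching block as the cache block address (CBA). The function name `lookup` and the list-based representation are hypothetical.

```python
def lookup(tags, valid, address_tag):
    """Return (hit, cba) for a fully associative tag comparison."""
    # One comparator per cache block; an invalid block never matches.
    match = [v and t == address_tag for t, v in zip(tags, valid)]
    hit = any(match)                          # role of the OR gate
    cba = match.index(True) if hit else None  # role of the block address unit
    return hit, cba


print(lookup([7, 3, 9], [True, True, False], 3))  # prints (True, 1)
print(lookup([7, 3, 9], [True, True, False], 9))  # prints (False, None)
```

The second call reports a miss even though block #2 holds tag 9, because the validity flag of that block is zero.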
When a cache miss between the address signal and the tags of the cache blocks #0 through #n takes place (or there is no match) during the writing operation of the CPU 1, the location of the cache blocks #0 through #n for replacement is determined by the read control unit 31A and the write control unit 32A by using the block address supplied by the address register 14. The validity flag V of the cache block determined is set to one. The write control unit 32A writes the writing data, supplied by the CPU 1, to both the determined cache block of the cache memory 3A and the main memory 5.
In the case of the write-back type cache computer, when the modification flag (M) of the cache block is set to one during the reading operation of the CPU 1, the read control unit 31A writes the cache block data (DATA) of that cache block of the cache memory 3A back to the main memory 5. Thereafter, the modification flag (M) is reset to zero, and new data is written to that cache block of the cache memory 3A.
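The handling of a write miss in the fully associative cache computer, in which the writing data is placed in both a cache block and the main memory, may be sketched as follows. The sketch assumes the write-through type, uses the full address as the tag, and selects the cache block for replacement as the address modulo the number of blocks. That selection rule is only a stand-in chosen for this sketch; how the block for replacement is actually determined is not specified here.

```python
from dataclasses import dataclass


@dataclass
class CacheBlock:
    tag: int = 0
    valid: bool = False
    data: int = 0


def write(cache, memory, address, value):
    """Return True on a cache hit, False on a cache miss (write-through)."""
    for block in cache:
        if block.valid and block.tag == address:  # full address as tag (assumed)
            block.data = value
            memory[address] = value
            return True
    # Miss: pick a block for replacement (assumed rule), set V=1, and
    # write the data to both the cache block and the main memory.
    victim = cache[address % len(cache)]
    victim.tag, victim.data, victim.valid = address, value, True
    memory[address] = value
    return False
```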
FIG. 5 shows a configuration of a conventional 2-way set-associative cache computer.
Similar to the cache computer of FIG. 1, the conventional 2-way set-associative cache computer of FIG. 5 generally comprises a CPU 1, a cache memory 3B connected to the CPU 1, and a main memory 5 connected to the cache memory 3B. The cache memory 3B includes an address register 14, a comparator 9A, a comparator 9B, a control unit 23B, a data register 21, and a pair of data storage portions 10A and 10B (the two ways), each including a plurality of cache blocks #0 through #n.
FIG. 6 shows a configuration of the control unit 23B in the conventional 2-way set-associative cache computer of FIG. 5.
As shown in FIG. 6, the control unit 23B generally includes a block address calculating unit (BL ADDR CALC) 33B, an OR gate 34B, a decoder 30B, a read control unit (RD CNTL) 31B, and a write control unit (WR CNTL) 32B. The block address calculating unit 33B is connected to both the comparators 9A and 9B, and the values of the validity flags (V) of the cache blocks #0 through #n of each of the data storage portions 10A and 10B are supplied from the comparators 9A and 9B to the block address calculating unit 33B. Further, the values of the validity flags (V) of the cache blocks #0 through #n of each of the data storage portions 10A and 10B are supplied to the OR gate 34B.
The decoder 30B is connected to the CPU 1 and decodes an instruction signal supplied by the CPU 1. The write control unit 32B is connected to the decoder 30B, the block address calculating unit 33B and the OR gate 34B, and controls the writing of data to the cache blocks #0 through #n of each of the data storage portions 10A and 10B and to the main memory 5. The read control unit 31B is connected to the decoder 30B, the block address calculating unit 33B and the OR gate 34B, and controls the reading of data from the cache blocks #0 through #n of each of the data storage portions 10A and 10B and from the main memory 5.
The read/write operations of the control unit 23B of FIG. 6 are similar to those of the control unit 11 of FIG. 2. In the control unit 23B of FIG. 6, the read control unit 31B and the write control unit 32B are configured to select any of the cache blocks #0 through #n that are subjected to the reading or the writing, in accordance with both the cache block address signal (CBA) supplied by the block address calculating unit 33B and the signal supplied by the OR gate 34B.
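The selection between the two ways may be sketched as follows. The sketch assumes four cache blocks per way, assumes that the block address within each way is the main memory address modulo that number while the tag is the quotient, and assumes that each of the two comparators outputs a one-bit match signal. The names `NUM_SETS` and `lookup` and the dictionary representation of a cache block are hypothetical.

```python
NUM_SETS = 4  # assumed number of cache blocks per way


def lookup(way_a, way_b, address):
    """Return (hit, way, index) for a 2-way set-associative tag comparison."""
    index, tag = address % NUM_SETS, address // NUM_SETS  # assumed split
    # One comparator per way, each checking the block at the same index.
    match_a = way_a[index]["valid"] and way_a[index]["tag"] == tag
    match_b = way_b[index]["valid"] and way_b[index]["tag"] == tag
    hit = match_a or match_b  # role of the OR gate
    way = 0 if match_a else 1 if match_b else None
    return hit, way, index
```

Because the same block address is examined in both ways, two main memory addresses that would collide in a direct-mapped cache of the same depth can be held at the same time, one in each way.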
In the case of the write-back type cache computer, when the writing of data is performed by the CPU 1, the write control unit 32B first writes the writing data, supplied by the CPU 1, only to the cache block of the cache memory 3B for replacement. Thereafter, the data of the cache blocks in each of the data storage portions 10A and 10B for replacement are written back to the main memory 5 so that the main memory 5 is renewed.
The above-described cache computers are designed to increase the access speed of the CPU 1 to the main memory 5 by using the cache memory. However, when a program of asynchronous data processing, such as external interrupt processing, or a program of multimedia processing that requires the data processing of various types of signals is executed by the CPU 1, the above-described cache computers cannot be expected to achieve an adequate level of referential locality by using the cache memory. It is therefore difficult for the above-described cache computers to sufficiently increase the access speed of the CPU 1 to the main memory 5 when such a program is executed by the CPU 1.