The present invention relates to a cache device, and more particularly to a combined cache with main memory and a control method thereof, in which an internal cache of a microprocessor is used for both a caching function and a main memory function.
A large-capacity cache is provided within a high-performance microprocessor in order to improve the performance of the processor. As is well known, a cache is a storage medium for storing frequently used data, which is read out in advance from another storage medium having a relatively lower access speed (i.e., main memory). Since access to the cache is faster than access to the main memory, required data can be quickly transferred to the microprocessor, thereby improving the performance of the system.
Most computer systems use a DRAM (Dynamic Random Access Memory) as a main memory, since it is cheap and can easily be used to implement a large main memory. In contrast, an SRAM (Static Random Access Memory), which has a faster access speed but a higher cost, is used as a cache. With the development of personal computers, a 16-kilobyte internal cache is embedded in the Pentium processor, while a 512-kilobyte SRAM cache is further embedded in the Pentium Pro processor in addition to the 16-kilobyte internal cache.
`Cache hit` represents the case in which the required data can be fetched from the cache without access to the main memory because the data is stored in the cache, while `cache miss` represents the case in which the required data must be fetched from the main memory because the data is not stored in the cache. In the case of a cache miss, if there is any vacant space in the cache, the data located near the currently accessed data is read out from the main memory and stored into the cache, in consideration of locality. Alternatively, if there is no vacant space in the cache, a replacement algorithm is used to select a data block of the cache that is no longer needed, and the selected block is then used for storing the data block related to the currently accessed data from the main memory.
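The hit and miss handling described above can be sketched as follows. This is only an illustration under stated assumptions: the names `BLOCK_SIZE`, `CAPACITY` and `read` are hypothetical, the block size and capacity are arbitrary, and first-in first-out eviction stands in for the replacement algorithm, which the text does not specify.

```python
from collections import OrderedDict

BLOCK_SIZE = 4   # assumed number of neighboring words fetched together
CAPACITY = 2     # assumed number of blocks the cache can hold


def read(cache, main_memory, address):
    """Return ("hit" or "miss", data) for one read access."""
    block = address // BLOCK_SIZE
    index = address % BLOCK_SIZE
    if block in cache:
        # Cache hit: no access to the main memory is needed.
        return "hit", cache[block][index]
    if len(cache) >= CAPACITY:
        # No vacant space: evict the oldest loaded block.
        # FIFO is an assumption, not the method of the text.
        cache.popitem(last=False)
    # Cache miss: load the whole block around the address (locality).
    start = block * BLOCK_SIZE
    cache[block] = main_memory[start:start + BLOCK_SIZE]
    return "miss", cache[block][index]
```

With an empty `OrderedDict` as `cache`, a first read of an address reports a miss, and a subsequent read of a neighboring address in the same block reports a hit.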
Types of cache memory include a fully associative mapped cache, a directly mapped cache, an N-way cache and so on. The fully associative mapped cache uses the entire address as a tag, so that the entire address of the data is stored into a tag register of the cache when the data is loaded into the cache. The directly mapped cache was proposed to reduce the hardware complexity of the fully associative mapped cache; in it, the entire address of the data is divided into two fields, a tag field and an offset field, which are used for the control of the cache. Also, the N-way cache, which increases the data storage N times, was proposed to increase the cache hit rate.
FIG. 1 is a diagram schematically illustrating the address bus of the directly mapped cache and the N-way cache, in which the address bus is divided into a tag field and an offset field for the control of the directly mapped cache and the N-way cache.
Here, during data loading, the directly mapped cache stores the tag field of the address into the tag register of the cache. During data reading, the directly mapped cache responds to the offset field to determine from which data register within the cache the data is read out.
The N-way cache stores the tag field of the address into the tag register of the cache during data loading and, during data reading, determines in response to the offset field which data is read out from a given way.
For convenience of explanation, it is assumed that the address is 16 bits, the offset field is 2 bits, the tag field is 14 bits, the data bus is 8 bits, and there are 4 ways.
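Under these assumed widths, the division of an address into its two fields can be sketched as below. The function name `split_address` is hypothetical, and placing the tag field in the upper 14 bits and the offset field in the lower 2 bits is an assumption about the layout of FIG. 1.

```python
TAG_BITS = 14     # width of the tag field
OFFSET_BITS = 2   # width of the offset field


def split_address(address: int) -> tuple[int, int]:
    """Split a 16-bit address into (tag, offset).

    Assumes the tag occupies the upper bits and the offset the lower bits.
    """
    if not 0 <= address < (1 << (TAG_BITS + OFFSET_BITS)):
        raise ValueError("address must fit in 16 bits")
    tag = address >> OFFSET_BITS
    offset = address & ((1 << OFFSET_BITS) - 1)
    return tag, offset
```

For example, the address 0xABCD splits into the tag 0x2AF3 and the offset 1.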
FIG. 2 is an internal block diagram of a conventional 4-way cache used only for the cache function, which includes four ways (way_0, way_1, way_2 and way_3), an OR gate 10 and a multiplexer 11. Each of the four ways receives the tag field and the offset field of the address bus. The OR gate 10 performs a logical OR operation on the way hit signals applied from the four ways, so as to generate a cache hit signal when a way hit occurs in any one of the four ways. The multiplexer 11 selectively transfers one of the data applied from the four ways, in response to the way hit signals applied from the four ways.
FIG. 3 is an internal block diagram of one of the four ways shown in FIG. 2. The way includes a tag register 30, a comparator 31, a tag hit latch 32, four 8-bit data registers 33, a multiplexer 34, four 1-bit valid bit registers 35, a multiplexer 36 and an AND gate 37. The tag register 30 stores the 14-bit tag field applied through the address bus when the data is loaded into the cache. The comparator 31 compares the tag value stored in the tag register 30 with the tag value of the address inputted through the address bus for reading out from the cache. The tag hit latch 32 stores the comparison result signal from the comparator 31 as a tag hit signal. Each of the four 8-bit data registers 33 stores data loaded from the main memory into the cache. The multiplexer 34 selectively outputs one of the data stored in the four 8-bit data registers 33, in response to the 2-bit offset field of the address inputted through the address bus. Each of the four 1-bit valid bit registers 35 stores a valid bit indicating whether the data stored in the corresponding one of the four data registers 33 is valid or not. The multiplexer 36 selectively outputs the content of one of the four 1-bit valid bit registers 35, in response to the 2-bit offset field of the address inputted through the address bus. The AND gate 37 performs a logical AND operation on the tag hit signal from the tag hit latch 32 and the valid bit from the multiplexer 36, so as to produce the result as a way hit signal.
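The structure of a single way can be modeled in software roughly as follows. This is a behavioral sketch, not the circuit itself: the class name `Way` and its methods are hypothetical, and the rule that loading under a new tag clears all valid bits is an assumption that the text does not state.

```python
from dataclasses import dataclass, field


@dataclass
class Way:
    tag: int = 0                                                  # tag register 30
    data: list = field(default_factory=lambda: [0] * 4)           # data registers 33
    valid: list = field(default_factory=lambda: [False] * 4)      # valid bit registers 35

    def load(self, tag: int, offset: int, value: int) -> None:
        """Store one byte from main memory under the given tag."""
        if tag != self.tag:
            # Assumption: data held under the old tag is invalidated.
            self.valid = [False] * 4
            self.tag = tag
        self.data[offset] = value & 0xFF   # data registers are 8 bits wide
        self.valid[offset] = True

    def read(self, tag: int, offset: int) -> tuple[bool, int]:
        """Return (way_hit, data) for a read request."""
        tag_hit = tag == self.tag          # comparator 31, tag hit latch 32
        data = self.data[offset]           # multiplexer 34
        valid_bit = self.valid[offset]     # multiplexer 36
        way_hit = tag_hit and valid_bit    # AND gate 37
        return way_hit, data
```

A way hit is reported only when the tag matches and the valid bit selected by the offset is set; a matching tag with an invalid entry, or a valid entry under a different tag, is not a way hit.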
The operation of the 4-way cache configured as above will now be explained.
First, the comparator 31 in each way compares the value of the tag field of the address bus applied for reading out from the cache with the content of the tag register 30. When the comparison result indicates that the value of the tag field is the same as the value stored in the tag register 30, a `tag hit signal` is generated so that the tag hit latch 32 is set to "1", and the multiplexer 34 selectively outputs the content of one of the four data registers 33 to the data bus, in response to the 2-bit offset field of the inputted address. Also, the multiplexer 36 selects and outputs the valid bit corresponding to the selected data register, in response to the offset field. Here, when the tag hit signal stored in the tag hit latch 32 is "1" and the valid bit selected by the multiplexer 36 is "1", the way hit signal becomes "1".
Next, a cache hit signal is produced based on the respective way hit signals generated as above, and the data outputted from the way generating the way hit signal is finally transferred to the data bus for the microprocessor.
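The read operation across the four ways can be summarized in a short behavioral sketch. The names `way_read` and `cache_read` and the dictionary representation of a way are hypothetical, the tag is assumed to occupy the upper 14 bits of the address, and returning `None` on a miss is a convention of this sketch only.

```python
def way_read(way, tag, offset):
    """Return (way_hit, data) for one way."""
    tag_hit = way["tag"] == tag                 # comparator 31, tag hit latch 32
    way_hit = tag_hit and way["valid"][offset]  # multiplexer 36, AND gate 37
    return way_hit, way["data"][offset]         # multiplexer 34


def cache_read(ways, address):
    """Return (cache_hit, data) for a 16-bit address."""
    tag, offset = address >> 2, address & 0b11  # assumed field layout
    results = [way_read(way, tag, offset) for way in ways]
    cache_hit = any(hit for hit, _ in results)  # OR gate 10
    # Multiplexer 11: pass on the data of the way that reported a way hit.
    data = next((d for hit, d in results if hit), None)
    return cache_hit, data
```

Here each way is a dictionary with a `tag` integer and four-element `data` and `valid` lists, mirroring the tag register, data registers and valid bit registers of FIG. 3.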
As the capacity of caches becomes larger, memory is also required to store data for the control of microprocessors and micro-controllers. Considering this point, when a microprocessor embedded with a large cache is used as the CPU (Central Processing Unit) of a computer system having a large amount of memory, the internal cache memory is used for its original caching purpose. On the other hand, when the microprocessor embedded with the cache is used for system control that requires a relatively small amount of memory but high-speed processing, the internal cache is used as a main memory, thereby increasing the processing speed of the system and maximizing the utility of the cache.