A significant barrier to improving the performance of a microprocessor system is the access time of system memory. Although the speed of semiconductor memories has improved over time, the speed of DRAM devices has not kept pace with the speed of the processors. Consequently, when executing most applications, a processor will experience numerous wait states while system memory is accessed. A frequently employed solution to this problem is to incorporate into the microprocessor system a high-speed cache memory comprising SRAM devices. In general, a cached system will experience significantly fewer wait states than a non-cached system.
The simplest form of cache is generally referred to as a direct-mapped cache, wherein contents of the system memory are retrieved and stored in cache locations having the same low-order address. For example, if an 8K cache is provided, the thirteen lowest-order address bits of the system memory location to be retrieved define the cache storage location. A significant disadvantage of a direct-mapped cache is that the contents of a cache location will be overwritten whenever there is an access request to a system memory location having the same low-order address but a different high-order address.
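The direct-mapped scheme described above may be illustrated by the following minimal sketch. It is offered for explanation only and assumes, for simplicity, one cache entry per address with no multi-byte line size. The names `split_address` and `DirectMappedCache` are hypothetical and do not appear elsewhere in this description.

```python
# Illustrative sketch only: a direct-mapped cache with 8K entries.
# Assumption: one entry per address; no line/block size is modeled.

CACHE_SIZE = 8 * 1024                      # 8K cache locations
INDEX_BITS = CACHE_SIZE.bit_length() - 1   # 13 low-order bits select the location
INDEX_MASK = CACHE_SIZE - 1


def split_address(addr):
    """Split an address into (high-order tag, low-order cache index)."""
    return addr >> INDEX_BITS, addr & INDEX_MASK


class DirectMappedCache:
    def __init__(self):
        # Tag stored at each cache location; None marks an empty location.
        self.tags = [None] * CACHE_SIZE

    def access(self, addr):
        """Return True on a hit; on a miss, overwrite the location."""
        tag, index = split_address(addr)
        if self.tags[index] == tag:
            return True
        # Same low-order address, different high-order address:
        # the previous contents of this location are lost.
        self.tags[index] = tag
        return False


cache = DirectMappedCache()
a = 0x0123
b = 0x2123   # same 13 low-order bits as a, different high-order bits
results = [cache.access(a), cache.access(b), cache.access(a)]
print(results)   # every access misses because a and b evict each other
```

In this sketch, alternating accesses to two addresses that share the same thirteen low-order bits never hit, which is the overwriting disadvantage noted above.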
To overcome this disadvantage, a set associative cache structure is sometimes used. For example, with a two-way set associative cache, the cache memory is physically divided into two banks of SRAMs. Thus, a two-way set associative 8K cache would comprise two 4K banks of SRAM. Data retrieved from system memory may be mapped into either one of the two banks since the two banks have identical low-order addresses. A cache hit in one bank causes a least recently used (LRU) flag to be set for the corresponding address in the other bank. Thus, cache writes may be directed to the cache bank whose contents were least recently used, thereby preserving the more recently used data for subsequent accesses by the CPU. Compared with a direct-mapped cache of the same size, an associative cache significantly improves the cache hit rate and thus improves overall system performance.
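The two-way arrangement and its LRU flag may be sketched as follows. This is a simplified illustration under stated assumptions, not a description of any particular prior art device: one entry per address is assumed, the LRU flag is modeled as a single value per low-order address naming the least recently used bank, and the names `TwoWayCache` and `BANK_INDEX_BITS` are hypothetical.

```python
# Illustrative sketch only: a two-way set associative 8K cache built
# from two 4K banks, with one LRU flag per low-order address.
# Assumption: one entry per address; no line/block size is modeled.

BANK_SIZE = 4 * 1024                            # each bank holds 4K locations
BANK_INDEX_BITS = BANK_SIZE.bit_length() - 1    # 12 low-order bits per bank
BANK_INDEX_MASK = BANK_SIZE - 1


class TwoWayCache:
    def __init__(self):
        # tags[bank][index]; None marks an empty location.
        self.tags = [[None] * BANK_SIZE, [None] * BANK_SIZE]
        # lru[index] names the bank that was least recently used.
        self.lru = [0] * BANK_SIZE

    def access(self, addr):
        """Return True on a hit; on a miss, fill the LRU bank."""
        index = addr & BANK_INDEX_MASK
        tag = addr >> BANK_INDEX_BITS
        for bank in (0, 1):
            if self.tags[bank][index] == tag:
                # A hit in one bank flags the other bank as LRU.
                self.lru[index] = 1 - bank
                return True
        # Miss: direct the write to the least recently used bank,
        # preserving the more recently used data in the other bank.
        victim = self.lru[index]
        self.tags[victim][index] = tag
        self.lru[index] = 1 - victim
        return False


cache = TwoWayCache()
a = 0x0123
b = 0x1123   # same 12 low-order bits as a, different high-order bits
results = [cache.access(a), cache.access(b), cache.access(a), cache.access(b)]
print(results)   # the first two accesses miss; the repeats hit
```

Unlike the direct-mapped case, two addresses sharing the same low-order bits can reside in the cache at once, one in each bank, so the repeated accesses in this sketch hit.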
Additional banks of SRAM may be added to create a four-way, eight-way, etc., associative cache. However, system performance does not increase linearly with associativity, and four-way associativity is generally considered to provide an optimal performance/cost tradeoff. Prior art cached systems also incur significantly higher power consumption as the cache associativity is increased. Although total cache memory remains constant, a four-way associative cache consumes significantly more power than a direct-mapped cache, since the power consumption of each SRAM device is not proportional to the size of its SRAM array. Furthermore, a four-way associative cache will require four times as many SRAM packages as a direct-mapped cache, thereby occupying more area on the processor circuit board.
One of the objects of the present invention is to implement an associative cache using a single bank of SRAM, thereby achieving the superior hit rate performance of an associative cache without incurring the component cost, power consumption and real estate penalties of prior art associative cache subsystems.