1. Field of the Invention
The present invention relates generally to a cache memory system and to a method of operating a cache memory system. More particularly, it relates to such a cache memory system and method utilizing a first, small upper level of cache and a second, large lower level of cache. Most especially, it relates to such a cache memory system and method which has the lower latency of a direct-mapped cache on a hit and the lower miss rate of a set-associative cache.
2. Description of the Prior Art
Direct-mapped caches have higher miss rates than set-associative caches but lower latency on a hit. Since hits are much more frequent than misses, direct-mapped caches are often preferred. A direct-mapped cache has a lower hit rate than a more associative cache because it incurs more misses from accesses which map to the same line in the cache but have different tags. These are called conflict misses. Conflict misses can account for a significant percentage of direct-mapped cache misses.
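The effect of conflict misses can be illustrated with a minimal sketch. The line size, line count, function names, and access trace below are hypothetical values chosen for illustration only, and the two-way cache is assumed to use least-recently-used replacement; none of these details is taken from the present description.

```python
def split_address(addr, line_size, num_lines):
    """Return (tag, index) for a direct-mapped cache (illustrative sizes)."""
    line_number = addr // line_size
    return line_number // num_lines, line_number % num_lines


def count_misses_direct_mapped(addresses, line_size=16, num_lines=256):
    """Count misses when each address can live in exactly one line."""
    tags = [None] * num_lines
    misses = 0
    for addr in addresses:
        tag, index = split_address(addr, line_size, num_lines)
        if tags[index] != tag:
            misses += 1
            tags[index] = tag  # replace whatever occupied this line
    return misses


def count_misses_two_way(addresses, line_size=16, num_lines=256):
    """Count misses for a two-way set-associative cache of the same capacity.

    LRU replacement is an assumption made for this sketch.
    """
    num_sets = num_lines // 2
    sets = [[] for _ in range(num_sets)]  # tags per set, most recent last
    misses = 0
    for addr in addresses:
        line_number = addr // line_size
        tag, index = line_number // num_sets, line_number % num_sets
        ways = sets[index]
        if tag in ways:
            ways.remove(tag)  # hit: re-appended below as most recent
        else:
            misses += 1
            if len(ways) == 2:
                ways.pop(0)  # evict the least recently used tag
        ways.append(tag)
    return misses


# Two addresses exactly one cache size apart: same index, different tags.
a = 0x0000
b = a + 16 * 256
trace = [a, b] * 4  # alternate between the two conflicting lines

print(count_misses_direct_mapped(trace))
print(count_misses_two_way(trace))
```

In this trace the two addresses repeatedly evict each other from the direct-mapped cache, so every access misses, whereas the two-way cache misses only on the first reference to each line.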
Two-level cache structures typically have copies of at least some of the data in the first level of cache in the second level of cache. When both the first-level cache and the second-level cache are direct-mapped, mixed (holding both instructions and data), and have the same line size, in conventional systems every cache line in the first-level cache will also be in the second-level cache. In many multiprocessor caching methods, a copy of all data in the first-level cache must reside in the second-level cache. This is called inclusion.
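The inclusion property for two direct-mapped levels with the same line size can be checked with a small simulation. This is a sketch under stated assumptions, not a description of any particular system: the cache sizes, function names, and access trace are hypothetical, the second-level line count is assumed to be a multiple of the first-level line count, and every first-level fill is assumed to pass through the second level.

```python
def is_inclusive(l1, l2):
    """True if every valid first-level line is also in the second level."""
    return all(line is None or l2[line % len(l2)] == line for line in l1)


def simulate_two_level(addresses, line_size=16, l1_lines=4, l2_lines=16):
    """Simulate two direct-mapped caches; entries are memory line numbers.

    Assumes l2_lines is a multiple of l1_lines (hypothetical sizes).
    """
    l1 = [None] * l1_lines
    l2 = [None] * l2_lines
    for addr in addresses:
        line = addr // line_size
        i1, i2 = line % l1_lines, line % l2_lines
        if l1[i1] == line:
            continue  # first-level hit: nothing changes
        if l2[i2] != line:
            l2[i2] = line  # second-level miss: fill from main memory
        l1[i1] = line  # fill the first level from the second level
        # A line displaced from l2 shares its l1 index with the new line,
        # so it has just been displaced from l1 as well.
        assert is_inclusive(l1, l2)
    return l1, l2


# Deterministic pseudo-random trace (illustrative values only).
trace = [(i * 7919) % 4096 for i in range(500)]
l1, l2 = simulate_two_level(trace)
print(is_inclusive(l1, l2))
```

Under these assumptions, two lines that share a second-level index also share a first-level index, which is why a second-level replacement never leaves a stale copy behind in the first level.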
Two-level cache memory systems in which the two levels are on different integrated circuits have proved to be attractive. For similar reasons, two-level caches on a single integrated circuit are becoming attractive.
Since a significant percentage of direct-mapped cache misses are due to mapping conflicts, it would be desirable to "have our cake and eat it too" by providing additional associativity without lengthening the critical access path of a direct-mapped cache. The present invention is directed to a technique for achieving this result easily, especially in a two-level on-chip cache structure.