A cache memory is a high speed buffer memory, small compared to the main memory, which is located between the central processing unit (CPU) of a modern digital computer and the main memory, as close to the CPU as possible. The cache memory duplicates and temporarily holds portions of the contents of the main memory which are currently in use or expected to be of use by the CPU. The advantage of the cache memory lies in its access time, which is generally much shorter than that of main memory, e.g., one-fifth to one-tenth as long. The cache memory permits the associated CPU to spend substantially less time waiting for instructions and operands to be fetched and/or stored, giving a much lower effective memory access time and thereby an increase in efficiency.
It has been found that, once instructions or data have been accessed from the computer main memory, those same instructions or data have a high likelihood of being required again in the near future. To take advantage of this high re-use probability, the accessed instruction or data is stored in the local, fast cache memory, where it can be accessed much faster than from the main memory. The use of a cache memory can therefore give an overall access time much closer to the speed of the cache memory than to the speed of the main memory.
Every time an instruction or data access is requested by the CPU, the cache memory is checked to see if the requested item is in the cache memory. If it is, it is called a cache "hit" and the item is retrieved from the cache memory. If the item is not in the cache memory, it is called a cache "miss" and the item must then be retrieved from the main memory. The instruction or data to be accessed is called the "target." The effectiveness of the cache is measured primarily by the hit ratio, i.e., the fraction of targets which produce a hit, but another significant parameter of the cache memory is the mean time required to access the target when a hit occurs.
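The relation between hit ratio and overall access time can be illustrated with a minimal sketch. It assumes a simple model in which a hit costs the cache access time and a miss costs the main memory access time alone; the function names and the sample timings are hypothetical and are not taken from the text above.

```python
def hit_ratio(hits: int, accesses: int) -> float:
    """Fraction of targets that produced a cache hit."""
    if accesses <= 0:
        raise ValueError("accesses must be positive")
    return hits / accesses


def effective_access_time(h: float, t_cache: float, t_main: float) -> float:
    """Mean access time under the assumed model:
    a hit costs t_cache, a miss costs t_main."""
    if not 0.0 <= h <= 1.0:
        raise ValueError("hit ratio must lie in [0, 1]")
    return h * t_cache + (1.0 - h) * t_main


if __name__ == "__main__":
    # Assumed figures: main memory ten times slower than the cache.
    h = hit_ratio(hits=95, accesses=100)
    t = effective_access_time(h, t_cache=1.0, t_main=10.0)
    print(f"hit ratio {h:.2f}, effective access time about {t:.2f} cache cycles")
```

With these assumed figures the effective access time comes out at roughly 1.45 cache cycles, much closer to the cache speed than to the main memory speed.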
In order to increase the number of cache hits, it is known to increase the cache size so that the number of items available in the cache is larger, increasing the likelihood of a hit. Unfortunately, the larger the cache, the more complex the addressing of the cache itself and hence the longer the mean cache memory access time. In addition, larger caches are more costly, requiring larger chip areas, more power and more cooling.
The simplest cache organization is the direct mapped cache, in which many main memory locations map into one and only one cache location. A cache organization which increases the hit ratio is the set associative cache. Here, many locations in the main memory map to a set of a few locations in the cache, any one of which may hold the item. Increasing the number of information elements per associative set generally increases the hit ratio. Such increases in associativity, however, produce access delays and, moreover, are costly in the amount of chip area needed and in design complexity. Modern single chip central processing units have limited amounts of space for such cache memories and cache control circuitry. Generally, a large cache with a high hit ratio depends for its effectiveness more on the short access times associated with low associativity, while a small cache having a lower hit ratio benefits more from higher levels of associativity.
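The effect of associativity on the hit ratio can be illustrated with a small simulation. This is a sketch under stated assumptions, not a description of any particular hardware: the class name SetAssociativeCache, the least-recently-used replacement policy, and the sample trace are all hypothetical choices made for the example. A direct mapped cache is modeled as the special case of one element per set.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy cache model: num_lines entries split into sets of `ways` entries.
    LRU replacement within a set is an assumption of this sketch."""

    def __init__(self, num_lines: int, ways: int):
        if ways <= 0 or num_lines % ways != 0:
            raise ValueError("num_lines must be a positive multiple of ways")
        self.ways = ways
        self.num_sets = num_lines // ways
        # One ordered dict per set; insertion order tracks recency of use.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = 0
        self.accesses = 0

    def access(self, block_addr: int) -> bool:
        """Look up a block address; return True on a hit, False on a miss."""
        self.accesses += 1
        index = block_addr % self.num_sets   # which set the block maps to
        tag = block_addr // self.num_sets    # identifies the block within the set
        entries = self.sets[index]
        if tag in entries:
            entries.move_to_end(tag)         # mark as most recently used
            self.hits += 1
            return True
        if len(entries) >= self.ways:
            entries.popitem(last=False)      # evict the least recently used entry
        entries[tag] = True
        return False

    def hit_ratio(self) -> float:
        return self.hits / self.accesses if self.accesses else 0.0


def run(trace, num_lines: int, ways: int) -> float:
    cache = SetAssociativeCache(num_lines, ways)
    for addr in trace:
        cache.access(addr)
    return cache.hit_ratio()


if __name__ == "__main__":
    # Two blocks that collide in an 8-line direct mapped cache, used alternately.
    trace = [0, 8] * 8
    print("direct mapped:", run(trace, num_lines=8, ways=1))  # 0.0
    print("two-way:      ", run(trace, num_lines=8, ways=2))  # 0.875
```

In this contrived trace the two blocks evict each other on every access in the direct mapped case, whereas a two-way set holds both after the first two misses. The sketch models only the hit ratio and does not model the added access delay or chip area of higher associativity.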
The value of cache memories depends on the property of "locality" in computer memory accesses. Such locality is temporal as well as spatial. That is, over short periods of time, a computer program generally distributes memory references non-uniformly over the memory address space, concentrating such memory references in relatively small localities in the address space. In addition, these localities tend to remain largely the same for long periods of time. Spatial locality results from the tendency to locate data in contiguous arrays and instructions in sequential locations. Temporal locality results from the tendency to reuse data and instructions, for example in loops.
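The dependence of the hit ratio on locality can be illustrated with a short sketch. It assumes a small fully associative cache with LRU replacement which fetches a line of several adjacent words on each miss; the function name simulate, the cache dimensions, and the three address traces are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


def simulate(addresses, num_lines: int = 4, words_per_line: int = 4) -> float:
    """Return the hit ratio of a toy fully associative LRU cache.
    A miss brings in the whole line containing the requested word."""
    cache = OrderedDict()
    hits = 0
    total = 0
    for addr in addresses:
        total += 1
        line = addr // words_per_line    # line that contains this word
        if line in cache:
            cache.move_to_end(line)      # mark as most recently used
            hits += 1
        else:
            if len(cache) >= num_lines:
                cache.popitem(last=False)  # evict the least recently used line
            cache[line] = True
    return hits / total if total else 0.0


if __name__ == "__main__":
    sequential = list(range(64))             # spatial locality: contiguous words
    loop = list(range(8)) * 10               # temporal locality: same words reused
    scattered = [i * 64 for i in range(64)]  # no locality: every word in a new line

    print("sequential:", simulate(sequential))  # 0.75
    print("loop:      ", simulate(loop))        # 0.975
    print("scattered: ", simulate(scattered))   # 0.0
```

The sequential trace hits on three of every four words because each miss fetches a four-word line, the loop trace hits on nearly every access after its first pass, and the scattered trace never hits. The loop trace also benefits from spatial locality on its first pass, so the two effects are not fully separated in this sketch.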
A major problem, then, is to increase the hit rate of cache memories used with computer central processing units without requiring excessive space to accommodate such cache memories or to accommodate the logic circuitry necessary to control such cache memories. More particularly, it is desirable to achieve the advantages of higher degrees of associativity in a cache buffer without suffering all of the usually associated penalties of complexity and large cache chip areas.