A cache is a very fast local memory used by a processor, and it typically resides between the processor and main system memory. The cache reduces the effective latency of accesses to the slower main system memory by holding copies of code and data that the processor requests frequently. A cache may reside within the processor itself, outside the processor, or both inside and outside the processor.
A processor may use memory management mechanisms such as segmentation and paging, as are well known in the prior art. Paging, for example, allows data to be referenced within a large contiguous address space, or virtual address space. The virtual address space may be much larger than the actual memory space of the system in which the processor resides. Paging hardware translates each virtual address to a physical address. A physical address is said to be aliased when it corresponds to more than one virtual address.
A cache can be fully associative, set associative, or direct mapped. In a fully associative cache, each item of information from main system memory is stored as a unique cache entry. There is no relationship between the location of the information in the cache memory array and its original location in system memory. If there are N storage locations in the cache, the cache can store the last N main system memory locations accessed by the processor.
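The "last N locations accessed" behavior can be illustrated with a minimal sketch. It assumes least-recently-used replacement, which the text implies but does not name, and it models main memory as a plain dictionary. The class name `FullyAssociativeCache` and its `access` method are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


class FullyAssociativeCache:
    """Toy fully associative cache with N entries (hypothetical model).

    Assumption: least-recently-used replacement, so the cache always
    holds the last N distinct locations that were accessed.
    """

    def __init__(self, num_entries):
        self.num_entries = num_entries
        # Maps address -> data, ordered from least to most recently used.
        self.entries = OrderedDict()

    def access(self, address, memory):
        """Return (data, hit) for a read of `address`."""
        if address in self.entries:
            # Hit: any entry may hold any address, so only recency changes.
            self.entries.move_to_end(address)
            return self.entries[address], True
        # Miss: fetch from main memory and evict the oldest entry if full.
        data = memory[address]
        if len(self.entries) >= self.num_entries:
            self.entries.popitem(last=False)
        self.entries[address] = data
        return data, False


memory = {addr: addr * 10 for addr in range(8)}
cache = FullyAssociativeCache(num_entries=2)
for addr in (0, 1, 0, 2):
    cache.access(addr, memory)
# Addresses 0 and 2 remain cached; address 1 was the least recently used.
```

Because no address bits select the storage location, a hardware lookup must compare the requested address against every entry, which is why fully associative organizations are usually kept small.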
In a set associative cache, the cache is divided into banks of memory, or "ways". A 2-way set associative cache divides the cache into 2 ways, a 4-way set associative cache into 4 ways, and so on. For example, a 64 kilobyte (64K) 2-way set associative cache is organized as two 32K ways (64K / 2 ways = 32K). Each location in system memory maps to exactly one location within a way, but its contents may be held in any one of the ways.
A direct mapped cache uses the entire data cache as one bank of memory or way. The cache sees main system memory as logically broken up into pages, each page the size of the cache. For example, a 64K direct mapped cache would logically see main system memory as a collection of 64K pages. Each location in any main system memory page directly maps to only one location in the cache.
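The mapping rules for set associative and direct mapped caches can be sketched as simple arithmetic. The sketch below follows the text in treating the cache as an array of individual locations and ignores cache line size; the function names `way_size` and `location_in_way` are hypothetical.

```python
def way_size(capacity, ways):
    """Size of one way in bytes: the capacity divided evenly among the ways."""
    return capacity // ways


def location_in_way(address, capacity, ways):
    """Location within a way that `address` maps to.

    Main memory is viewed as a sequence of pages, each the size of one way,
    so the mapped location is the address offset within its page.
    """
    return address % way_size(capacity, ways)


KB = 1024

# A 64K 2-way set associative cache is organized as two 32K ways.
two_way = way_size(64 * KB, 2)

# A 64K direct mapped cache sees memory as 64K pages, so addresses 0x00020
# and 0x10020 (same offset, different pages) compete for the same location.
first = location_in_way(0x00020, 64 * KB, 1)
second = location_in_way(0x10020, 64 * KB, 1)
```

A direct mapped cache is the one-way case of the same formula: with a single way, two addresses at the same offset in different pages cannot be cached at the same time.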
A "physically-indexed" cache is addressed only with address bits that do not require virtual-to-physical address translation. Such an organization is highly desirable because it avoids address aliasing problems and can be optimized to allow the cache access and the virtual-to-physical address translation to proceed in parallel. The virtual-to-physical address translation is performed by a Translation Look-aside Buffer (TLB).
Physically-indexed cache organizations are constrained by the minimum system page size, cache capacity and associativity. For example, if the minimum system page size is 4K, physically-indexed cache organizations are limited to: 4K with a set associativity of 1 (direct mapped), 8K with a set associativity of 2, 16K with a set associativity of 4, and so on. That is, each cache "way" can only be as large as the minimum page size, and any increase in cache capacity must be matched by an increase in associativity to maintain the physical indexing property. For a large cache to be physically indexed and support small page sizes, it must have a large number of ways. However, highly associative caches incur timing penalties, high power dissipation and high area costs. For these reasons, typical cache organizations limit associativity to 2, 4, and sometimes 8.
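The constraint above reduces to one relation: the largest physically-indexed capacity equals the minimum page size multiplied by the associativity. The following sketch works through that arithmetic; the function names are hypothetical, and sizes are assumed to be powers of two.

```python
def max_physically_indexed_capacity(page_size, ways):
    """Largest capacity whose ways are each no larger than one page."""
    return page_size * ways


def min_ways_for_physical_indexing(capacity, page_size):
    """Smallest associativity that keeps each way within one page."""
    # Ceiling division, so a partial page still requires a whole extra way.
    return -(-capacity // page_size)


KB = 1024

# Reproduces the series in the text for a 4K minimum page size:
# 1 way -> 4K, 2 ways -> 8K, 4 ways -> 16K.
series = [max_physically_indexed_capacity(4 * KB, w) // KB for w in (1, 2, 4)]

# A 64K cache with 4K pages would need 16 ways to stay physically indexed,
# well beyond the 2, 4, or 8 ways that typical organizations use.
ways_needed = min_ways_for_physical_indexing(64 * KB, 4 * KB)
```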
A cache is "virtually-indexed" if at least one of the address bits used to access the cache may be different from the corresponding physical address bit. Consider a 64K cache with 4-way set associativity and a page size of 4K. Each way is 16K, so the virtual address bits VA[13:0] are used to access the cache. Because the page size is 4K, only the physical translation for VA[11:0] is known at cache access time. Through translation, multiple virtual addresses can be mapped to the same physical address, i.e., two or more virtual addresses are aliased to the same physical location. It is therefore possible for reads and writes to access the same physical data through different virtual addresses (possibly at the same time). In this case, to guarantee that all references to the same physical address get the right data, a "virtually-indexed" cache maintains a separate mechanism to locate all virtual aliases. This mechanism is often costly in terms of complexity, area and power and is usually a deterrent to implementing "virtually-indexed" caches.
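The bit ranges in this example can be derived mechanically, and doing so shows why aliases can land in different cache locations. The sketch below assumes power-of-two sizes and, like the text, ignores line size; the helper names and the two example virtual addresses are hypothetical.

```python
def top_bit(size_in_bytes):
    """Highest address bit needed to select one of `size_in_bytes` locations."""
    return (size_in_bytes - 1).bit_length() - 1


def untranslated_index_bits(capacity, ways, page_size):
    """Index bits that lie above the page offset and so require translation."""
    index_top = top_bit(capacity // ways)   # 13 for a 16K way
    offset_top = top_bit(page_size)         # 11 for a 4K page
    return list(range(offset_top + 1, index_top + 1))


def cache_index(virtual_address, capacity, ways):
    """Location within a way selected by the virtual address."""
    return virtual_address % (capacity // ways)


KB = 1024

# 64K, 4-way, 4K pages: bits 12 and 13 of the index are virtual-only.
virtual_only = untranslated_index_bits(64 * KB, 4, 4 * KB)

# Two hypothetical virtual addresses assumed to alias the same physical
# address: they share the page offset 0x234 but differ in VA[13:12].
va_a, va_b = 0x1234, 0x3234
index_a = cache_index(va_a, 64 * KB, 4)
index_b = cache_index(va_b, 64 * KB, 4)
# index_a != index_b, so the same physical data could occupy two locations.
```

With a 16K 4-way cache and the same page size, `untranslated_index_bits` returns an empty list, which corresponds to the physically-indexed case.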
The aliasing problem described above can be simply solved by first translating VA[13:12] before the cache is accessed, thus allowing the cache to be physically-indexed. However, this requires the cache access to be delayed until the address translation is available. This increase in cache access latency is a highly undesirable condition.