The background description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent the work is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.
In the description that follows, “byte” refers to an octet, that is, to 8 binary digits (bits). The term “kilobyte” and the abbreviation “KB” both refer to 1024 bytes.
In a computer processor, a cache memory circuit (hereinafter, a cache) may be used to store data corresponding to a location in a main memory. The cache is typically smaller, has lower latency, and has higher bandwidth than the main memory.
The cache may include a plurality of tags respectively associated with a plurality of memory elements that store data (hereinafter, cache lines). The cache compares an address of a memory operation with an address stored in a tag to determine whether the cache line associated with the tag corresponds to the location indicated by the address of the memory operation, that is, whether the location indicated by the address of the memory operation is cached.
A set-associative cache may check only one of a plurality of sets of the cache when determining whether a location is cached. The set-associative cache determines which set to check using bits of the address of the memory operation. For example, in a set-associative cache having 256-byte cache lines and 256 sets, the set to check may be determined using bits 15 through 8 of the address of the memory operation (with bits 7 through 0 indicating particular bytes within the 256-byte cache line). In this example, bits 15 through 8 correspond to the set address.
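By way of illustration only, the address split in the foregoing example may be sketched as follows. The function name `split_address` and the sample address are hypothetical and are not part of the disclosure; the sketch assumes the 256-byte cache lines and 256 sets of the example.

```python
LINE_BYTES = 256   # 256-byte cache lines: bits 7..0 select a byte within the line
NUM_SETS = 256     # 256 sets: bits 15..8 form the set address

OFFSET_BITS = LINE_BYTES.bit_length() - 1   # 8
SET_BITS = NUM_SETS.bit_length() - 1        # 8


def split_address(addr):
    """Return (tag, set_address, byte_offset) for a memory address."""
    byte_offset = addr & (LINE_BYTES - 1)
    set_address = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + SET_BITS)   # remaining upper bits
    return tag, set_address, byte_offset


# Hypothetical sample address chosen so each field is easy to read off.
tag, set_address, byte_offset = split_address(0x12345)
```

In this sketch the upper bits left over after the set address and byte offset are treated as the tag, which is one common arrangement and is assumed here for illustration.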
A set-associative cache may have a plurality of ways. The number of ways indicates the number of distinct locations in the cache that may correspond to any one memory location.
Caches may be used in processors having virtual memory architectures. In a virtual memory architecture, virtual addresses are generated by the processor and are then translated by a Memory Management Unit (MMU) into physical addresses. In a typical MMU, memory addresses are translated in pages. For example, in an MMU using 4 KB pages, each 4 KB page in the virtual memory space may be mapped to a 4 KB page in the physical address space. A location at an offset within the virtual memory page will be located at the same offset within the corresponding physical address page.
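By way of illustration only, page-granular translation with 4 KB pages may be sketched as follows. The page table contents, the addresses, and the function name `translate` are hypothetical; an actual MMU typically uses multi-level page tables and translation caches that are not modeled here.

```python
PAGE_BYTES = 4 * 1024      # 4 KB pages, with KB = 1024 bytes
PAGE_OFFSET_BITS = 12      # bits 11..0 are the offset within the page

# Hypothetical mapping: virtual page number -> physical page number.
page_table = {0x00010: 0x00ABC, 0x00011: 0x00200}


def translate(vaddr):
    """Translate a virtual address to a physical address, page by page."""
    vpn = vaddr >> PAGE_OFFSET_BITS        # virtual page number
    offset = vaddr & (PAGE_BYTES - 1)      # unchanged by translation
    ppn = page_table[vpn]                  # physical page number
    return (ppn << PAGE_OFFSET_BITS) | offset


paddr = translate(0x10123)   # virtual page 0x10, offset 0x123
```

Only the page number changes in this sketch; the low 12 bits of the virtual and physical addresses are identical.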
To reduce the latency of load operations, the cache may begin a process of retrieving data before the physical address of the data is fully known. In particular, a virtually indexed, physically tagged (VIPT) cache may begin the process of retrieving data before the MMU completes an address translation between a virtual address and a physical address.
The VIPT cache is a set-associative cache that uses a plurality of bits of the virtual address as the set address, that is, the VIPT cache uses a virtual set address (VSA) to index the cache. Once the VSA has been determined, the VIPT cache compares a plurality of bits of the physical address against the tags in the set corresponding to the VSA to determine whether the VIPT cache includes a cache line corresponding to the memory location specified by the physical address.
When the VIPT cache includes a plurality of ways, in each way, a tag corresponding to the VSA is checked for the corresponding cache line.
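By way of illustration only, a lookup in a multi-way VIPT cache may be sketched as follows. The number of ways, the helper names `fill` and `lookup`, and the sample addresses are assumptions made for the sketch. For simplicity, each tag stores the entire physical line address, whereas an actual cache may store only a subset of the physical address bits.

```python
OFFSET_BITS = 8     # 256-byte cache lines
NUM_SETS = 256
NUM_WAYS = 4        # assumed for illustration

# tags[way][set] holds a physical line address, or None when the entry is empty.
tags = [[None] * NUM_SETS for _ in range(NUM_WAYS)]


def virtual_set_address(vaddr):
    """VSA: set address taken from bits 15..8 of the virtual address."""
    return (vaddr >> OFFSET_BITS) & (NUM_SETS - 1)


def fill(vaddr, paddr, way):
    """Allocate a line: indexed by the VSA, tagged with the physical address."""
    tags[way][virtual_set_address(vaddr)] = paddr >> OFFSET_BITS


def lookup(vaddr, paddr):
    """Return the way that hits in the set selected by the VSA, or None."""
    vsa = virtual_set_address(vaddr)       # available before translation
    physical_tag = paddr >> OFFSET_BITS    # available after translation
    for way in range(NUM_WAYS):
        if tags[way][vsa] == physical_tag:
            return way
    return None


fill(0x42300, 0x92300, way=2)              # hypothetical virtual/physical pair
hit_way = lookup(0x42300, 0x92300)
miss = lookup(0x42300, 0x82300)
```

The sketch checks the ways sequentially for readability; hardware typically compares the tags of all ways in the selected set in parallel.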
When the VSA includes only address bits that are invariant in the address translation (hereinafter, invariant bits), the set address will always identify the correct location because any given value of the physical address will always be associated with a same value for the VSA. For example, when the MMU uses 4 KB pages, bits 11 to 0 of the virtual address indicate the offset within the page and are therefore not altered by the address translation.
Indexing the VIPT cache using only invariant bits can reduce flexibility in the design of the cache. When the VIPT cache is indexed using the invariant bits, the number of sets may be limited by the number of invariant bits, and increasing the size of the cache may require adding more ways to the cache instead of increasing the number of sets.
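By way of illustration only, the constraint may be worked through numerically. Assuming 4 KB pages and 256-byte cache lines, only bits 11 through 8 are both invariant and outside the byte offset, so at most 16 sets can be indexed with invariant bits. The function names and the 64 KB capacity below are hypothetical.

```python
def max_sets_with_invariant_bits(page_bytes, line_bytes):
    """Largest set count indexable using only page-offset (invariant) bits."""
    return page_bytes // line_bytes


def ways_required(cache_bytes, page_bytes, line_bytes):
    """Ways needed to reach a capacity when the set count is capped as above."""
    sets = max_sets_with_invariant_bits(page_bytes, line_bytes)
    return cache_bytes // (sets * line_bytes)


PAGE_BYTES = 4 * 1024
LINE_BYTES = 256

sets_limit = max_sets_with_invariant_bits(PAGE_BYTES, LINE_BYTES)
ways_for_64kb = ways_required(64 * 1024, PAGE_BYTES, LINE_BYTES)
```

Under these assumptions each way holds exactly one page of data, so capacity can grow only by adding ways: a 64 KB cache would need 16 ways.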
When the set address includes bits other than invariant bits, cache aliasing may occur in the VIPT cache. Cache aliasing may occur when both (i) a first virtual address and a second virtual address are each translated to a same physical address, and (ii) a first VSA generated using the first virtual address has a different value than a second VSA generated using the second virtual address.
For example, the first virtual address may produce a first VSA of 0, and the second virtual address may produce a second VSA of 128. Depending on whether and which virtual address caused a cache line to be allocated to the physical address, the cache line corresponding to the physical address may be a first cache line within set 0 of the VIPT cache, a second cache line within set 128 of the VIPT cache, both the first cache line and the second cache line, or not in the VIPT cache at all.
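By way of illustration only, a pair of virtual addresses producing the VSA values 0 and 128 may be sketched as follows. The sketch assumes 4 KB pages, 256-byte cache lines, and 256 sets, so that the VSA uses bits 15 through 8 and bits 15 through 12 are not invariant. The page table and addresses are hypothetical.

```python
PAGE_OFFSET_BITS = 12   # 4 KB pages
OFFSET_BITS = 8         # 256-byte cache lines
NUM_SETS = 256          # VSA = bits 15..8 of the virtual address

# Hypothetical mapping in which two virtual pages share one physical page.
page_table = {0x10: 0x5, 0x28: 0x5}


def translate(vaddr):
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    return (page_table[vaddr >> PAGE_OFFSET_BITS] << PAGE_OFFSET_BITS) | offset


def virtual_set_address(vaddr):
    return (vaddr >> OFFSET_BITS) & (NUM_SETS - 1)


first_va, second_va = 0x10000, 0x28000

same_physical = translate(first_va) == translate(second_va)
first_vsa = virtual_set_address(first_va)
second_vsa = virtual_set_address(second_va)

# Both conditions (i) and (ii) for cache aliasing hold for this pair.
aliased = same_physical and first_vsa != second_vsa
```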
Therefore, if the VIPT cache checks only one of set 0 and set 128 for the corresponding cache line, and the cache line is not in the checked set but is in another set because of cache aliasing, the VIPT cache may erroneously determine that the corresponding cache line is not in the VIPT cache. This erroneous determination can produce one or more of data errors, performance degradation, and increased power consumption.
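By way of illustration only, the erroneous miss may be reproduced in a toy model. The model tracks only which physical line addresses reside in each set and does not model ways or data; the helper names and addresses are hypothetical.

```python
OFFSET_BITS = 8
NUM_SETS = 256

# sets[i] holds the physical line addresses currently cached in set i.
sets = [set() for _ in range(NUM_SETS)]


def virtual_set_address(vaddr):
    return (vaddr >> OFFSET_BITS) & (NUM_SETS - 1)


def fill(vaddr, paddr):
    sets[virtual_set_address(vaddr)].add(paddr >> OFFSET_BITS)


def hit_in_indexed_set(vaddr, paddr):
    """Check only the single set selected by the VSA."""
    return (paddr >> OFFSET_BITS) in sets[virtual_set_address(vaddr)]


def present_anywhere(paddr):
    """Exhaustive check of every set, shown only to expose the false miss."""
    return any((paddr >> OFFSET_BITS) in s for s in sets)


PADDR = 0x5000
FIRST_VA, SECOND_VA = 0x10000, 0x28000   # hypothetical aliases (VSA 0 and 128)

fill(FIRST_VA, PADDR)                    # line allocated in set 0
false_miss = (not hit_in_indexed_set(SECOND_VA, PADDR)) and present_anywhere(PADDR)
```

Here the access through the second virtual address checks only set 128 and reports a miss even though the line is resident in set 0.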
Therefore, when cache aliasing occurs, the VIPT cache should quickly and efficiently (i) detect whether a cache line corresponding to the physical memory address is present in the VIPT cache, and (ii) identify the corresponding cache line.