Computer systems have used virtual memory to support multiple parallel processes and to enable application code to operate independently of the physical address information stored in a cache. In systems with virtual memory, an operating system has allocated physical memory addresses to distinct processes. Thus, some form of virtual-to-physical address translation has had to be performed before an application could access a physically tagged cache. The translation has been done dynamically by a Translation Lookaside Buffer (TLB), which contains the associated address mappings maintained by the operating system (OS). While the efficiency of higher-level caches has not been substantially impacted by the translation, the translation has added latency to each first-level cache access. In modern processors with clock frequencies in the gigahertz range, this has added one or more cycles to the performance-critical load-to-consumer latency.
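The serial dependence described above can be illustrated with a minimal sketch. It assumes 4 KiB pages and models the TLB and the cache as dictionaries; the names `TLB`, `translate`, and `load_physically_tagged` are hypothetical and chosen only for illustration.

```python
PAGE_SHIFT = 12                      # assumption: 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class TLB:
    """Toy TLB that caches virtual-to-physical page mappings kept by the OS."""

    def __init__(self, page_table):
        self.page_table = page_table  # virtual page number -> physical page number
        self.entries = {}             # mappings currently held in the TLB

    def translate(self, vaddr):
        vpn, offset = vaddr >> PAGE_SHIFT, vaddr & PAGE_MASK
        if vpn not in self.entries:
            # TLB miss: refill from the OS-maintained mapping.
            self.entries[vpn] = self.page_table[vpn]
        return (self.entries[vpn] << PAGE_SHIFT) | offset


def load_physically_tagged(tlb, cache, vaddr):
    # Step 1 sits on the critical path: the look-up cannot start without it.
    paddr = tlb.translate(vaddr)
    # Step 2: the cache is indexed and tagged by physical address.
    return cache.get(paddr)


tlb = TLB({0x1: 0x80})
cache = {0x80234: 42}
value = load_physically_tagged(tlb, cache, 0x1234)
```

In this sketch the cache look-up cannot begin until `translate` returns, which is the extra latency the paragraph attributes to a physically tagged first-level cache.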
This performance degradation has been avoided by removing the translation from the critical path and using a virtually tagged first-level cache, which uses virtual page numbers instead of physical page numbers for the cache look-up. In these instances, the physical address has been needed only to support a snoop mechanism and cache miss handling. The virtual-to-physical address translation can then be done in parallel with the cache look-up and does not affect the load-to-consumer latency.
Virtual aliasing, however, occurs when two or more virtual addresses are mapped to the same physical address. Unless virtual aliasing is detected and handled, multiple copies of the same cache line can coexist in a virtually tagged cache. A write access to one of these copies makes the other copies stale and violates the correctness of the execution. Therefore, more than one modifiable copy of the same cache line must never coexist in the cache.
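The hazard can be reproduced with a small model. The sketch below is illustrative only: it assumes 4 KiB pages, caches individual addresses rather than full lines, omits write-back, and uses a hypothetical class name `VirtuallyTaggedCache`.

```python
PAGE_SHIFT = 12                      # assumption: 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class VirtuallyTaggedCache:
    """Toy cache tagged by virtual address, with no alias check."""

    def __init__(self, page_table, memory):
        self.page_table = page_table  # virtual page number -> physical page number
        self.memory = memory          # physical address -> value
        self.lines = {}               # virtual address -> cached value

    def _paddr(self, vaddr):
        vpn, offset = vaddr >> PAGE_SHIFT, vaddr & PAGE_MASK
        return (self.page_table[vpn] << PAGE_SHIFT) | offset

    def load(self, vaddr):
        if vaddr not in self.lines:
            # Miss: fill from memory without checking for an existing alias.
            self.lines[vaddr] = self.memory[self._paddr(vaddr)]
        return self.lines[vaddr]

    def store(self, vaddr, value):
        # Only the copy tagged with this virtual address is updated.
        self.lines[vaddr] = value


# Two virtual pages (0x1 and 0x2) map to the same physical page (0x80).
aliased = VirtuallyTaggedCache({0x1: 0x80, 0x2: 0x80}, {0x80010: 7})
a, b = 0x1010, 0x2010
aliased.load(a)
aliased.load(b)           # second copy of the same physical location
aliased.store(a, 99)
stale = aliased.load(b)   # reads the copy that was not updated
```

After the store through `a`, the load through `b` still sees the old value, because the model holds two independently tagged copies of one physical location.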
To prevent this situation from occurring, an operating system or other application has been configured to prevent the mapping of more than one virtual address to the same physical address. This solution, however, is computationally intensive and reduces overall performance.
A second approach has involved using a virtual tag array to determine whether an access hits or misses in the cache, and a physical tag array, outside the load-to-consumer loop, for cache miss and snoop handling. Processors that have applied this second approach have stalled the machine whenever a memory operation missed in the virtual tag array, that is, did not match any address in it. While the machine was stalled, the physical address was checked against the physical tag array. If there was a physical tag match, the virtually aliased entry could be either evicted from and re-filled into the cache or simply re-tagged with the new virtual page number. The machine then resumed its previous tasks once both the virtual and the physical tag entries had been updated. This second approach also adds unnecessary delay and negatively impacts overall system performance.
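The stall-and-re-tag behavior of this second approach can be sketched as follows. This is a simplified model under stated assumptions, not a description of any particular processor: it assumes 4 KiB pages, tracks individual addresses rather than lines, implements only the re-tag option, and counts stalls instead of modeling cycles. The name `DualTagCache` is hypothetical.

```python
PAGE_SHIFT = 12                      # assumption: 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class DualTagCache:
    """Toy cache with a virtual tag array and a physical tag array."""

    def __init__(self, page_table, memory):
        self.page_table = page_table  # virtual page number -> physical page number
        self.memory = memory          # physical address -> value
        self.data = {}                # physical address -> cached value (single copy)
        self.vtag = {}                # virtual tag array: virtual -> physical address
        self.ptag = {}                # physical tag array: physical -> tagging virtual address
        self.stalls = 0

    def _translate(self, vaddr):
        vpn, offset = vaddr >> PAGE_SHIFT, vaddr & PAGE_MASK
        return (self.page_table[vpn] << PAGE_SHIFT) | offset

    def load(self, vaddr):
        if vaddr in self.vtag:
            # Virtual tag hit: fast path, no physical address needed.
            return self.data[self.vtag[vaddr]]
        # Virtual tag miss: stall and consult the physical tag array.
        self.stalls += 1
        paddr = self._translate(vaddr)
        if paddr in self.ptag:
            # Physical hit: an alias is cached, so drop its old virtual tag.
            del self.vtag[self.ptag[paddr]]
        else:
            # True miss: fill from memory.
            self.data[paddr] = self.memory[paddr]
        self.vtag[vaddr] = paddr
        self.ptag[paddr] = vaddr
        return self.data[paddr]

    def store(self, vaddr, value):
        self.load(vaddr)              # make sure the entry is tagged with vaddr
        self.data[self.vtag[vaddr]] = value


dual = DualTagCache({0x1: 0x80, 0x2: 0x80}, {0x80010: 7})
x, y = 0x1010, 0x2010
dual.load(x)            # true miss: stall and fill
dual.load(y)            # alias: stall and re-tag
dual.store(x, 99)       # alias again: stall and re-tag back
seen = dual.load(y)     # alias again: stall, then read the single copy
```

Only one copy exists at a time, so the load through `y` observes the value written through `x`, but every switch between the two aliases incurs a stall, which is the delay the paragraph describes.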
There is a need to prevent virtual aliasing in caches while reducing the impact on overall system performance.