A processor can access memory by first generating a “virtual address” instead of the actual physical address of the target location in memory. To access the target location, the virtual address can be mapped to a physical address. The virtual address to physical address (“VA-to-PA”) mapping can be stored in a cache, and may be dynamically updated, for example, under control of an operating system. Virtual addressing can provide various benefits, for example, dynamic allocation of physical memory space, protection against unauthorized access of secure memory space, and prevention of multiple programs inadvertently using overlapping space in physical memory. Virtual addressing can also, for example, enable non-contiguous physical memory spaces to appear to a software program as a contiguous space.
Conventional virtual addressing techniques can segment physical memory into blocks or “pages,” which may be identified by “page numbers.” The VA-to-PA mapping provides a relationship between a virtual page number and a physical page number, and the mapping information is stored in entries of “page tables.” The page tables can be searched by a “page table walk” process, using a subset of bits (e.g., “VA tag bits”) of the virtual address. However, performing a page table walk for each memory access would incur processing overhead. A cache holding a selection of the page table entries (e.g., the most recently used entries) is one conventional technique for reducing page table walks. The cache, often called a “translation lookaside buffer” or “TLB,” can have N entries. Each TLB entry may include a “tag” that holds “tag bits,” which can be searched using the VA tag bits, and can include a physical address, e.g., the physical page number to which the VA tag bits map. When the CPU generates a virtual address, the TLB uses the VA tag bits to search the tag bits in the N entries. If there is a match, the TLB identifies a “hit” and outputs the physical page number from the matching entry. If there is no match, the TLB identifies a “miss,” and a memory management resource can perform a page table walk search of the page tables. As mentioned above, the page table walk can consume processing time and memory management resources. Accordingly, a low miss rate can be a TLB performance goal.
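The hit and miss behavior described above can be illustrated with a minimal software sketch. It is not a hardware design: the 4 KiB page size, the least-recently-used replacement policy, the dictionary standing in for the page tables, and all names (`TLB`, `page_table_walk`, `translate`) are assumptions made for illustration only. A hardware TLB would typically compare the VA tag bits against all N tags in parallel rather than through a software lookup.

```python
from collections import OrderedDict

PAGE_OFFSET_BITS = 12  # assumed 4 KiB pages; the text does not fix a page size


class TLB:
    """Toy fully associative TLB with N entries and LRU replacement (assumed policy)."""

    def __init__(self, num_entries, page_table):
        self.num_entries = num_entries
        self.page_table = page_table   # hypothetical dict: virtual page number -> physical page number
        self.entries = OrderedDict()   # tag (virtual page number) -> physical page number
        self.hits = 0
        self.misses = 0

    def page_table_walk(self, vpn):
        # Stand-in for the slow path; a real walk searches in-memory page tables.
        return self.page_table[vpn]

    def translate(self, virtual_address):
        vpn = virtual_address >> PAGE_OFFSET_BITS                 # VA tag bits
        offset = virtual_address & ((1 << PAGE_OFFSET_BITS) - 1)  # offset within the page
        if vpn in self.entries:                # tag match: hit
            self.hits += 1
            self.entries.move_to_end(vpn)      # mark entry as most recently used
            ppn = self.entries[vpn]
        else:                                  # no match: miss
            self.misses += 1
            ppn = self.page_table_walk(vpn)
            if len(self.entries) >= self.num_entries:
                self.entries.popitem(last=False)  # evict the least recently used entry
            self.entries[vpn] = ppn
        return (ppn << PAGE_OFFSET_BITS) | offset


# Hypothetical mapping of three virtual pages onto scattered physical pages.
tlb = TLB(num_entries=2, page_table={0: 7, 1: 3, 2: 12})
physical = [tlb.translate(va) for va in (0x0010, 0x1234, 0x0020, 0x2000, 0x1234)]
```

With only two entries, the five accesses above produce one hit and four misses. The final access to virtual page 1 misses because its entry was evicted to make room for virtual page 2, which illustrates why a low miss rate depends on how many entries can be held concurrently.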
Other performance goals for a TLB can include fast search speed, e.g., within a clock cycle, and fast TLB invalidation.
However, the different TLB performance goals can create conflicting design goals when applying conventional TLB design techniques. For example, increasing the TLB size, i.e., increasing the number of entries that can be concurrently stored, is one conventional TLB design technique to lower the miss rate. However, increased TLB size may lead to increased cost and reduced performance, due to a corresponding increase in TLB circuit area and search circuit complexity and a concomitant increase in propagation delays.
There are other conventional techniques that aim to lower the TLB miss rate. For example, one conventional technique uses a “set associative” TLB architecture, which can enable efficient use of chip area. However, a set associative TLB architecture can also exhibit a large number of TLB conflicts, and both search and TLB invalidation can consume multiple cycles.
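The conflict behavior of a set associative arrangement can be illustrated with a small sketch. The parameters here are assumptions, not taken from any particular design: four sets, two ways per set, set selection by the low-order bits of the virtual page number, and least-recently-used replacement within each set. The names `set_index` and `count_misses` are likewise hypothetical.

```python
NUM_SETS = 4   # assumed number of sets
WAYS = 2       # assumed associativity (entries per set)


def set_index(vpn):
    # Low-order bits of the virtual page number select the set (an assumed choice).
    return vpn % NUM_SETS


def count_misses(vpns):
    """Count misses for a sequence of virtual page numbers, with LRU within each set."""
    sets = [[] for _ in range(NUM_SETS)]  # each list is ordered least to most recently used
    misses = 0
    for vpn in vpns:
        ways = sets[set_index(vpn)]
        if vpn in ways:
            ways.remove(vpn)              # hit: recency is refreshed by the append below
        else:
            misses += 1
            if len(ways) == WAYS:
                ways.pop(0)               # set is full: evict its least recently used entry
        ways.append(vpn)
    return misses


conflicting = [0, 4, 8] * 3   # all three pages map to set 0
spread = [0, 1, 2] * 3        # pages map to sets 0, 1, and 2
```

Under these assumptions, `count_misses(conflicting)` returns 9 while `count_misses(spread)` returns 3. Both sequences touch only three distinct pages and the structure holds eight entries in total, yet the first sequence misses on every access because all three pages compete for the two ways of a single set.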