Translation Lookaside Buffers (TLBs), which translate the virtual addresses of a process to physical addresses in memory, are important to virtual memory performance. A TLB often lies on the critical path of a processor's pipeline and must therefore complete each lookup within a short window of time. Consequently, the number of page translations cached in a TLB is limited so that translation can complete within that window. The amount of memory reachable through the cached translations, called the TLB reach, is correspondingly limited.
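As a rough illustration of TLB reach, the following sketch multiplies the number of cached translations by the page size. The entry count (64) and the page sizes (4 KiB and 2 MiB) are assumed values chosen for the example, not figures taken from any particular processor.

```python
KIB = 1024
MIB = 1024 * KIB


def tlb_reach_bytes(num_entries: int, page_size_bytes: int) -> int:
    """Memory reachable without a TLB miss: one page per cached translation."""
    return num_entries * page_size_bytes


# Assumed example values: a 64-entry TLB with two different page sizes.
reach_small_pages = tlb_reach_bytes(64, 4 * KIB)  # 262144 bytes (256 KiB)
reach_large_pages = tlb_reach_bytes(64, 2 * MIB)  # 134217728 bytes (128 MiB)
```

Under these assumptions, a process with a working set of several gigabytes can cover only a small fraction of its memory from the TLB at any one time.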
But even as memory devices and per-process memory consumption have grown, the timing window and other constraints on the TLB have not relaxed; thus TLB reach has not kept pace with the growth of memory. The limited number of page translations cached by the TLB is a performance bottleneck for today's workloads, causing significant degradation of virtual memory performance.
One solution to increase TLB reach is to increase the number of pages cached in the TLB. But adding more entries can add correspondingly more time to each lookup, making it difficult to achieve short pipeline cycle times. Larger TLBs are also problematic because they can significantly increase power consumption, cost, and heat production.
Another solution is to increase the amount of memory addressed by each page, using what are often called superpages, large pages, or huge pages. But superpages foster fragmentation: internal fragmentation when only part of a page is put to use, and external fragmentation when the holes between superpages are too small to put to effective use.
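The internal fragmentation cost can be made concrete with a small calculation. The sketch below rounds an allocation up to whole pages and reports the unused remainder; the allocation size and the two page sizes are assumed values for illustration only.

```python
KIB = 1024
MIB = 1024 * KIB


def internal_fragmentation_bytes(alloc_bytes: int, page_size_bytes: int) -> int:
    """Bytes left unused when an allocation is rounded up to whole pages."""
    pages_needed = -(-alloc_bytes // page_size_bytes)  # ceiling division
    return pages_needed * page_size_bytes - alloc_bytes


# Assumed example: an allocation slightly larger than one 2 MiB superpage.
allocation = 2 * MIB + 4 * KIB
waste_with_4k_pages = internal_fragmentation_bytes(allocation, 4 * KIB)  # 0
waste_with_2m_pages = internal_fragmentation_bytes(allocation, 2 * MIB)  # 2093056
```

In this example, the allocation fits exactly into 513 pages of 4 KiB, whereas backing it with 2 MiB superpages requires two superpages and leaves nearly all of the second one unused.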
Another solution is to use a multiplexed TLB that includes a direct segment, which translates addresses in parallel with a traditional associative, cache-only TLB. Pages within the direct segment, which is defined by a contiguous virtual address range spanning from a base address to a limit address, can be translated directly onto a contiguous physical address range. This increases the TLB reach by an arbitrarily large amount corresponding to the size of the direct segment. But each direct segment belongs to a particular process, and the mapped memory is not accessible for important system uses such as paging.
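A minimal software sketch of direct-segment translation might look like the following. It is a model for illustration, not a description of any specific hardware: the names DirectSegment, offset, and translate are hypothetical, the limit is assumed to be exclusive, the page size is assumed to be 4 KiB, and the two lookups that hardware would perform in parallel are modeled here as a sequential check.

```python
from dataclasses import dataclass
from typing import Dict, Optional

PAGE_SIZE = 4096  # assumed 4 KiB base pages


@dataclass(frozen=True)
class DirectSegment:
    base: int    # first virtual address covered by the segment
    limit: int   # one past the last covered virtual address (assumed exclusive)
    offset: int  # assumed: physical start of the segment minus base

    def translate(self, vaddr: int) -> Optional[int]:
        """Translate with a range check and an addition, no per-page entry."""
        if self.base <= vaddr < self.limit:
            return vaddr + self.offset
        return None


def translate(vaddr: int, segment: DirectSegment, tlb: Dict[int, int]) -> Optional[int]:
    """Model of a multiplexed lookup: direct segment first, else the TLB.

    The tlb argument maps virtual page numbers to physical frame numbers.
    Returns None on a miss; a real system would then walk the page table.
    """
    phys = segment.translate(vaddr)
    if phys is not None:
        return phys
    vpn, page_offset = divmod(vaddr, PAGE_SIZE)
    frame = tlb.get(vpn)
    if frame is None:
        return None
    return frame * PAGE_SIZE + page_offset


# Hypothetical example values.
segment = DirectSegment(base=0x10000000, limit=0x20000000, offset=0x30000000)
tlb = {0x5: 0x9}
inside_segment = translate(0x10000ABC, segment, tlb)  # 0x40000ABC
via_tlb = translate(0x5123, segment, tlb)             # 0x9123
miss = translate(0x6000, segment, tlb)                # None
```

The sketch shows why the reach of the direct segment is independent of the number of TLB entries: any address between base and limit is translated by a comparison and an addition, however large the range is.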