1. Technical Field
The present invention relates in general to data processing and, in particular, to address translation in a data processing system employing memory virtualization.
2. Description of the Related Art
A computer system typically includes one or more processors coupled to a hierarchical data storage system. The computer system's hierarchy of data storage devices often comprises processor registers, cache memory, and system memory (e.g., SRAM or DRAM), as well as additional data storage devices such as hard disks, optical media, and/or magnetic tapes.
Regardless of the computer system architecture that is employed, it is typical that each processor accesses data residing in memory-mapped storage locations (whether in physical system memory, cache memory or another system resource) by utilizing real (or physical) addresses to identify the storage locations of interest. An important characteristic of real (or physical) addresses is that there is a unique real address for each memory-mapped physical storage location.
Because the one-to-one correspondence between memory-mapped physical storage locations and real addresses necessarily limits the number of storage locations that can be referenced to 2^N, where N is the number of bits in the real address, the processors of most commercial computer systems employ memory virtualization to enlarge the number of addressable locations. In fact, the size of the virtual memory address space can be orders of magnitude greater than the size of the real address space. Thus, in a conventional system, processors internally reference memory locations by effective addresses and then perform effective-to-real address translations (often via one or more virtual address spaces) to access the physical memory locations identified by the real addresses.
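The 2^N limit can be illustrated with a short calculation. The following is a minimal sketch only; the function name and the 40-bit and 64-bit address widths are assumptions chosen for illustration and are not taken from any particular system.

```python
def addressable_locations(address_bits: int) -> int:
    """Return how many distinct locations an address of this width can name (2^N)."""
    if address_bits < 0:
        raise ValueError("address width must be non-negative")
    return 1 << address_bits


# Hypothetical widths, chosen only to illustrate the size difference.
REAL_ADDRESS_BITS = 40
VIRTUAL_ADDRESS_BITS = 64

real_space = addressable_locations(REAL_ADDRESS_BITS)
virtual_space = addressable_locations(VIRTUAL_ADDRESS_BITS)

# How many times larger the virtual address space is than the real one.
ratio = virtual_space // real_space
print(f"virtual/real = 2^{VIRTUAL_ADDRESS_BITS - REAL_ADDRESS_BITS} = {ratio}")
```

Under these assumed widths, the virtual address space is 2^24 times the size of the real address space, which is consistent with the "orders of magnitude" difference described above.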
In a virtual memory system, a page frame (and/or block) table is commonly maintained at least partially in system memory in order to track the mapping between the logical address space(s) and the physical address space. A typical entry in a page or block table includes a valid bit, which indicates whether the page/block is currently resident in system memory; a dirty bit, which indicates whether a program has modified the page/block; protection bits, which control access to the page/block; and, if the page/block is resident in system memory, a real page/block number (i.e., the physical address) for the page/block of virtual memory.
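The fields of such an entry can be sketched as a simple record. This is an illustrative model only: the 4 KB page size, the two-bit protection encoding, and all names below are assumptions, and real page table formats are architecture-specific.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SHIFT = 12      # assumed 4 KB pages
PROT_READ = 0b01     # assumed protection encoding
PROT_WRITE = 0b10


@dataclass
class PageTableEntry:
    valid: bool                       # page is resident in system memory
    dirty: bool                       # page has been modified by a program
    protection: int                   # access-control bits
    real_page_number: Optional[int]   # meaningful only when valid is True


def real_address(entry: PageTableEntry, virtual_address: int,
                 write: bool = False) -> Optional[int]:
    """Form a real address from an entry, or return None if the page is not resident."""
    if not entry.valid or entry.real_page_number is None:
        return None  # a real system would raise a page fault here
    needed = PROT_WRITE if write else PROT_READ
    if not entry.protection & needed:
        raise PermissionError("access not permitted by protection bits")
    if write:
        entry.dirty = True  # record that the page has been modified
    offset = virtual_address & ((1 << PAGE_SHIFT) - 1)
    return (entry.real_page_number << PAGE_SHIFT) | offset
```

In this sketch the page offset passes through unchanged and only the page number is replaced, which is why the entry needs to store only a real page number rather than a full address.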
To minimize the latency of address translation, processors typically contain a number of address translation data structures that cache address translations for recently accessed memory pages. For example, a computer system employing two-level translation from effective addresses to virtual addresses to real addresses may include data and instruction effective-to-real address translation (ERAT) tables that buffer only the most recent translations to facilitate direct effective-to-real address translation, a software-managed segment lookaside buffer (SLB) that buffers recently used effective-to-virtual address translations, and a hardware-managed translation lookaside buffer (TLB) that buffers recently used virtual-to-real address translations. In addition, some virtual memory systems provide an additional address translation buffer called a block address translation (BAT) buffer, which serves as a TLB for variable sized memory blocks.
In operation, when a processor generates the effective address of a memory access, the processor performs an ERAT lookup. If the effective address hits in the ERAT, the real address can be obtained relatively quickly. However, if the effective address misses in the ERAT, the SLB and TLB or BAT are accessed to perform a full effective-to-virtual-to-real address translation. If a miss occurs at this second level of address translation, the translation hardware invokes a page table walk engine to access the required translation entry from cache or system memory. Once the real address is obtained, the memory access is performed in cache memory or system memory.
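The lookup sequence described above can be sketched in software as follows. This is a simplified model under stated assumptions, not a description of any actual hardware: the translation structures are modeled as dictionaries, the BAT path is omitted, and the 4 KB page size, 256 MB segment size, and all names are hypothetical.

```python
PAGE_SHIFT = 12       # assumed 4 KB pages
SEGMENT_SHIFT = 28    # assumed 256 MB segments
PAGES_PER_SEGMENT_SHIFT = SEGMENT_SHIFT - PAGE_SHIFT


class Translator:
    def __init__(self, slb: dict, page_table: dict):
        self.erat = {}                # effective page number -> real page number
        self.slb = slb                # effective segment id  -> virtual segment id
        self.tlb = {}                 # virtual page number   -> real page number
        self.page_table = page_table  # backing table consulted on a TLB miss

    def translate(self, effective_address: int):
        """Return (real_address, source), where source names the structure that hit."""
        offset = effective_address & ((1 << PAGE_SHIFT) - 1)
        epn = effective_address >> PAGE_SHIFT

        # First level: direct effective-to-real lookup.
        if epn in self.erat:
            return (self.erat[epn] << PAGE_SHIFT) | offset, "erat"

        # Second level: effective-to-virtual via the SLB.
        esid = effective_address >> SEGMENT_SHIFT
        if esid not in self.slb:
            raise LookupError("SLB miss")  # handled by software in this model
        page_in_segment = epn & ((1 << PAGES_PER_SEGMENT_SHIFT) - 1)
        vpn = (self.slb[esid] << PAGES_PER_SEGMENT_SHIFT) | page_in_segment

        # Second level: virtual-to-real via the TLB, with a table walk on a miss.
        source = "tlb"
        if vpn not in self.tlb:
            if vpn not in self.page_table:
                raise LookupError("page fault")
            self.tlb[vpn] = self.page_table[vpn]
            source = "walk"

        rpn = self.tlb[vpn]
        self.erat[epn] = rpn  # install the translation for later direct hits
        return (rpn << PAGE_SHIFT) | offset, source
```

In this model, a first access to a page is resolved by the table walk, and a repeated access to the same page is resolved directly from the ERAT, which reflects why the first-level lookup is the fast path.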
As real memory capacities, program footprints, and user working sets continue to grow, it is beneficial to increase the coverage of translation information buffered in a processor. Common approaches to increasing the translation coverage include increasing the number of ERAT, SLB and TLB entries and supporting larger memory pages. For example, in addition to conventional 4 kilobyte (4 KB) and 16 KB pages, many systems now additionally support page sizes of 1 megabyte (MB), 16 MB, and 16 gigabytes (GB). However, increasing the number of ERAT, SLB, and TLB entries becomes expensive in terms of chip area, power dissipation, and the latency to perform a search for a matching translation entry in a large translation data structure. In addition, use of multiple memory page sizes and providing support for larger page sizes injects additional complexity into processor designs and can cause increased memory fragmentation.
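The effect of page size on translation coverage can be shown with a brief calculation. The sketch below assumes that coverage is simply the number of buffered entries multiplied by the page size, and it uses a hypothetical 1024-entry translation structure; neither the function name nor the entry count comes from a specific design.

```python
KB = 1 << 10
MB = 1 << 20
GB = 1 << 30


def translation_coverage(num_entries: int, page_size_bytes: int) -> int:
    """Bytes of memory covered when every entry maps one page of the given size."""
    return num_entries * page_size_bytes


ENTRIES = 1024  # hypothetical entry count, for illustration only

coverage_4k = translation_coverage(ENTRIES, 4 * KB)
coverage_16m = translation_coverage(ENTRIES, 16 * MB)

print(f"4 KB pages:  {coverage_4k // MB} MB covered")
print(f"16 MB pages: {coverage_16m // GB} GB covered")
```

With the entry count held fixed, moving from 4 KB to 16 MB pages raises the covered memory from 4 MB to 16 GB in this example, which illustrates why larger pages are attractive despite the complexity and fragmentation costs noted above.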