The invention is generally related to computers and computer software, and in particular, to memory address translation.
Memory management, i.e., the operations that occur in managing the data stored in a computer, is often a key factor in overall system performance for a computer. Among other tasks, memory management oversees the retrieval and storage of data on a computer, as well as manages certain security tasks for a computer by imposing restrictions on what users and computer programs are permitted to access.
Modern computers typically rely on a memory management technique known as virtual memory management to increase performance and provide greater flexibility in computers and the underlying architectural designs upon which they are premised.
With a virtual memory system, the underlying hardware implementing the memory system of a computer is effectively hidden from the software of the computer. A relatively large virtual memory space, e.g., 64-bits or more in width, is defined for such a computer, with computer programs that execute on the computer accessing the memory system using virtual addresses pointing to locations in the virtual memory space. The physical memory devices in the computer, however, are accessed via “real” addresses that map directly into specific memory locations in the physical memory devices. Hardware and/or software in the computer are provided to perform “address translation” to map the real memory addresses of the physical memory to virtual addresses in the virtual memory space. As such, whenever a computer program on a computer attempts to access memory using a virtual address, the computer automatically translates the virtual address into a corresponding real address so that the access can be made to the appropriate location in the appropriate physical device mapped to the virtual address.
One feature of virtual addressing it that is not necessary for a computer to include storage for the entire virtual memory space in the physical memory devices in the computer's main memory. Instead, lower levels of storage, such as disk drives and other mass storage devices, may be used as supplemental storage, with memory addresses grouped into “pages” that are swapped between the main memory and supplemental storage as needed.
In addition, some computer designs also include the concept of segmentation, which partitions the virtual memory into different segments (each mapped to blocks of pages) in order to facilitate memory protection, simplify the handling of large and growing data structures, and otherwise provide greater flexibility for performing memory management when multiple processes are capable of being handled in a computer at any given time. When segmentation is used, an additional layer of indirection is used, requiring an additional translation to be performed. Typically, in systems incorporating segmentation and paging, computer programs access the memory system using “effective” addresses that map to virtual addresses, thus requiring a translation first from effective to virtual address, then from virtual to real address.
Due to the frequency of access requests in a computer, address translation can have a significant impact on overall system performance. As such, it is desirable to minimize the processing overhead associated with the critical timing path within which address translation is performed.
Address translation in a virtual memory system typically incorporates accessing various address translation data structures. One such structure, referred to as a page table, includes multiple entries that map virtual addresses to real addresses on a page-by-page basis. Likewise, for handling segmentation, a segment table is often provided, including entries that map effective addresses to virtual addresses on a segment-by-segment basis.
Often, due to the large number of memory accesses that constantly occur in a computer, the number of entries required to map all of the memory address space in use by a computer can be significant, and require the entries to be stored in main storage, rather than in dedicated memory, which makes accessing such entries prohibitively slow. To accelerate address translation with such a scheme, high speed memories referred to as translation lookaside buffers (TLB's) and segment lookaside buffers (SLB's) are typically used to cache recently-used entries for quick access by the computer. If a required entry is not stored in a TLB or SLB, a performance penalty is incurred in loading the entry from main storage; however, typically the hit rate on TLB's and SLB's is exceptionally high, and the penalty associated with loading entries from main storage is more than offset by the performance gains when entries are immediately accessible from the TLB and SLB.
In still other designs, an additional level of caching may be used to further accelerate performance, by providing an effective to real address translation (ERAT) table that includes entries providing direct mappings between effective and real addresses. Thus, an ERAT table effectively includes information from both the SLB and the TLB to eliminate the need to perform two levels of translation. In some designs, separate data and instruction ERAT tables are respectively provided in close proximity to the instruction and data processing logic in a processor to minimize the effects of address translation on the critical performance paths in the processor.
Originally, paging was based on fixed page sizes, e.g., 4K or 4096 addressable locations per page. With the use of segmentation, however, different page sizes may be supported in different segments. Smaller page sizes are often optimal for efficient use of a memory system, particularly when many processes are running concurrently in a computer. However, as the memory requirements of computers and the programs running thereon continue to increase, the number of pages of memory required by any given process or program continues to increase, and as a result, larger page sizes may be more efficient for many situations.
In many designs, large pages, i.e., pages larger than 4K pages, must be allocated at boot time. To avoid this need for pre-allocation, some operating systems implement “transparent” huge pages, which attempts to collect sequential pages and make a translation into a large page. Generally, however, an operating system is unaware of the possibility of allocating large pages, and defaults to making 4K page allocations.