Memory management, i.e., the operations that occur in managing the data stored in a computer, is often a key factor in overall system performance. Among other tasks, memory management oversees the retrieval and storage of data on a computer, and manages certain security tasks by imposing restrictions on what users and computer programs are permitted to access.
Modern computers typically rely on a memory management technique known as virtual memory management to increase performance and provide greater flexibility in computers and the underlying architectural designs upon which they are premised.
With a virtual memory system, the underlying hardware implementing the memory system of a computer is effectively hidden from the software of the computer. A relatively large virtual memory space, e.g., 64 bits or more in width, is defined for such a computer, with computer programs that execute on the computer accessing the memory system using virtual addresses pointing to locations in the virtual memory space. The physical memory devices in the computer, however, are accessed via “real” addresses that map directly into specific memory locations in the physical memory devices. Hardware and/or software in the computer are provided to perform “address translation” to map the real memory addresses of the physical memory to virtual addresses in the virtual memory space. As such, whenever a computer program on a computer attempts to access memory using a virtual address, the computer automatically translates the virtual address into a corresponding real address so that the access can be made to the appropriate location in the appropriate physical device mapped to the virtual address.
One feature of virtual addressing is that it is not necessary for a computer to include storage for the entire virtual memory space in the physical memory devices in the computer's main memory. Instead, lower levels of storage, such as disk drives and other mass storage devices, may be used as supplemental storage, with memory addresses grouped into “pages” that are swapped between the main memory and supplemental storage as needed.
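By way of illustration only, page-granular translation of a virtual address to a real address may be sketched as follows. The sketch assumes 4K pages and a toy page table held in a dictionary; the names `page_table`, `translate` and `PageFault` are hypothetical and are not taken from any particular design. A page marked `None` stands for a page currently held in supplemental storage.

```python
PAGE_SIZE = 4096   # assumed 4K pages
OFFSET_BITS = 12   # log2(PAGE_SIZE)


class PageFault(Exception):
    """Raised when the page is not resident in main memory (hypothetical)."""


# Toy page table: virtual page number -> real page number.
# None marks a page that has been swapped out to supplemental storage.
page_table = {0x10: 0x2A, 0x11: None}


def translate(virtual_address: int) -> int:
    """Translate a virtual address into a real address, page by page."""
    vpn = virtual_address >> OFFSET_BITS          # which page
    offset = virtual_address & (PAGE_SIZE - 1)    # location within the page
    rpn = page_table.get(vpn)
    if rpn is None:
        # Unmapped or swapped out: the page must be brought into main memory.
        raise PageFault(f"virtual page {vpn:#x} is not resident")
    return (rpn << OFFSET_BITS) | offset


print(hex(translate(0x10ABC)))  # page 0x10 maps to real page 0x2A
```

An access to virtual page 0x11 in this sketch raises `PageFault`, which stands in for the point at which a page would be swapped in from supplemental storage.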
In addition, some computer designs also include the concept of segmentation, which partitions the virtual memory into different segments (each mapped to blocks of pages) in order to facilitate memory protection, simplify the handling of large and growing data structures, and otherwise provide greater flexibility for performing memory management when multiple processes are capable of being handled in a computer at any given time. When segmentation is used, an additional layer of indirection is used, requiring an additional translation to be performed. Typically, in systems incorporating segmentation and paging, computer programs access the memory system using “effective” addresses that map to virtual addresses, thus requiring a translation first from effective to virtual address, then from virtual to real address.
Due to the frequency of access requests in a computer, address translation can have a significant impact on overall system performance. As such, it is desirable to minimize the processing overhead associated with the critical timing path within which address translation is performed.
Address translation in a virtual memory system typically incorporates accessing various address translation data structures. One such structure, referred to as a page table, includes multiple entries that map virtual addresses to real addresses on a page-by-page basis. Likewise, for handling segmentation, a segment table is often provided, including entries that map effective addresses to virtual addresses on a segment-by-segment basis.
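The two levels of translation described above may be sketched, purely as an example, as a segment table lookup followed by a page table lookup. The sketch assumes 256 MB segments and 4K pages, and the table contents and the function names `effective_to_virtual` and `virtual_to_real` are hypothetical.

```python
SEGMENT_SHIFT = 28  # assumed 256 MB segments
PAGE_SHIFT = 12     # assumed 4K pages

# Hypothetical segment table: effective segment ID -> virtual segment ID.
segment_table = {0x1: 0xABCDE}
# Hypothetical page table: virtual page number -> real page number.
page_table = {0xABCDE0003: 0x55}


def effective_to_virtual(ea: int) -> int:
    """First translation: replace the segment ID, keep the rest of the address."""
    esid = ea >> SEGMENT_SHIFT
    within_segment = ea & ((1 << SEGMENT_SHIFT) - 1)
    return (segment_table[esid] << SEGMENT_SHIFT) | within_segment


def virtual_to_real(va: int) -> int:
    """Second translation: replace the page number, keep the page offset."""
    vpn = va >> PAGE_SHIFT
    offset = va & ((1 << PAGE_SHIFT) - 1)
    return (page_table[vpn] << PAGE_SHIFT) | offset


def translate(ea: int) -> int:
    """Effective -> virtual -> real, one lookup per table."""
    return virtual_to_real(effective_to_virtual(ea))


print(hex(translate(0x10003123)))
```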
Often, due to the large number of memory accesses that constantly occur in a computer, the number of entries required to map all of the memory address space in use by a computer can be significant, and can require the entries to be stored in main storage, rather than in dedicated memory, which makes accessing such entries prohibitively slow. To accelerate address translation with such a scheme, high speed memories referred to as translation lookaside buffers (TLB's) and segment lookaside buffers (SLB's) are typically used to cache recently-used entries for quick access by the computer. If a required entry is not stored in a TLB or SLB, a performance penalty is incurred in loading the entry from main storage; however, typically the hit rate on TLB's and SLB's is exceptionally high, and the penalty associated with loading entries from main storage is more than offset by the performance gains when entries are immediately accessible from the TLB and SLB.
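The caching behavior of a TLB may be sketched as a small cache placed in front of a page table. The sketch below is a software model only; the least-recently-used replacement policy, the capacity of four entries and the class name `TLB` are assumptions made for illustration, and an actual hardware TLB may be organized quite differently.

```python
from collections import OrderedDict


class TLB:
    """Toy model of a TLB: a small LRU cache of page table entries."""

    def __init__(self, page_table, capacity=4):
        self.page_table = page_table    # the slow path, held in main storage
        self.capacity = capacity        # assumed number of TLB entries
        self.entries = OrderedDict()    # virtual page number -> real page number
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn):
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)        # mark as most recently used
            return self.entries[vpn]
        self.misses += 1
        rpn = self.page_table[vpn]               # penalty: load from main storage
        self.entries[vpn] = rpn
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)     # evict the least recently used
        return rpn


page_table = {vpn: vpn + 0x100 for vpn in range(64)}
tlb = TLB(page_table)
for vpn in [1, 2, 1, 1, 2, 3, 1]:
    tlb.lookup(vpn)
print(tlb.hits, tlb.misses)
```

In this access pattern only the first reference to each page misses, which reflects the locality that gives lookaside buffers their typically high hit rates.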
In still other designs, an additional level of caching may be used to further accelerate performance, by providing an effective to real address translation (ERAT) table that includes entries providing direct mappings between effective and real addresses. Thus, an ERAT table effectively includes information from both the SLB and the TLB to eliminate the need to perform two levels of translation. In some designs, separate instruction and data ERAT tables are respectively provided in close proximity to the instruction and data processing logic in a processor to minimize the effects of address translation on the critical performance paths in the processor.
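As a rough sketch of the idea, an ERAT may be modeled as a cache keyed directly by effective page number, filled on a miss by performing the two-level translation once. The sketch assumes 256 MB segments and 4K pages; the tables and the names `erat`, `two_level_lookup` and `erat_translate` are hypothetical, and capacity limits and replacement are omitted for brevity.

```python
PAGE_SHIFT = 12                                 # assumed 4K pages
SEGMENT_SHIFT = 28                              # assumed 256 MB segments
PAGES_PER_SEGMENT_SHIFT = SEGMENT_SHIFT - PAGE_SHIFT

segment_table = {0x1: 0xABCDE}     # effective segment ID -> virtual segment ID
page_table = {0xABCDE0003: 0x55}   # virtual page number -> real page number
erat = {}                          # effective page number -> real page number


def two_level_lookup(epn: int) -> int:
    """Slow path: effective page -> virtual page -> real page."""
    esid = epn >> PAGES_PER_SEGMENT_SHIFT
    page_in_segment = epn & ((1 << PAGES_PER_SEGMENT_SHIFT) - 1)
    vpn = (segment_table[esid] << PAGES_PER_SEGMENT_SHIFT) | page_in_segment
    return page_table[vpn]


def erat_translate(ea: int) -> int:
    """Fast path: a single lookup when the effective page is cached."""
    epn = ea >> PAGE_SHIFT
    offset = ea & ((1 << PAGE_SHIFT) - 1)
    if epn not in erat:
        erat[epn] = two_level_lookup(epn)   # fill the ERAT on a miss
    return (erat[epn] << PAGE_SHIFT) | offset


print(hex(erat_translate(0x10003123)))
```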
Originally, paging was based on fixed page sizes, e.g., 4K or 4096 addressable locations per page. With the use of segmentation, however, different page sizes may be supported in different segments. Smaller page sizes are often optimal for efficient use of a memory system, particularly when many processes are running concurrently in a computer. However, as the memory requirements of computers and the programs running thereon continue to increase, the number of pages of memory required by any given process or program continues to increase, and as a result, larger page sizes may be more efficient for many situations.
Some conventional address translation schemes have handled larger page sizes by allocating multiple entries in the TLB and page table for each large page, e.g., for a 16K page in a system that supports a minimum page size of 4K, four (16K/4K) entries may be used. However, for larger pages, the number of entries required to represent such pages can effectively reduce the capacity of TLB's and ERAT's, and thus lead to higher miss rates and lower performance.
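The arithmetic behind this loss of capacity may be illustrated as follows. The function names and the 64-entry TLB used in the example are assumptions chosen only to make the numbers concrete.

```python
def entries_per_page(page_size: int, min_page_size: int = 4096) -> int:
    """Entries consumed by one page when each entry covers the minimum page size."""
    if page_size % min_page_size != 0:
        raise ValueError("page size must be a multiple of the minimum page size")
    return page_size // min_page_size


def distinct_pages_cached(num_entries: int, page_size: int,
                          min_page_size: int = 4096) -> int:
    """Number of distinct pages of the given size that fit in the structure."""
    return num_entries // entries_per_page(page_size, min_page_size)


print(entries_per_page(16 * 1024))            # entries for one 16K page
print(distinct_pages_cached(64, 16 * 1024))   # 16K pages held by 64 entries
print(distinct_pages_cached(64, 64 * 1024))   # 64K pages held by 64 entries
```

Under these assumptions, a 64-entry structure that could hold 64 distinct 4K pages holds only 16 distinct 16K pages, and only 4 distinct 64K pages.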
Other designs allocate a single page table entry to each page regardless of size, and typically provide in the entry, or in the segment information for the segment within which the associated page is resident, an indication of the page size for that entry.
Multiple page sizes complicate address translation predominantly due to the different allocation of bits in effective addresses directed to different page sizes. In particular, addresses are often partitioned for the purposes of address translation into offset bits and index bits, with the offset bits pointing to a specific address in a page. For a 4K page, 12 offset bits are required to address every location in a page, while for a 16K page, 14 offset bits are required. The index bits, which are typically the bits of higher order than the offset bits, are then used to identify the page, and thus, the index bits are used to access address translation data structures such as the ERAT's. When multiple page sizes are supported, however, the size of a page must be known before the appropriate translation data structure can be accessed, so the proper bits can be used as the index into the structure.
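The dependence of the index and offset bits on page size may be sketched as follows, assuming page sizes that are powers of two. The function name `split_address` is hypothetical.

```python
def split_address(address: int, page_size: int) -> tuple[int, int]:
    """Split an address into (index bits, offset bits) for a given page size."""
    if page_size <= 0 or page_size & (page_size - 1):
        raise ValueError("page size must be a power of two")
    offset_bits = page_size.bit_length() - 1   # 12 for 4K, 14 for 16K
    index = address >> offset_bits             # identifies the page
    offset = address & (page_size - 1)         # location within the page
    return index, offset


address = 0x12345
print([hex(v) for v in split_address(address, 4 * 1024)])    # 12 offset bits
print([hex(v) for v in split_address(address, 16 * 1024)])   # 14 offset bits
```

The same address yields a different index for each page size, which is why the page size must be known before the index can be formed.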
As a result, conventional designs have often required that a lookup be performed to determine the page size for a given address prior to accessing a translation data structure such as an ERAT, typically by accessing the SLB. By doing so, however, an additional step is added to the critical path for address translation, and thus, the lookup has an adverse impact on performance.
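The serial dependency described above may be sketched as two steps, where the second step cannot begin until the first completes. The sketch assumes a set-indexed ERAT with 16 sets, 256 MB segments, and a per-segment page size held with the SLB information; the names `slb_page_size` and `erat_set_index` and all table contents are hypothetical.

```python
SEGMENT_SHIFT = 28   # assumed 256 MB segments
NUM_SETS = 16        # assumed number of sets in a set-indexed ERAT

# Hypothetical per-segment page size, as might be held with SLB entries:
# effective segment ID -> page size in bytes.
slb_page_size = {0x0: 4 * 1024, 0x1: 16 * 1024}


def erat_set_index(ea: int) -> int:
    """Select the ERAT set for an effective address."""
    # Step 1 (adds to the critical path): look up the page size for the segment.
    page_size = slb_page_size[ea >> SEGMENT_SHIFT]
    # Step 2: only now is it known which bits form the index.
    offset_bits = page_size.bit_length() - 1
    return (ea >> offset_bits) % NUM_SETS


print(erat_set_index(0x00005000))   # 4K segment: index taken above bit 12
print(erat_set_index(0x10005000))   # 16K segment: index taken above bit 14
```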
Other designs have attempted to address the complications that arise from multiple page sizes, e.g., by using fully associative translation data structures, using separate translation data structures per page size, or using skewed-associative translation data structures. Fully associative translation data structures, however, are known to be costly in terms of size, speed and power. Separate translation data structures for each page size would also increase the critical path, and raise a concern as to efficiency in applications where one page size predominates. Likewise, skewed-associative translation data structures would also likely raise a concern as to efficiency when one page size predominates.
Therefore, a significant need continues to exist for a manner of efficiently and cost effectively supporting multiple page sizes in a virtual memory system with minimal impact on performance.