1. Field of Invention
The present invention relates in general to the digital data processing field and, in particular, to loading entries into a translation lookaside buffer (TLB) in hardware via indirect TLB entries, which are loaded into the TLB either by software on demand or by a hardware mechanism that utilizes a hash table in memory.
2. Background Art
In the latter half of the twentieth century, there began a phenomenon known as the information revolution. While the information revolution is a historical development broader in scope than any one event or machine, no single device has come to represent the information revolution more than the digital electronic computer. The development of computer systems has surely been a revolution. Each year, computer systems grow faster, store more data, and provide more applications to their users.
A modern computer system typically comprises at least one central processing unit (CPU) and supporting hardware, such as communications buses and memory, necessary to store, retrieve and transfer information. It also includes hardware necessary to communicate with the outside world, such as input/output controllers or storage controllers, and devices attached thereto such as keyboards, monitors, tape drives, disk drives, communication lines coupled to a network, etc. The CPU or CPUs are the heart of the system. They execute the instructions which comprise a computer program and direct the operation of the other system components.
The overall speed of a computer system is typically improved by increasing parallelism, and specifically, by employing multiple CPUs (also referred to as processors). The modest cost of individual processors packaged on integrated circuit chips has made multiprocessor systems practical, although such multiple processors add more layers of complexity to a system.
Computer systems typically utilize virtual addressing mechanisms that allow the programs of a computer system to behave as if they have access to a large, single storage entity instead of access to multiple, smaller storage entities such as a main memory and a direct access storage device (DASD). Virtual addressing mechanisms are typically implemented by providing memory management units (MMUs) that translate virtual memory addresses to physical memory addresses (also referred to herein as “real addresses”).
A particular physical address may be in main memory or in long-term memory, such as a DASD. If the physical address of the information sought (e.g., data or instructions) is in main memory, the information is accessed and utilized by the computer system. If the physical address of the information sought is in long-term memory, the information is transferred from the long-term memory (usually in a block referred to as a “page”) to main memory where it may be used. This transfer is accomplished under control of the MMU.
In computer systems that utilize virtual addressing mechanisms, the speed at which memory may be accessed depends to a significant extent upon the process required to translate addresses from virtual to physical and then retrieve the information from memory. A basic virtual addressing mechanism creates lookup tables which are stored in main memory. Any virtual address presented to the MMU is compared to the values stored in these tables to determine the physical address to access. There are often several levels of tables, and the comparison (typically involving “walking down” a page table directory structure) takes a great deal of system clock time.
Typically, translation lookaside buffers (TLBs) are utilized to enhance the basic virtual addressing mechanism described above. A TLB is essentially a cache of page table entries mapping virtual addresses to physical addresses. With each memory access, the TLB is presented with a virtual address. If the address hits in the TLB, virtual address translation adds little or no overhead to the memory access. If the address misses in the TLB, a more costly hardware handler or software handler is invoked to load and insert the required page table entry into the TLB so the address will hit in the TLB and the memory access can proceed.
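By way of illustration only, the hit and miss behavior described above may be sketched as follows. The sketch is a simplified software model and not a description of any particular hardware. The 4 KiB page size, the fully associative organization, the first-in first-out replacement policy, and the names TLB, translate and miss_handler are all assumptions introduced for this example.

```python
PAGE_SHIFT = 12                      # assumed 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class TLB:
    """Toy fully associative TLB with FIFO replacement (assumed policy)."""

    def __init__(self, capacity, miss_handler):
        self.capacity = capacity
        self.miss_handler = miss_handler  # hypothetical: maps a VPN to a PFN
        self.entries = {}                 # VPN -> PFN, kept in insertion order
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        vpn = vaddr >> PAGE_SHIFT         # virtual page number
        offset = vaddr & PAGE_MASK        # byte offset within the page
        if vpn in self.entries:
            self.hits += 1                # hit: no page table access needed
        else:
            self.misses += 1              # miss: invoke the costly handler
            pfn = self.miss_handler(vpn)
            if len(self.entries) >= self.capacity:
                oldest = next(iter(self.entries))
                del self.entries[oldest]  # evict the oldest cached entry
            self.entries[vpn] = pfn
        return (self.entries[vpn] << PAGE_SHIFT) | offset


# A dictionary stands in for the page tables held in main memory.
page_table = {0x10: 0x80, 0x11: 0x2A}
tlb = TLB(capacity=2, miss_handler=page_table.__getitem__)
first = tlb.translate(0x10123)   # misses, then loads the entry for page 0x10
second = tlb.translate(0x10456)  # hits on the entry loaded by the first access
```

In this model the miss handler is an ordinary function; in a real system it would correspond to the hardware handler or software handler mentioned above.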
Embedded processors with software loaded TLBs can have poor performance on some workloads. This poor performance is caused by the overhead of resolving in software the virtual address translations that are not cached in the TLB. This is generally why higher end processors provide a hardware mechanism to load translations into the TLB automatically. Such hardware mechanisms, however, tend to be complex and expensive. There are several conventional approaches to hardware loading of virtual address translations. These conventional approaches include: tree structured page tables; hashed page tables; virtual linear page tables; page table pointer caches; and TLBs with both page table pointers and page table entries. Each of these approaches is discussed briefly below.
The tree structured page tables approach uses a tree structure in memory. The root of the tree is identified by a physical address in memory, and bits from the virtual address are used as an index at each level of the tree. One of the drawbacks of this approach is that in order to map a large address space, several levels of tree are necessary (typically at least four levels for a 64-bit processor). Another drawback of this approach is that unless caching is employed, resolving a translation requires one load from memory for each level of the tree. This approach can perform poorly and require complex hardware. Also, the memory required for page tables can be excessive in some situations.
Tree structured page tables can provide good performance on workloads where the access pattern has a high degree of locality, provided that caching, prefetching, or both are implemented. However, tree structured page tables are generally not suitable for an embedded processor because they require a lot of caching in order to perform well, and the logic required to traverse the tree structure is relatively complex.
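By way of illustration only, the tree structured page tables approach described above may be modeled as follows. The sketch assumes four levels, nine index bits per level and 4 KiB pages, and it uses nested dictionaries in place of tables in physical memory. The names level_index, map_page and walk are hypothetical. The walk counts one table load per level, which corresponds to the uncached cost noted above.

```python
LEVELS = 4           # assumed tree depth
BITS_PER_LEVEL = 9   # assumed index width at each level
PAGE_SHIFT = 12      # assumed 4 KiB pages
INDEX_MASK = (1 << BITS_PER_LEVEL) - 1
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def level_index(vaddr, level):
    """Bits of the virtual address that index the table at `level` (0 = root)."""
    shift = PAGE_SHIFT + BITS_PER_LEVEL * (LEVELS - 1 - level)
    return (vaddr >> shift) & INDEX_MASK


def map_page(root, vaddr, pfn):
    """Install a translation, creating intermediate tables as needed."""
    node = root
    for level in range(LEVELS - 1):
        node = node.setdefault(level_index(vaddr, level), {})
    node[level_index(vaddr, LEVELS - 1)] = pfn


def walk(root, vaddr):
    """Return (physical address or None, number of table loads performed)."""
    node = root
    loads = 0
    for level in range(LEVELS):
        loads += 1                                   # one load per level
        node = node.get(level_index(vaddr, level))
        if node is None:
            return None, loads                       # no translation present
    return (node << PAGE_SHIFT) | (vaddr & PAGE_MASK), loads


root = {}
map_page(root, 0x7F1234567ABC, 0x42)
paddr, loads = walk(root, 0x7F1234567ABC)  # performs LEVELS loads
```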
Another conventional approach to hardware loading of virtual address translations into TLBs utilizes hashed page tables. This approach has several drawbacks. For example, this approach requires a lot of memory for the hash tables, generally exhibits poor locality and cache behavior, and requires added software complexity to manage the hash tables. Individual entries in the hash tables tend to be larger because they need to replicate part of the virtual address. In the case of PowerPC processors (available from International Business Machines Corporation, Armonk, N.Y.), the type of hash table defined by the architecture requires an additional entity to provide segment information (i.e., a segment table or a segment lookaside buffer), which is expensive and timing sensitive, and thus not suitable for most embedded processor implementations.
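By way of illustration only, a generic hashed page table may be modeled as follows. The sketch does not reproduce the hash table format defined by the PowerPC architecture. The table dimensions, the modulo hash and the names make_table, group_of, insert and lookup are assumptions introduced for this example. Each entry stores the virtual page number as a tag, which shows why individual entries must replicate part of the virtual address: distinct pages can hash to the same group and must be told apart.

```python
PAGE_SHIFT = 12       # assumed 4 KiB pages
NUM_GROUPS = 8        # assumed number of hash groups
SLOTS_PER_GROUP = 4   # assumed entries per group
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def make_table():
    return [[] for _ in range(NUM_GROUPS)]


def group_of(vpn):
    """Toy hash function: low-order bits of the virtual page number."""
    return vpn % NUM_GROUPS


def insert(table, vpn, pfn):
    group = table[group_of(vpn)]
    # Drop any stale entry for the same page before adding the new one.
    group[:] = [(tag, frame) for tag, frame in group if tag != vpn]
    if len(group) >= SLOTS_PER_GROUP:
        group.pop(0)                  # evict the oldest entry in a full group
    group.append((vpn, pfn))          # the tag replicates the virtual page number


def lookup(table, vaddr):
    vpn = vaddr >> PAGE_SHIFT
    for tag, pfn in table[group_of(vpn)]:
        if tag == vpn:                # the tag comparison resolves collisions
            return (pfn << PAGE_SHIFT) | (vaddr & PAGE_MASK)
    return None                       # no matching entry in the group


table = make_table()
insert(table, 0x10, 0x80)
insert(table, 0x18, 0x90)             # hashes to the same group as page 0x10
```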
Yet another conventional approach to hardware loading of virtual address translations into TLBs utilizes virtual linear page tables. While more suitable than the previous two approaches (i.e., tree structured page tables and hashed page tables) for embedded processors, virtual linear page tables occupy part of the available virtual address space. This can be a serious limitation on 32-bit implementations, for example. While the locality performance is good, this approach tends to fall into pathological scenarios when manipulating large virtual address spaces with random access patterns.
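By way of illustration only, the address arithmetic of a virtual linear page table may be sketched as follows. The 32-bit address width, the 4 KiB page size, the 4-byte page table entry size and the base address VLPT_BASE are assumptions chosen for this example, and the names pte_vaddr and vlpt_span_bytes are hypothetical. The first function computes the virtual address at which the page table entry for a given virtual address resides, and the second computes how much virtual address space the table reserves under these assumptions.

```python
ADDRESS_BITS = 32        # assumed 32-bit virtual addresses
PAGE_SHIFT = 12          # assumed 4 KiB pages
PTE_SIZE = 4             # assumed bytes per page table entry
VLPT_BASE = 0xFFC00000   # assumed virtual base address of the linear table


def pte_vaddr(vaddr):
    """Virtual address of the page table entry that maps `vaddr`."""
    vpn = vaddr >> PAGE_SHIFT
    return VLPT_BASE + vpn * PTE_SIZE


def vlpt_span_bytes():
    """Virtual address space reserved by the table: one entry per virtual page."""
    return (1 << (ADDRESS_BITS - PAGE_SHIFT)) * PTE_SIZE


entry_address = pte_vaddr(0x00401000)  # entry for virtual page 0x401
reserved = vlpt_span_bytes()
```

Because the entry is itself located by a virtual address, fetching it may require a further translation, which in this model would have to be resolved separately.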
Still another conventional approach to hardware loading of virtual address translations into TLBs utilizes page table pointer caches. This approach is exemplified in Michael Wu and Willy Zwaenepoel, “Improving TLB Miss Handling with Page Table Pointer Caches”, Dec. 12, 1997. In accordance with this approach, pointers to page tables are cached in a separate array in the MMU.
Yet still another conventional approach to hardware loading of virtual address translations into TLBs utilizes TLBs with both page table pointers and page table entries. This approach is exemplified by U.S. Pat. No. 5,426,750, issued Jun. 20, 1995 to Robert Becker et al., and entitled “TRANSLATION LOOKASIDE BUFFER APPARATUS AND METHOD WITH INPUT/OUTPUT ENTRIES, PAGE TABLE ENTRIES AND PAGE TABLE POINTERS”. In accordance with this approach, pointers to page tables are cached in the TLB, but page table pointers are indexed by their physical or real address, as part of a normal top down page-table walk.
Therefore, a need exists for an enhanced mechanism for loading TLB entries into a TLB in hardware via indirect TLB entries.