This invention relates to operating systems and to virtual memory management techniques within operating systems.
Many modern computer systems run multiple concurrent tasks or processes, each with its own address space. It would be expensive to dedicate a full complement of memory to each task, especially since many tasks use only a small part of their address spaces. Rather, virtual memory is used to give each process the appearance of a full address space. This allows a program to run on what appears to be a large, contiguous, physical-memory address space, dedicated entirely to the program. In reality, however, the available physical memory in a virtual memory system is shared between multiple programs or processes. The memory that appears to be large and contiguous is actually smaller and fragmented between multiple programs. Each program accesses memory through virtual addresses, which are translated by special hardware or software to physical addresses.
Rather than attempting to maintain a mapping for each possible virtual address, virtual memory systems divide virtual and physical memory into blocks. In many systems, these blocks are fixed in size and referred to as pages. The addresses within an individual page all have identical upper-most bits. Thus, a memory address is the concatenation of a page number, corresponding to the upper bits of the address, and a page offset, corresponding to the lower bits of the address.
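By way of illustration only, the division of an address into a page number and a page offset can be sketched as follows. The 4 KiB page size (a 12-bit offset) and the names PAGE_SHIFT, split_address, and join_address are assumptions made for this example and are not taken from any particular system.

```python
# Assumed for illustration: 4 KiB pages, so the low 12 bits are the page offset.
PAGE_SHIFT = 12
PAGE_SIZE = 1 << PAGE_SHIFT
OFFSET_MASK = PAGE_SIZE - 1


def split_address(addr):
    """Return (page_number, page_offset) for an address."""
    return addr >> PAGE_SHIFT, addr & OFFSET_MASK


def join_address(page_number, page_offset):
    """Concatenate a page number and a page offset into a full address."""
    return (page_number << PAGE_SHIFT) | page_offset


page_number, page_offset = split_address(0x12345678)
print(hex(page_number), hex(page_offset))
```

Because translation replaces only the page number, the page offset passes through unchanged from the virtual address to the physical address.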
A data structure is typically maintained in physical memory to translate from virtual page numbers to physical page frames. This data structure usually takes the form of a page table. A page table is indexed by virtual page number, and it generally has a number of entries equal to the number of pages in the virtual address space.
Virtual-to-physical address translation can consume significant overhead, since every data access requires first accessing the page table to obtain a physical address and then accessing the data itself. To reduce address translation time, computers use a specialized hardware cache dedicated to translations. The cache is referred to as a translation lookaside buffer (TLB). A TLB typically has a small, fixed number of entries. It can be direct-mapped, set associative, or fully associative.
FIG. 1 shows a prior art example of a virtual memory system using a TLB and a page table. A process generates a virtual address 12 comprising a virtual page number and a page offset. The page number portion of the virtual address is used to index a TLB 14. Assuming that the TLB contains an entry corresponding to the virtual page number (a situation referred to as a TLB "hit"), the TLB produces a physical page number. The page offset portion of virtual address 12 is concatenated with the physical page number from the TLB, resulting in a full physical address for accessing physical memory 16. If the correct entry is not present in TLB 14 (a situation referred to as a TLB "miss"), an initial reference is made to page table 18 to update TLB 14.
A TLB miss thus initiates its own memory reference that could in many cases be the source of another TLB miss, creating the potential for an endless loop. To prevent this, the page table is typically stored in an "unmapped" portion of physical memory that is addressed directly by its physical addresses rather than by virtual addresses.
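The translation path of FIG. 1 can be sketched in simplified form as follows. This is a minimal model under stated assumptions rather than a description of any actual hardware: the class name Fig1Translator, the 4 KiB page size, the four-entry TLB, and the oldest-entry replacement policy are all hypothetical choices made for the example.

```python
PAGE_SHIFT = 12  # assumed: 4 KiB pages
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


class Fig1Translator:
    """Toy model of a TLB backed by a flat page table."""

    def __init__(self, page_table, tlb_capacity=4):
        # page_table[vpn] holds a physical page number, or None if unmapped.
        self.page_table = page_table
        self.tlb = {}  # virtual page number -> physical page number
        self.tlb_capacity = tlb_capacity
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & OFFSET_MASK
        pfn = self.tlb.get(vpn)
        if pfn is None:
            # TLB miss: consult the page table and update the TLB.
            self.misses += 1
            pfn = self.page_table[vpn]
            if pfn is None:
                raise LookupError("no mapping for virtual page %d" % vpn)
            if len(self.tlb) >= self.tlb_capacity:
                # Evict the oldest entry (dicts preserve insertion order).
                self.tlb.pop(next(iter(self.tlb)))
            self.tlb[vpn] = pfn
        else:
            self.hits += 1
        # Concatenate the physical page number with the unchanged offset.
        return (pfn << PAGE_SHIFT) | offset


translator = Fig1Translator(page_table=[None, 7, 3])
first = translator.translate(0x1004)   # miss, then page table reference
second = translator.translate(0x1008)  # hit on the same page
```

In this model, the second access to the same page is satisfied by the TLB without a further reference to the page table.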
One disadvantage of this approach is that the page tables can consume large amounts of physical memory. Typically, each program requires its own page table, and each table has entries for every possible virtual page number. The page tables cannot typically be paged to secondary storage.
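As a rough illustration of this memory cost, the following sketch computes the size of a flat page table having one entry per virtual page. The 32-bit address space, 4 KiB page size, and 4-byte entry size are assumed values chosen for the example; they are not taken from the systems described above.

```python
def page_table_bytes(address_bits, page_shift, entry_bytes):
    """Size in bytes of a flat page table with one entry per virtual page."""
    num_pages = 1 << (address_bits - page_shift)
    return num_pages * entry_bytes


# Assumed example values: 32-bit virtual addresses, 4 KiB pages, 4-byte entries.
per_process = page_table_bytes(32, 12, 4)
print(per_process // (1024 * 1024), "MiB per process")
```

Under these assumptions, each process would require 4 MiB of unmapped physical memory for its page table, regardless of how little of its address space it actually uses.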
The requirement of locating the page tables in unmapped memory is a further disadvantage in virtual memory systems such as those described above. It is typically quite difficult or at least very inefficient to deal with dynamically allocated data structures in unmapped memory.
FIG. 2 shows an alternative prior art architecture that is described in Rashid, Machine-Independent Virtual Memory Management for Paged Uniprocessor and Multiprocessor Architectures, Proceedings of the Second International Conference on Architectural Support for Programming Languages and Operating Systems, Oct. 5-8, 1987, at 31. This architecture, identified in the article as the Mach architecture, has a microprocessor with a hardware TLB 20 similar or identical to the one described above. A virtual page address 22 is formed by the combination of a virtual page number and a page offset. The virtual page number is translated by TLB 20 into a physical page number and combined with the page offset to address physical memory 24.
The Mach system does not have page tables as they are commonly known. Rather, TLB misses are resolved by a combination of a "pmap" data structure 26 and a plurality of address maps 28. All of these data structures are located in the system's physical memory. Pmap 26 is located in unmapped portions of the physical memory. Address maps 28 are located in mapped portions of physical memory, and are therefore addressed by virtual addresses.
Pmap 26 is somewhat similar to a single page table. In contrast to a typical page table, however, the pmap does not keep track of all valid virtual-to-physical address mappings. Rather, many entries in pmap 26 can be replaced or flushed to make room for newer entries. Accordingly, the pmap can be much smaller than a page table.
Because the pmap does not keep track of all valid virtual-to-physical address mappings, it may not have a specified desired virtual-to-physical translation. If it does not, reference must be made to an appropriate one of the address maps 28. Generally, each process or task has its own address map, which contains a complete description of that task's virtual address space. The Rashid article notes that an address map is a "doubly linked list of address map entries, each of which maps a contiguous range of virtual addresses onto a contiguous area of a memory object."
A system such as shown in FIG. 2 tends to use memory more efficiently than a system such as shown in FIG. 1. One reason for this is that address maps 28 can be structured much more compactly than page tables. While page tables are referenced very often, in response to every TLB miss, address maps 28 are referenced much less frequently, only in response to misses in pmap 26. Therefore, the structure of address maps 28 can sacrifice searching efficiency for compactness. Another advantage of the Mach system is that address maps 28 can be located in mapped memory, which is easier to manage and allocate than unmapped memory. In contrast, conventional page tables must usually be located in unmapped memory.
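The address map structure quoted above can be sketched as follows. This is an illustrative model only, not the Mach implementation: the names MapEntry and AddressMap, the exclusive end address, and the use of a string to stand for a memory object are assumptions made for the example.

```python
class MapEntry:
    """One entry: a contiguous virtual range mapped onto a memory object."""

    def __init__(self, start, end, memory_object, object_offset):
        self.start = start                  # first virtual address in range
        self.end = end                      # one past the last address (assumed)
        self.memory_object = memory_object  # stand-in for a memory object
        self.object_offset = object_offset  # offset of `start` in the object
        self.prev = None
        self.next = None


class AddressMap:
    """Doubly linked list of MapEntry nodes for one task."""

    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, entry):
        """Link an entry at the tail of the list."""
        entry.prev = self.tail
        entry.next = None
        if self.tail is None:
            self.head = entry
        else:
            self.tail.next = entry
        self.tail = entry

    def lookup(self, vaddr):
        """Linear search; returns (memory_object, offset) or None."""
        entry = self.head
        while entry is not None:
            if entry.start <= vaddr < entry.end:
                return (entry.memory_object,
                        entry.object_offset + (vaddr - entry.start))
            entry = entry.next
        return None


task_map = AddressMap()
task_map.append(MapEntry(0x1000, 0x3000, "text", 0))
task_map.append(MapEntry(0x8000, 0x9000, "heap", 0x200))
```

Two entries describe the whole of this small address space, whereas a page table would need one entry per virtual page. The cost is a linear search on each lookup.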
However, some of these advantages are negated by measures that are taken in the Mach system to avoid recursion and deadlock. The possibility of recursion is illustrated in the flow chart of FIG. 3. In response to a TLB miss, reference is made to the pmap in unmapped memory. In the case of a hit, the corresponding physical address is loaded into the TLB from pmap 26, and the miss is resolved without any further activity. In the case of a miss, however, reference is made to an appropriate address map. These steps are illustrated by blocks 30 and 32 of FIG. 3. The search through the address map is performed by kernel routines that are generally located in unmapped memory. However, the address maps are located in mapped memory and accessing them with virtual memory addresses could generate a new TLB miss. A TLB miss on an address map address is illustrated by the curved line from block 32 to block 30. There is a significant chance that the virtual-to-physical address mapping for the address map address will not be present in the pmap, and it will then be necessary to continue back to execution block 32. This circular execution might continue indefinitely.
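The circular execution between blocks 30 and 32 can be modeled in simplified form as follows. This sketch is an assumption-laden illustration rather than a description of the Mach kernel: the function resolve_miss, the exception TranslationLoop, the depth limit, and the use of dictionaries for the pmap and the address map are all hypothetical. The parameter map_vpn stands for the virtual page on which the address map itself resides.

```python
class TranslationLoop(Exception):
    """Raised when miss handling keeps missing on the address map's own page."""


def resolve_miss(vpn, pmap, address_map, map_vpn, depth=0, limit=16):
    """Resolve a TLB miss for `vpn`, returning a physical page number."""
    if vpn in pmap:
        # Block 30: hit in the pmap, so the miss is resolved immediately.
        return pmap[vpn]
    if depth >= limit:
        raise TranslationLoop("address map page is never resolved")
    # Block 32: the address map lives in mapped memory, so reading it
    # first requires a translation for the page that holds the map.
    resolve_miss(map_vpn, pmap, address_map, map_vpn, depth + 1, limit)
    pfn = address_map[vpn]
    pmap[vpn] = pfn
    return pfn


address_map = {5: 50, 9: 90}  # virtual page -> physical page (toy data)
MAP_VPN = 9                   # assumed page holding the address map itself

# With the address map's own page already present in the pmap, the miss
# on page 5 resolves after a single reference to the address map.
resolved = resolve_miss(5, {MAP_VPN: 90}, address_map, MAP_VPN)

# With an empty pmap, the handler keeps missing on the address map's page.
try:
    resolve_miss(5, {}, address_map, MAP_VPN)
    looped = False
except TranslationLoop:
    looped = True
```

The depth limit exists only so that the example terminates; in the situation described above, nothing bounds the loop.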
To avoid this situation, referred to as recursion, certain entries in pmap 26 of the Mach system are considered permanent or "wired." Specifically, any entry for a kernel address, including entries for address map addresses, is flagged. As shown in FIG. 2, pmap 26 has a column for flagging "wired" entries. Flagged entries are considered permanent and cannot be replaced. This ensures that a second TLB miss as illustrated by the curved line of FIG. 3 will be resolved without further reference to an address map. As noted in the Rashid article:
One of the more unusual characteristics of the Mach dependent/independent interface is that the pmap module need not keep track of all currently valid mappings. Virtual-to-physical mappings may be thrown away at almost any time to improve either space or speed efficiency and new mappings need not always be made immediately but can often be lazy-evaluated . . . . All of this can be accomplished because all virtual memory information can be reconstructed at fault time from Mach's machine independent data structures. The only major exceptions to the rule that pmap maintains only a cache of available mappings are the kernel mappings themselves. These must always be kept complete and accurate.
Id. at 35-36 (emphasis supplied).
The method of "wiring" pmap entries also solves a recursion problem related to the use of mutexes for synchronizing access to the address maps. A mutex is a "lock" that must be obtained by an execution thread before the thread can access an associated data structure. The mutex is requested from and granted by the operating system kernel. If the mutex is not available (for instance if it has already been granted to another thread), the execution thread is suspended until the mutex is available. Without the wired entries described above, address map mutexes of the Mach system would result in an independent source of recursive deadlock in translating virtual addresses.
One of the most significant disadvantages of "wiring" entries in pmap 26 is that it greatly complicates the structure of pmap 26. The pmap cannot be a true cache, in which any entry can be replaced at any time. It is furthermore not possible to bound the size of the pmap. As kernel mappings are added, the size of the pmap must increase. The resulting data structure complexity and the variable size of the data structure are significant disadvantages, especially when the data structure must reside in unmapped memory.
The system described below reduces the amount of memory consumed by virtual memory translation data structures. This is accomplished by using a microprocessor's hardware-implemented TLB, a software-implemented TLB in physical memory, and a plurality of address maps. The software-implemented TLB is a true cache: there are no permanent entries. Any entry in the software-implemented TLB can be flushed at any time. The address maps are linked lists, rather than page tables, and are located in mapped memory, where they consume significantly less space than would conventional page tables. An exception is that the address map used for kernel address mappings is located in unmapped memory. This prevents problems with recursive misses while eliminating the prior art requirement of permanent or "wired" entries in the software-implemented TLB. Access synchronization is accomplished with two different locking mechanisms associated with each address map: a mutex and a spin lock. Preemptible code portions only need to obtain the mutex. However, non-preemptible code portions need to acquire a spin lock before modifying an address map.
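The difficulty of bounding a table that contains permanent entries can be illustrated with the following sketch. It is a toy model and not the Mach pmap: the class name WiredPmap, the nominal capacity, and the choice to evict the oldest non-wired entry are assumptions made for the example.

```python
class WiredPmap:
    """Toy mapping table in which flagged ("wired") entries are permanent."""

    def __init__(self, capacity):
        self.capacity = capacity  # nominal size; wired entries may exceed it
        self.entries = {}         # vpn -> (pfn, wired)

    def insert(self, vpn, pfn, wired=False):
        if vpn not in self.entries and len(self.entries) >= self.capacity:
            # Only non-wired entries are candidates for replacement.
            victim = next(
                (v for v, (_, w) in self.entries.items() if not w), None
            )
            if victim is not None:
                del self.entries[victim]
            # Otherwise every entry is wired and the table has to grow.
        self.entries[vpn] = (pfn, wired)


pmap = WiredPmap(capacity=2)
for vpn in (100, 101, 102):
    pmap.insert(vpn, vpn + 1000, wired=True)  # kernel mappings
size_after_kernel = len(pmap.entries)

pmap.insert(7, 70)  # a user mapping can no longer displace anything
size_after_user = len(pmap.entries)
```

In this model, three wired kernel mappings push the table past its nominal capacity of two, and each subsequent insertion either grows the table further or competes for the few replaceable slots.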
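The behavior of a software-implemented TLB that is a true cache can be sketched as follows. This is a minimal model under stated assumptions and is not the implementation described below: the names SoftwareTLB and resolve_miss, the oldest-entry replacement policy, and the dictionary standing in for a linked-list address map are all hypothetical. The placement of the kernel address map in unmapped memory and the mutex and spin lock are not modeled.

```python
class SoftwareTLB:
    """Bounded cache of translations with no permanent entries."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}  # vpn -> pfn

    def lookup(self, vpn):
        return self.entries.get(vpn)

    def insert(self, vpn, pfn):
        if vpn not in self.entries and len(self.entries) >= self.capacity:
            # Any entry may be replaced; here the oldest one is evicted.
            self.entries.pop(next(iter(self.entries)))
        self.entries[vpn] = pfn

    def flush(self):
        """Discard every entry; nothing is wired, so this is always safe."""
        self.entries.clear()


def resolve_miss(vpn, soft_tlb, address_map):
    """Resolve a hardware TLB miss via the software TLB, then the map."""
    pfn = soft_tlb.lookup(vpn)
    if pfn is None:
        # `address_map` is a dict standing in for a linked-list address map.
        pfn = address_map[vpn]
        soft_tlb.insert(vpn, pfn)
    return pfn


soft_tlb = SoftwareTLB(capacity=2)
address_map = {1: 10, 2: 20, 3: 30}
results = [resolve_miss(v, soft_tlb, address_map) for v in (1, 2, 3)]
size_bound_held = len(soft_tlb.entries) <= soft_tlb.capacity

soft_tlb.flush()
after_flush = resolve_miss(1, soft_tlb, address_map)
```

Because every translation can be reconstructed from an address map, the cache in this model never exceeds its fixed capacity and can be emptied at any time without losing information.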