1. Field of the Invention
This invention relates to the field of memory management in computer systems.
2. Description of the Related Art
Most modern computers include at least one form of data storage that has programmable address translation or mapping. In most computers, this storage will be provided by a relatively high-speed system memory, which is usually implemented using solid-state random-access memory (RAM) components.
Although system memory is usually fast, it does have its weaknesses. First, it is usually volatile. Second, for a given amount of data to be stored, system memory takes up more physical space within the computer, is more expensive, and requires more support in terms of cooling, component sockets, etc., than does a conventional non-volatile storage device such as a disk. Thus, whereas many gigabytes of disk storage are commonly included even in computers sold in the relatively unsophisticated consumer market, such computers seldom come with more than 128 or perhaps 256 megabytes of system RAM.
Because higher-speed access to stored data and code usually translates into faster performance, it is generally preferable to run as much of an active application from system memory as possible. Indeed, many applications that require real-time processing of complex calculations, such as voice-recognition software and interactive graphics, will not run properly at all unless a certain amount of RAM is reserved for their use while running.
High-speed system memory is a limited resource and, as with most limited resources, there is often competition for it. This has become an even greater problem in modern multi-tasked systems, in which several applications may be running, or at least resident in memory, at the same time. Even where there is enough memory in a given system for all the applications that need it, it is still often advantageous to conserve memory use: RAM costs money, and consumes both energy and physical space. More efficient management of RAM can reduce the cost, energy, or physical space required to support a given workload. Alternatively, more efficient management of RAM can allow a system to support a larger number of applications with good performance, given a fixed monetary, energy, or physical space budget.
Applications may be defined broadly as any body of code that is loaded and executes substantially as a unit. Applications include, among countless other examples, common consumer programs such as word processors, spreadsheets and games; communications software such as Internet browsers and e-mail programs; software that functions as an aid to or interface with the OS itself, such as drivers; server-oriented software and systems such as web servers, transactional databases, and scientific simulations; and even entire software implementations of whole computers, commonly known as “virtual machines” (VMs).
One technique for reducing the amount of system memory required for a given workload, and thereby for effectively “expanding” the amount of available system memory, is to implement a scheme whereby different applications share the memory space. Transparent page sharing, in the context of a multi-processor system on which virtual machines are running, is described in U.S. Pat. No. 6,075,938, Bugnion, et al., “Virtual Machine Monitors for Scalable Multiprocessors,” issued 13 Jun. 2000 (“Bugnion '938”). The basic idea of this system is to save memory by eliminating redundant copies of memory pages, such as those that contain program code or file system buffer cache data. This is especially important for reducing memory overheads associated with running multiple copies of operating systems (e.g., multiple guest operating systems running as virtual machines—see below).
There are two main components to the technique disclosed in Bugnion '938. First, candidate pages that could potentially be shared are identified. Second, the pages are actually shared, when possible, so that redundant copies can be reclaimed.
The approach in Bugnion '938 for identifying pages is to add hooks to the system to observe copies when they are created. For example, a routine within the operating system running within the virtual machine—the virtual operating system VOS—that is used to explicitly copy memory regions is modified to allow copied pages to be shared. Note that the VOS may also be considered to be a “guest” operating system, since the virtual machine, although it is configured as a complete computer system, is actually a software construct that is running on an underlying, physical “host” system.
Another example is Bugnion '938's interposition on disk accesses, which allows disk transfers from a shared non-persistent disk to be shared across multiple guests (virtual machines). In this case, Bugnion '938 tracks disk blocks that are already in main memory, so subsequent requests for the same blocks can be shared. Similarly, support for special devices is added to guests, such as a special virtual subnet that supports large network packets, allowing guests to communicate with each other while avoiding replicated data when possible.
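The block-tracking idea described above can be illustrated with a minimal sketch. It is not the implementation disclosed in Bugnion '938; the class name SharedBlockCache, the read_block callable, and the (disk_id, block_no) key are hypothetical names chosen only for illustration. The sketch assumes a shared, non-persistent (read-only) disk, so that a block already resident in memory can safely be handed to any guest that requests it.

```python
class SharedBlockCache:
    """Toy model of tracking which disk blocks are already in main memory.

    All names here are illustrative assumptions, not taken from Bugnion '938.
    """

    def __init__(self, read_block):
        # read_block is a hypothetical callable: (disk_id, block_no) -> bytes
        self._read_block = read_block
        self._resident = {}   # (disk_id, block_no) -> in-memory copy of the block
        self.disk_reads = 0   # number of transfers that actually reached the disk

    def fetch(self, disk_id, block_no):
        """Return the in-memory copy of a block, reading the disk only once."""
        key = (disk_id, block_no)
        if key not in self._resident:
            self._resident[key] = self._read_block(disk_id, block_no)
            self.disk_reads += 1
        return self._resident[key]


# Two guests request the same block of the same shared disk.
cache = SharedBlockCache(lambda disk_id, block_no: bytes([block_no]) * 4)
page_for_vm1 = cache.fetch("shared-disk", 7)
page_for_vm2 = cache.fetch("shared-disk", 7)
```

In this sketch the second request is satisfied from memory, so both guests end up referring to the same in-memory copy rather than to two replicas.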
The Bugnion '938 approach for sharing a page is to employ an existing MMU (memory management unit) hardware device to map the shared page read-only for each guest that is sharing it, and to make private copies of the page on demand if a guest attempts to write to it. This technique is known as “copy-on-write” (COW), and is well known in the literature. In the context of virtual machines, page-sharing can be made transparent to the guest (that is, virtual) operating systems, so that they are unaware of the sharing. This is done by exploiting the extra level of indirection in the virtualized memory system between the virtualized guest “physical” memory (which the VM “believes” is the actual hardware memory, but which is actually a software construct) and the actual underlying hardware “machine” memory. In short, multiple guest physical pages can be mapped copy-on-write to the same machine page.
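The copy-on-write mapping of several guest physical pages to one machine page can be modeled, purely for illustration, by the following sketch. It simulates in software what the MMU and a write-fault handler would do in a real system; the class CowMemory, its methods, and the use of reference counts are assumptions made for this example and are not taken from Bugnion '938.

```python
class CowMemory:
    """Toy model of guest physical pages mapped copy-on-write to machine pages.

    Illustrative only: a real system relies on read-only MMU mappings and a
    write-fault handler rather than on explicit method calls.
    """

    def __init__(self):
        self.machine = {}    # machine page number -> page contents (bytearray)
        self.refcount = {}   # machine page number -> number of guest mappings
        self.mapping = {}    # (guest, guest physical page) -> machine page number
        self._next_mpn = 0

    def _alloc(self, data):
        mpn = self._next_mpn
        self._next_mpn += 1
        self.machine[mpn] = bytearray(data)
        self.refcount[mpn] = 1
        return mpn

    def map_new(self, guest, ppn, data):
        """Back a guest physical page with a fresh private machine page."""
        mpn = self._alloc(data)
        self.mapping[(guest, ppn)] = mpn
        return mpn

    def share(self, guest, ppn, mpn):
        """Map a not-yet-mapped guest physical page onto an existing machine page."""
        self.mapping[(guest, ppn)] = mpn
        self.refcount[mpn] += 1

    def read(self, guest, ppn):
        return bytes(self.machine[self.mapping[(guest, ppn)]])

    def write(self, guest, ppn, offset, value):
        mpn = self.mapping[(guest, ppn)]
        if self.refcount[mpn] > 1:
            # Models the write fault on a shared page: give the writer a
            # private copy and leave the shared page untouched for the others.
            self.refcount[mpn] -= 1
            mpn = self._alloc(self.machine[mpn])
            self.mapping[(guest, ppn)] = mpn
        self.machine[mpn][offset] = value


mem = CowMemory()
shared = mem.map_new("vm1", 0, b"abcd")
mem.share("vm2", 0, shared)            # one machine page now backs both guests
mem.write("vm2", 0, 0, ord("X"))       # vm2 receives a private copy on write
```

After the write, the first guest still sees the original contents, while the second guest sees its modified private copy; until the write occurred, only one machine page was consumed for both guests.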
One disadvantage of the page-sharing approach described in Bugnion '938 is that the guest OS must be modified to include the necessary hooks. This limits the use of the Bugnion '938 solution not only to systems where such modifications are possible but also to those users who are willing and knowledgeable enough to perform or at least accept the modifications. Note that such attempted modifications to commodity operating systems may not be possible for those other than the manufacturer of the operating system itself, and then not without greatly increasing the probability that the modifications will lead to “bugs” or instability elsewhere.
Another disadvantage of the Bugnion '938 system is that it will often fail to identify pages that can be shared by different VMs. For example, assume that each of two VMs is using its own persistent virtual disk, that each VM is running a different operating system as the guest OS, for example Windows NT4 and Windows 2000, respectively, and that each is running its own, completely separate installation of the software package Microsoft Office 2000. The executable code (for Office 2000) will then be identical for the two VMs, yet the Bugnion '938 system will not identify this. Two complete copies of the same program may then be resident in the system memory at the same time, needlessly taking up many megabytes of memory in order to store the redundant second copy of the program code.
What is needed is a memory management system (and corresponding method of operation) that can be implemented without having to add hooks to the existing guest operating system, and that is able to identify opportunities for page sharing that are not found and exploited by existing memory management techniques. The memory management system should, however, remain transparent to the applications that are using it. This invention provides such a memory management system and related method of operation.