Virtual computing environments allow multiple virtual machines (VMs) to run on a single physical platform and to share physical resources. Some virtual computing environments enable configuration of VMs such that the total amount of memory designated for use by the VMs is larger than the actual amount of memory available on the host. Referred to as memory over-commitment, this feature enables a single physical platform (also referred to herein as a “host”) to support the simultaneous execution of more VMs. Some virtual computing environments permit arbitrary boundaries to be placed around computing resources, such as memory, that may span more than one host. For example, a virtual computing environment may create two VMs, each configured with 4 GB of memory, from a resource pool potentially spanning multiple hosts and having a memory limit of less than the 8 GB required (e.g., a 7 GB memory limit).
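By way of illustration only, the arithmetic of the foregoing example may be sketched as follows. The helper name overcommit_summary and the binary definition of a gigabyte are assumptions made for this sketch and are not part of any particular virtual computing environment.

```python
GB = 1024 ** 3  # bytes per gigabyte (binary convention assumed for this sketch)


def overcommit_summary(vm_memory_bytes, limit_bytes):
    """Return (configured total, shortfall, over-commitment ratio).

    Hypothetical helper: compares the memory configured for a set of VMs
    against the memory limit of the resource pool they are drawn from.
    """
    configured = sum(vm_memory_bytes)
    shortfall = max(0, configured - limit_bytes)
    ratio = configured / limit_bytes
    return configured, shortfall, ratio


# Two VMs of 4 GB each drawn from a pool limited to 7 GB, as in the text.
configured, shortfall, ratio = overcommit_summary([4 * GB, 4 * GB], 7 * GB)
print(f"configured={configured // GB} GB, "
      f"shortfall={shortfall // GB} GB, ratio={ratio:.2f}")
```

A ratio greater than 1 indicates that memory is over-committed; here the pool is 1 GB short of the memory the two VMs are configured to use.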
Consolidation of computing systems, which leads to the opportunity to over-commit computing resources, such as memory, is one of the key benefits of virtualization. To achieve over-commitment, the virtual infrastructure gives a VM less memory than what the guest operating system (OS) in the VM believes it has. This can be done by using a technique known as ballooning, which is described in U.S. Pat. No. 7,433,951, the entire contents of which are incorporated by reference herein. A balloon is a resource reservation application that runs as a guest application in the VM or as a driver in the guest OS and that requests guest physical memory from the guest OS. After the guest OS has allocated guest physical memory for use by the balloon application, the balloon application communicates information regarding the allocated guest physical memory to a hypervisor that supports the VM, which is then able to repurpose machine memory backing the guest “physical” memory allocated to the balloon application. That is, since the balloon application only reserves guest “physical” memory but does not actually use it, the hypervisor can, for example, repurpose machine memory that backs such allocated guest “physical” memory for use by another VM without fear that the balloon application would write to the guest “physical” memory (and therefore the backing machine memory).
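By way of illustration only, the ballooning interaction described above may be modeled with the following toy sketch. The classes ToyGuest and ToyHypervisor, and all of their methods, are hypothetical names introduced for this sketch; they do not reproduce the interface of any actual balloon driver or hypervisor, nor the method of the patent incorporated by reference.

```python
class ToyGuest:
    """Hypothetical guest OS tracking free guest 'physical' page numbers."""

    def __init__(self, total_pages):
        self.free_pages = set(range(total_pages))
        self.balloon_pages = set()

    def inflate_balloon(self, count):
        """Balloon asks the guest OS for pages; it reserves but never uses them."""
        count = min(count, len(self.free_pages))
        granted = {self.free_pages.pop() for _ in range(count)}
        self.balloon_pages |= granted
        return granted


class ToyHypervisor:
    """Hypothetical hypervisor mapping guest pages to machine pages."""

    def __init__(self):
        self.mappings = {}            # guest page number -> machine page number
        self.free_machine_pages = []  # machine pages available to other VMs

    def back_guest(self, guest, first_machine_page=0):
        """Back every guest page with a distinct machine page."""
        for offset, gpn in enumerate(sorted(guest.free_pages)):
            self.mappings[gpn] = first_machine_page + offset

    def reclaim(self, ballooned_pages):
        """Repurpose the machine memory behind pages held by the balloon."""
        for gpn in ballooned_pages:
            mpn = self.mappings.pop(gpn, None)
            if mpn is not None:
                self.free_machine_pages.append(mpn)
        return len(self.free_machine_pages)


guest = ToyGuest(total_pages=8)
hypervisor = ToyHypervisor()
hypervisor.back_guest(guest)

reported = guest.inflate_balloon(3)      # balloon reserves three guest pages
reclaimed = hypervisor.reclaim(reported) # hypervisor frees their backing
print(f"reclaimed {reclaimed} machine pages; "
      f"{len(hypervisor.mappings)} guest pages remain backed")
```

In this model the guest still believes it has eight pages, while only five remain backed by machine memory; the three pages held by the balloon are never written, so removing their backing is safe.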
Another technique for memory management is called hypervisor swapping. In this technique, the virtual infrastructure transparently unmaps (i.e., takes away) machine memory pages from the guest OS, swaps the content of guest “physical” memory pages to disk, and frees up machine memory for other VMs. The virtual infrastructure swaps the contents back into machine memory when the guest OS needs to access these guest “physical” memory pages. Both ballooning and hypervisor swapping may impact the performance of applications inside the guest, because there is less machine memory allocated to the guest. However, as long as the total working set of applications running in the guest is no larger than the guest's machine memory allocation, the applications may not suffer significant performance loss.
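By way of illustration only, the relationship between working set size and swapping cost may be sketched with the following toy model. The function count_swap_ins is a hypothetical name, and the least-recently-used eviction policy is an assumption made for this sketch; the text above does not specify how the virtual infrastructure selects pages to swap out.

```python
from collections import OrderedDict


def count_swap_ins(accesses, machine_pages):
    """Count pages brought back from swap for a sequence of page accesses.

    Toy model: at most `machine_pages` guest pages are resident at once,
    and the least recently used resident page is swapped out when room is
    needed (an assumed policy, chosen only for illustration).
    """
    resident = OrderedDict()  # guest page -> True, ordered by recency of use
    swapped_out = set()       # guest pages whose contents are on disk
    swap_ins = 0
    for page in accesses:
        if page in resident:
            resident.move_to_end(page)  # mark as most recently used
            continue
        if page in swapped_out:
            swap_ins += 1               # contents must be read back from disk
            swapped_out.discard(page)
        if len(resident) >= machine_pages:
            victim, _ = resident.popitem(last=False)
            swapped_out.add(victim)
        resident[page] = True
    return swap_ins


# A working set of four pages, touched repeatedly.
accesses = [0, 1, 2, 3] * 5
fits = count_swap_ins(accesses, machine_pages=4)     # working set fits
too_small = count_swap_ins(accesses, machine_pages=3)  # one page short
print(f"swap-ins when working set fits: {fits}; when it does not: {too_small}")
```

In this model, an allocation that holds the whole working set incurs no swap-ins, whereas an allocation only one page smaller causes repeated swap-ins, consistent with the observation that performance depends on the working set fitting within the machine memory allocation.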
Unfortunately, there are applications and runtimes that do not work well with memory over-commitment. The Java Virtual Machine (JVM) is one of the most widely used runtimes in this category. In “cloud” environments that provide dynamic allocations of server resources, it has become increasingly popular to offer Java services by deploying JVMs running in VMs that share the resources of a physical host. It may be common for some of these JVMs to endure periods of inactivity, though the JVMs continue to consume valuable resources, such as memory. When the virtual infrastructure is under memory pressure, the virtual infrastructure may transfer this memory pressure to the VMs using the ballooning technique described above to reclaim machine memory that VMs and JVMs executing thereon may no longer be using. However, in the case of a VM running an idle JVM, additional memory pressure on the VM is likely to cause the guest OS to instead page out guest “physical” memory pages relied upon by the idle JVM. In this case, when the idle JVM is later needed, for example, for processing server requests, a significant performance cost may be incurred by having to page in guest “physical” memory pages for the JVM from the guest's virtual disk device prior to resuming execution of the JVM. The hypervisor swapping technique described above may incur a similar performance cost.