With virtual machine technology, a user can create and run multiple operating environments on a server at the same time. Each operating environment, or virtual machine (VM), requires its own “guest” operating system (OS) and can run software applications independently from the other virtual machines. Virtual machine technology can lower information technology (IT) costs through increased efficiency, flexibility and responsiveness. Each virtual machine acts as a separate environment, which reduces risk and allows developers to quickly recreate different OS configurations or compare versions of applications designed for different OSs. Additional customer uses for VMs include cloud services, targeted production server consolidation, hosting of legacy applications (older versions), and computer or server backup.
Physical server systems, or hosts, that run multiple virtual machines (VMs) can run short of physical memory (e.g., RAM) because of the common practice known as memory over-commitment, in which the VMs running on a host are configured with a total amount of memory that exceeds the host's actual memory size. When there is little to no free memory left in the host system, the virtualization layer that controls execution of the virtual machines (e.g., hypervisor, virtual machine monitor) needs to free some memory, typically memory allocated to one or more of the running VMs. Otherwise, host performance may degrade severely, in the form of stuttering, thrashing, complete lock-up, and so forth.
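By way of illustration only, the over-commitment condition described above reduces to simple arithmetic. The following minimal sketch assumes memory sizes expressed in MiB; the function names and the example sizes are hypothetical and are not taken from any particular virtualization product.

```python
def overcommit_ratio(host_memory_mib, vm_configured_mib):
    """Return total configured guest memory divided by host physical memory."""
    if host_memory_mib <= 0:
        raise ValueError("host memory must be positive")
    return sum(vm_configured_mib) / host_memory_mib


def is_overcommitted(host_memory_mib, vm_configured_mib):
    """True when the VMs are configured with more memory than the host has."""
    return sum(vm_configured_mib) > host_memory_mib


# Hypothetical host with 16 GiB of RAM running three VMs (8 GiB, 8 GiB, 4 GiB).
host_mib = 16384
vm_sizes_mib = [8192, 8192, 4096]

ratio = overcommit_ratio(host_mib, vm_sizes_mib)
overcommitted = is_overcommitted(host_mib, vm_sizes_mib)
```

A ratio above 1.0 indicates that the host cannot back every VM's configured memory at once, so the virtualization layer may have to free memory if the VMs collectively use more than the host provides.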
In some existing approaches, the virtualization layer can attempt to selectively free guest memory of one or more VMs, such as the unused guest memory from an idle VM. However, the virtualization layer has to decide carefully which parts of guest memory to free because guest system performance can be sensitive to such decisions. In one approach, a virtualization layer may try to free guest memory on a least recently used (LRU) basis, determining recency of use by continuously collecting memory access statistics. However, collecting statistics adds performance overhead, and the statistics themselves require additional memory to store and process. Furthermore, such an approach is generally probabilistic and cannot definitively determine that the freed guest memory is not being used by the virtual machine.
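The LRU-based approach described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the mechanism of any actual virtualization layer: the class name `LruPageTracker`, its methods, and the use of integer page numbers are all hypothetical, and a real implementation would typically sample hardware access bits rather than observe every access.

```python
from collections import OrderedDict


class LruPageTracker:
    """Tracks guest page accesses and picks least recently used pages to free."""

    def __init__(self):
        # Keys are page numbers, ordered from least to most recently accessed.
        # This bookkeeping is the extra memory cost noted in the text.
        self._pages = OrderedDict()

    def record_access(self, page):
        """Mark a page as the most recently used (the statistics-collection step)."""
        self._pages.pop(page, None)
        self._pages[page] = None

    def pick_victims(self, count):
        """Remove and return up to `count` least recently used pages."""
        victims = list(self._pages)[:count]
        for page in victims:
            del self._pages[page]
        return victims


tracker = LruPageTracker()
for page in (1, 2, 3, 1):  # page 1 is touched again, so it becomes the newest
    tracker.record_access(page)

victims = tracker.pick_victims(2)
```

In this sketch, pages 2 and 3 are selected because they were accessed least recently. Nothing prevents the guest from touching page 2 immediately afterward, which illustrates why such a selection is a heuristic rather than a guarantee that the freed memory is unused.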
Another approach to memory management is memory ballooning. A special balloon driver installed in the guest operating system “inflates” its memory use by allocating pages of guest physical memory, and signals to the virtualization layer which particular regions of memory were given to the driver by the guest operating system. The virtualization layer then frees those regions of memory, which the system now knows are not used by any guest except the balloon driver. However, the ballooning technique requires some time to take action, often reacting too slowly to the system's state. Further, the balloon driver is unable to identify situations in which it would be better to stop taking memory from a guest system.
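The ballooning sequence described above (the driver allocates guest pages, reports them, and the virtualization layer frees the backing memory) can be modeled in a few lines. The classes `Guest`, `Hypervisor`, and `BalloonDriver` below are hypothetical stand-ins written for illustration only; they do not correspond to any real driver or hypervisor interface, and pages are represented simply as integers.

```python
class Guest:
    """Toy guest OS allocator holding a pool of free guest physical pages."""

    def __init__(self, total_pages):
        self.free_pages = set(range(total_pages))

    def allocate(self, count):
        """Hand out up to `count` free pages, as a guest OS allocator would."""
        granted = set()
        while self.free_pages and len(granted) < count:
            granted.add(self.free_pages.pop())
        return granted


class Hypervisor:
    """Toy virtualization layer that records which guest pages it has freed."""

    def __init__(self):
        self.reclaimed = set()

    def release_backing(self, pages):
        # Safe to free: the guest gave these pages to the balloon driver.
        self.reclaimed |= pages


class BalloonDriver:
    """Toy balloon driver running inside the guest."""

    def __init__(self, guest, hypervisor):
        self.guest = guest
        self.hypervisor = hypervisor
        self.held = set()

    def inflate(self, count):
        """Allocate guest pages, report them, and return how many were obtained."""
        pages = self.guest.allocate(count)
        self.held |= pages
        self.hypervisor.release_backing(pages)
        return len(pages)


guest = Guest(total_pages=8)
hypervisor = Hypervisor()
balloon = BalloonDriver(guest, hypervisor)

first = balloon.inflate(3)    # guest has plenty of free pages
second = balloon.inflate(10)  # request exceeds what the guest has left
```

Note that the sketch inflates until the guest has no free pages left, with no check on whether the guest still needs that memory, which mirrors the limitation noted above that the driver cannot tell when it should stop.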