Computer virtualization is a technique for creating a virtual machine that behaves like a physical computing machine with an operating system. A computer virtualization architecture is generally defined by its ability to concurrently support multiple operating systems on a single physical computer platform. For example, a computer running Microsoft Windows may host a virtual machine with a Linux operating system. A host machine is the actual physical machine on which the virtualization takes place, while a virtual machine is considered a guest machine. A hypervisor, also referred to as a virtual machine monitor (VMM), is a software layer that virtualizes hardware resources and presents a virtual hardware interface to at least one virtual machine. The hypervisor manages the hardware resources in a manner similar to a traditional operating system and performs certain management functions with respect to an executing virtual machine. The virtual machine may be referred to as a “guest” and the operating system running inside the virtual machine may be referred to as a “guest OS”.
The virtualized environment is currently memory-bound, which means that the physical memory of the host machine is the bottleneck of resource utilization in a data center. Memory virtualization decouples the physical memory resources from the data center and aggregates them into a virtualized memory pool that is accessible to the guest OS or to applications running on top of the guest OS. In terms of memory virtualization, memory compression is one of the crucial topics in memory resource management and utilization.
Similar to a traditional operating system, the hypervisor's last resort for increasing memory utilization is to reclaim memory from the virtual machine by host swapping. In host swapping, the hypervisor shifts memory pages of the virtual machine to a physical swap disk (referred to as swap-out), marks the corresponding page table entry (PTE) in the virtual machine's physical address to machine address (P2M) table as not-present, and then frees the corresponding page to the free memory pool of the hypervisor. Here, the page table is a data structure used by the virtual machines to store the mapping between virtual addresses and physical addresses. Later, if the virtual machine accesses the page again, a page fault is triggered and the copy-on-access (COA) mechanism brings the page content from the swap disk into a newly allocated memory page (referred to as swap-in). However, the overhead is highly unsatisfactory due to the long latency of disk input/output (I/O).
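The swap-out and swap-in sequence described above may be illustrated by the following minimal sketch. It is a toy model only: the class `HostSwap`, its fields, and its method names are hypothetical and are not taken from any actual hypervisor, the swap disk is stood in for by a dictionary, and the free memory pool is reduced to a counter.

```python
class HostSwap:
    """Toy model of host swapping; every name here is an illustrative assumption."""

    def __init__(self):
        # P2M table: guest physical page number -> entry with a present flag.
        self.p2m = {}
        # Stand-in for the physical swap disk (a real disk would add I/O latency).
        self.swap_disk = {}
        # Number of pages currently returned to the hypervisor's free pool.
        self.free_pages = 0

    def map_page(self, gpn, content):
        """Back a guest page with machine memory holding `content`."""
        self.p2m[gpn] = {"present": True, "page": content}

    def swap_out(self, gpn):
        """Write the page to the swap disk, mark its PTE not-present, free it."""
        entry = self.p2m[gpn]
        self.swap_disk[gpn] = entry["page"]
        entry["present"] = False
        entry["page"] = None
        self.free_pages += 1

    def swap_in(self, gpn):
        """Copy-on-access: load the content into a newly allocated page."""
        self.free_pages -= 1  # take one page back from the free pool
        entry = self.p2m[gpn]
        entry["page"] = self.swap_disk.pop(gpn)
        entry["present"] = True

    def access(self, gpn):
        """Guest access; a not-present PTE models the page fault."""
        if not self.p2m[gpn]["present"]:
            self.swap_in(gpn)
        return self.p2m[gpn]["page"]
```

In this sketch, calling `access` on a swapped-out page takes the fault path and restores the content, which corresponds to the point at which an actual system would incur the disk I/O latency.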
As another way to increase memory utilization, memory compression compresses swapped-out pages of the virtual machines into smaller data and stores them together in memory, saving the physical memory that would otherwise hold the original content. The idea is that swap-in from compressed memory is faster than swap-in from the disk because memory access is faster than disk access.
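The compressed-memory alternative may be sketched as follows. This is an illustration under stated assumptions rather than an actual implementation: the class `CompressedStore`, the 4096-byte page size, and the use of the zlib algorithm are all assumptions chosen for the example, since the text above does not specify a compression scheme.

```python
import zlib

PAGE_SIZE = 4096  # assumed page size in bytes


class CompressedStore:
    """Toy in-memory store for compressed swapped-out pages (hypothetical)."""

    def __init__(self):
        # guest physical page number -> compressed page bytes, kept in memory
        self.store = {}

    def swap_out(self, gpn, page):
        """Compress a page and keep it in memory; return the compressed size."""
        if len(page) != PAGE_SIZE:
            raise ValueError("page must be exactly PAGE_SIZE bytes")
        self.store[gpn] = zlib.compress(page)
        return len(self.store[gpn])

    def swap_in(self, gpn):
        """Decompress the stored page and remove it from the store."""
        return zlib.decompress(self.store.pop(gpn))

    def bytes_used(self):
        """Total memory consumed by the compressed copies."""
        return sum(len(blob) for blob in self.store.values())
```

The memory saved for a given page is the difference between `PAGE_SIZE` and the size returned by `swap_out`, and it depends on how compressible the page content is. Both directions consume processor cycles, which the sketch does not measure.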
Nonetheless, memory compression is mostly considered a secondary choice because it not only causes COA, which triggers a hardware trap and stops execution of the current application, but also consumes processor cycles of the host machine to compress and decompress the page content, incurring additional overhead. Hence, the ideal situation is to avoid compressing memory pages that are frequently accessed by the guest OS (i.e., the working set) and instead to identify idle memory pages (i.e., guest memory pages outside of the working set) for memory compression.
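The selection policy stated above may be expressed as a simple set difference. The following sketch assumes that the working set is already known; how the working set is determined is not addressed here, and the function name `pages_to_compress` is hypothetical.

```python
def pages_to_compress(guest_pages, working_set):
    """Return the idle pages: guest pages that are outside the working set.

    Both arguments are iterables of guest page numbers. The working set is
    taken as given; estimating it is outside the scope of this sketch.
    """
    return sorted(set(guest_pages) - set(working_set))
```

For example, for guest pages 0 through 5 with pages 1 and 3 in the working set, the candidates for compression are pages 0, 2, 4, and 5.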