Virtual machine high-availability (“VM-HA”) is a category of virtualization technologies and products that provide for the replication, or “mirroring,” of virtual machines (“VMs”), usually across different physical servers or hosts. For example, a primary virtual machine running on one host may be replicated on a secondary host. If the primary virtual machine goes down, because of a hardware failure, an operating system failure, or some other failure, the replicated virtual machine on the secondary host may assume the role of the failed virtual machine. In other examples, the virtual machines may be running on the same physical server and/or each virtual machine may operate as both primary and secondary (replicated) virtual machines. VM-HA provides high availability of the services provided by the virtual machines and reduces downtime for consumers of the services in the event of a failure. In addition, because the replication is performed at the virtual machine level, VM-HA may provide this high availability independent of and transparent to the operating systems and applications executing in the virtual machines, and does not require application-specific configuration or synchronization methods.
A virtual machine monitor (“VMM”) executing on the hosts in the VM-HA environment may monitor the activity of the primary virtual machine and may replicate the state of the virtual machine between a primary computer system and a secondary computer system, while maintaining state consistency. If a failure of the primary virtual machine is detected, the replicated virtual machine on the secondary computer system can take over operations, thus providing the services of the failed primary virtual machine with minimal downtime. Various techniques exist for the VMM to replicate state across virtual machines. For example, the VMM may transfer the CPU and virtual device inputs from the primary virtual machine to the replicated virtual machine to be replayed, ensuring that the replicated virtual machine is continuously synchronized with the primary virtual machine. However, because all non-deterministic events of the primary virtual machine must also be replicated to the secondary virtual machine, such replication techniques may affect the performance of the primary virtual machine.
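By way of illustration only, the input-replay technique described above can be sketched as a toy simulation. The sketch assumes a guest whose state depends solely on the inputs applied to it; the names `Guest`, `run_primary`, and `replay`, and the event kinds "timer" and "packet", are hypothetical and do not correspond to any particular VMM.

```python
import random


class Guest:
    """Toy deterministic guest: its state depends only on the inputs applied."""

    def __init__(self):
        self.state = 0

    def apply(self, event):
        kind, value = event
        if kind == "timer":
            self.state += value
        elif kind == "packet":
            self.state = (self.state * 31 + value) % 1_000_003
        else:
            raise ValueError(f"unknown event kind: {kind}")


def run_primary(guest, num_events, seed):
    """Run the primary guest and log every non-deterministic input it receives."""
    rng = random.Random(seed)  # stands in for external, non-deterministic sources
    log = []
    for _ in range(num_events):
        event = (rng.choice(["timer", "packet"]), rng.randrange(1000))
        guest.apply(event)
        log.append(event)  # a real system would ship this entry to the secondary host
    return log


def replay(guest, log):
    """Replay the logged inputs, in order, on the replicated guest."""
    for event in log:
        guest.apply(event)


primary = Guest()
replica = Guest()
event_log = run_primary(primary, num_events=100, seed=7)
replay(replica, event_log)
```

After the replay, `replica.state` equals `primary.state`. The sketch also suggests where the cost arises: every input must be captured and transferred, so the length of the log grows with the activity of the primary.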
In another example, the VMM may utilize a technique analogous to “live migration,” in which the state as embodied in the memory of the primary virtual machine is replicated to the replicated virtual machine on the secondary host in real time, while the primary virtual machine continues to run on the primary host. During replication, writes to the memory of the primary virtual machine by the guest operating system (“OS”) and associated applications are trapped by the VMM, and “dirtied” pages of guest memory are copied to the replicated virtual machine on the secondary host. At certain intervals, or “checkpoints,” the execution of the guest OS and applications on the primary virtual machine is suspended, and the remaining dirty pages of guest memory are copied to the replicated virtual machine along with the current CPU state. While the live migration technique has minimal impact on the performance of the primary virtual machine, copying the pages of guest memory from the primary virtual machine to the replicated virtual machine on the secondary host may generate excessive network traffic and may be inefficient.
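By way of illustration only, the dirty-page tracking and checkpointing described above can be sketched as a toy simulation. The 4096-byte page size and the names `PrimaryVM`, `ReplicaVM`, `copy_dirty_pages`, and `checkpoint` are assumptions made for the sketch; an explicit `write` call stands in for the VMM trapping a guest write, and suspension of the guest is implied rather than modeled.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class PrimaryVM:
    def __init__(self, num_pages):
        self.pages = [bytes(PAGE_SIZE) for _ in range(num_pages)]
        self.dirty = set()            # page numbers written since the last copy
        self.cpu_state = {"pc": 0}    # minimal stand-in for the CPU state

    def write(self, page_no, data):
        """Stand-in for a trapped guest write: store the data and mark the page dirty."""
        self.pages[page_no] = data[:PAGE_SIZE].ljust(PAGE_SIZE, b"\0")
        self.dirty.add(page_no)


class ReplicaVM:
    def __init__(self, num_pages):
        self.pages = [bytes(PAGE_SIZE) for _ in range(num_pages)]
        self.cpu_state = {}


def copy_dirty_pages(primary, replica):
    """Copy every dirty page to the replica and return how many pages were sent."""
    to_send = sorted(primary.dirty)
    for page_no in to_send:
        replica.pages[page_no] = primary.pages[page_no]
    primary.dirty.clear()
    return len(to_send)


def checkpoint(primary, replica):
    """With the guest assumed suspended, copy remaining dirty pages and CPU state."""
    sent = copy_dirty_pages(primary, replica)
    replica.cpu_state = dict(primary.cpu_state)
    return sent


primary = PrimaryVM(num_pages=8)
replica = ReplicaVM(num_pages=8)

primary.write(2, b"hello")
primary.write(5, b"world")
sent_while_running = copy_dirty_pages(primary, replica)  # copied while the guest runs

primary.write(2, b"hello again")  # page 2 is dirtied a second time
primary.cpu_state["pc"] = 1234
sent_at_checkpoint = checkpoint(primary, replica)
```

In this run, page 2 is transferred twice in full even though only a few of its bytes changed, which hints at how copying whole pages may generate the network traffic noted above.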
It is with respect to these and other considerations that the disclosure made herein is presented.