Virtualization allows multiple virtual machines (guest operating systems and the applications supported by each guest OS) to be supported on a single physical hardware platform (“hardware platform”, hereafter). To facilitate virtualization, a new layer referred to as a virtual machine monitor (VMM) or hypervisor is provided between the guest OS and the hardware platform. Data centers may include multiple similar computing platforms (such as computers with the same hardware configuration or computers with the same components, such as PCIe devices). Virtual machine(s) may be moved or migrated from a source to a target within the same platform or from one computing platform (source platform) to another computing platform (target platform). During migration, the state of the VM (that is, the CPU state, the memory state, and the I/O state) is migrated from the source platform to the target platform. The I/O state migration may be achieved by the VMM if the virtual I/O is emulated by VMM software. Typically, virtual machine migration may be performed, for example, among multiple computing platforms of a data center, to improve sharing and utilization, balance loads, handle hardware failover, save energy, and move from one geography to another. Migration may be performed offline or on-line (live). Offline migration refers to suspending a virtual machine on the source platform (hence the service is shut down), saving its state, and resuming the VM with the saved state some time later on the target platform. Live migration refers to migration of a virtual machine from the source to the target without significant service downtime. Offline and live migration of virtual machines are important benefits of virtualization, especially in cloud computing environments and high availability usage models.
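By way of a non-limiting illustration, the offline migration flow described above (suspend on the source, save the CPU, memory, and I/O state, resume on the target) may be sketched as follows. The names `VmState`, `Platform`, and `offline_migrate` are hypothetical and are introduced only for this sketch; they do not correspond to any particular hypervisor interface.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class VmState:
    """The three parts of VM state named in the text (hypothetical layout)."""
    cpu: Dict[str, int]
    memory: bytes
    io: Dict[str, int]


@dataclass
class Platform:
    """A source or target platform holding the VMs that currently run on it."""
    name: str
    running: Dict[str, VmState] = field(default_factory=dict)

    def suspend_and_save(self, vm: str) -> VmState:
        # Removing the VM from `running` models the service being shut down.
        return self.running.pop(vm)

    def resume(self, vm: str, saved: VmState) -> None:
        # The VM continues from exactly the saved state.
        self.running[vm] = saved


def offline_migrate(vm: str, source: Platform, target: Platform) -> None:
    """Suspend on the source, carry the saved state over, resume on the target."""
    saved = source.suspend_and_save(vm)
    target.resume(vm, saved)


src = Platform("source", {"vm1": VmState({"pc": 16}, b"\x00\x01", {"nic_tx": 3})})
dst = Platform("target")
offline_migrate("vm1", src, dst)
```

In this sketch the VM is unavailable between `suspend_and_save` and `resume`, which is the service interruption that live migration seeks to keep small.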
Passthrough (or direct I/O) devices generally refer to a set of devices that are coupled to virtualized guests and that allow the virtualized guests to have exclusive access to the coupled passthrough devices. The passthrough devices appear and behave as if they were physically coupled to the guest operating system, although they are not. Passthrough devices may include devices that support the single root I/O virtualization (SR-IOV) specification and the multiple root I/O virtualization (MR-IOV) specification. An SR-IOV capable device is a Peripheral Component Interconnect Express (PCIe) device that may be managed to create multiple virtual functions (VFs). An SR-IOV capable device may include a single or multiple physical functions (PFs); each PF is a standard PCIe function and is associated with multiple VFs. The VFs may have the ability to move data in and out and may be configured and managed by the associated PF. MR-IOV, on the other hand, may allow multiple servers to share interconnect devices such as a host bus adapter (HBA), an Ethernet based network interface card (NIC), or a video capture card. MR-IOV is a multi-server extension to SR-IOV. However, live migration imposes a challenge for passthrough devices, especially while migrating virtual functions (VFs) supporting SR-IOV and MR-IOV. In such passthrough devices, virtualization performance is achieved by assigning dedicated virtual functions (VFs) to dedicated VMs. In such scenarios, some tasks such as device I/O access and direct memory access (DMA) may be supported by providing a direct path between the VMs and the hardware platform without VMM intervention. Also, from a hardware point of view, it may be feasible to migrate a virtual function from one hardware platform to another hardware platform if the target platform supports VFs similar to those of the source platform.
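As a non-limiting illustration, the relationship described above (a PF that creates and manages multiple VFs, with each VF dedicated to a single VM) may be modeled as follows. The class and method names are hypothetical, chosen only for this sketch, and are not part of the SR-IOV specification or of any driver interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VirtualFunction:
    index: int
    assigned_vm: Optional[str] = None  # a VF is dedicated to at most one VM


@dataclass
class PhysicalFunction:
    name: str
    vfs: List[VirtualFunction] = field(default_factory=list)

    def create_vfs(self, count: int) -> None:
        # The PF configures and manages its associated VFs.
        self.vfs = [VirtualFunction(index=i) for i in range(count)]

    def assign_vf(self, vm: str) -> VirtualFunction:
        # Hand the first unassigned VF to the VM for exclusive use.
        for vf in self.vfs:
            if vf.assigned_vm is None:
                vf.assigned_vm = vm
                return vf
        raise RuntimeError("no free VF on " + self.name)

    def owner_map(self) -> Dict[int, Optional[str]]:
        return {vf.index: vf.assigned_vm for vf in self.vfs}


pf = PhysicalFunction("pf0")
pf.create_vfs(2)
vf_a = pf.assign_vf("vm-a")
vf_b = pf.assign_vf("vm-b")
```

Once both VFs are assigned, a further `assign_vf` call raises an error in this sketch, reflecting that a VF is not shared between VMs.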
However, from the software perspective, such a migration of a virtual function imposes challenges because the virtual machine monitor (VMM) does not have the hardware (or device) specific knowledge needed to save and restore device states. Furthermore, some device states (invisible states) may be invisible to the software, and some such invisible device states uniquely identify the status of the virtual function. As the invisible states are not available to the software, the migration of a virtual function from a source platform to a target platform may be a challenge.
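The difficulty described above may be illustrated, in a simplified and non-limiting way, by a device model in which software can read only part of the device state. The names `VfDevice`, `software_snapshot`, and `naive_migrate`, and the register names used, are hypothetical and serve only this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class VfDevice:
    visible: Dict[str, int] = field(default_factory=dict)  # readable by software
    hidden: Dict[str, int] = field(default_factory=dict)   # internal to the device

    def software_snapshot(self) -> Dict[str, int]:
        # Software such as the VMM can capture only the visible part.
        return dict(self.visible)

    def full_state(self) -> Dict[str, int]:
        # What actually identifies the status of the virtual function.
        return {**self.visible, **self.hidden}


def naive_migrate(source: VfDevice) -> VfDevice:
    """Rebuild the device on the target from the software-visible snapshot only."""
    return VfDevice(visible=source.software_snapshot())


src_vf = VfDevice(visible={"ctrl": 1}, hidden={"dma_head": 42})
dst_vf = naive_migrate(src_vf)
```

Here the restored device matches the source in every state that software can observe, yet its full state differs, so the virtual function on the target is not in the same status as on the source.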
Current hypervisors in the computing platforms may support migration of virtual functions from a source platform to a target platform only in a limited manner. The hypervisors may use a bonding driver in the guest OS to bond an assigned network card (ANIC) driver with a software emulated virtual network card (VNIC) driver, and switching between these drivers is performed on demand. The ANIC driver may run as the master driver at runtime. For example, when the migration happens, a virtual hot plug removal event may be delivered to the guest to unplug the ANIC device, and the bonding driver may switch the network service to the VNIC driver to maintain network connectivity. Such an approach may be termed Mobile Pass through (MPT).
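The switching behavior of the MPT approach described above may be sketched, in a simplified and non-limiting way, as follows. The class `BondingDriver` and its methods are hypothetical names for this illustration only and do not represent the interface of any actual bonding driver.

```python
from typing import Optional


class BondingDriver:
    """Bonds an ANIC (master) with a VNIC (fallback) inside the guest OS."""

    def __init__(self, anic: str, vnic: str) -> None:
        self.anic = anic
        self.vnic = vnic
        self.present = {anic, vnic}
        self.active: Optional[str] = anic  # the ANIC runs as master at runtime

    def hot_unplug(self, nic: str) -> None:
        # A virtual hot plug removal event delivered to the guest.
        self.present.discard(nic)
        if self.anic in self.present:
            self.active = self.anic
        elif self.vnic in self.present:
            self.active = self.vnic  # keep connectivity via the emulated NIC
        else:
            self.active = None

    def send(self, payload: str) -> str:
        if self.active is None:
            raise RuntimeError("no network device available")
        return f"{self.active}:{payload}"


bond = BondingDriver("anic0", "vnic0")
before = bond.send("ping")
bond.hot_unplug("anic0")  # migration begins: the passthrough device is removed
after = bond.send("ping")
```

In this sketch the traffic moves from the passthrough path to the emulated path once the ANIC is unplugged, and the switch depends on the guest OS handling the removal event.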
However, MPT offers limited migration capabilities for SR-IOV, and such limited capabilities may not be suitable in a cloud computing environment such as the environment 100. The cloud computing environment may be a private cloud, such as an enterprise data center, or a public cloud. For example, the virtual hot plug removal event may lead to several challenges in a cloud computing environment, namely: (1) cloud users may experience a degraded user experience during a virtual hot plug removal event while using infrastructure as a service (IAAS) and platform as a service (PAAS); (2) the service level agreement (SLA) performance of the guest OS that participates in the migration may be impacted, as the guest OS may become slow in responding to transactions in time, and such degradation in response while performing a migration involving a VF may, for example, last for 5-10 seconds; (3) rapid VM checkpoint based high availability is affected substantially with MPT, and such non-availability may not be suitable in a cloud computing environment; (4) MPT makes the migration process dependent on the guest OS, and if the guest OS participating in the migration is busy or has been tampered with, the migration process may not be completed; (5) a legacy (guest) OS may not support the virtual hot plug event; and (6) a legacy (guest) OS may not support bonding drivers. Thus, there is a need for a migration technique that is efficient and offers high availability of the platforms in a cloud computing environment.