A virtual machine is the product of "virtualization," in which an actual physical machine is configured to implement the behavior of the virtual machine. Multiple virtual machines (VMs) can be installed on a physical host machine, referred to as a 'host', which includes physical system hardware that typically includes one or more physical processors (PCPUs), physical memory, and various other physical devices, such as an IO storage adapter that performs the protocol conversions required to access remote storage, for example over a storage area network (SAN). A VM typically has both virtual system hardware and guest system software, including virtual drivers used for various virtual devices. The virtual system hardware ordinarily includes one or more virtual processors, virtual memory, at least one virtual disk, and one or more virtual devices, all of which may be implemented in software using known techniques to emulate the corresponding physical components. One or more layers of co-resident software components comprise a virtualization intermediary, e.g., a virtual machine monitor (VMM), a hypervisor, or some combination thereof, that acts to instantiate and provision VMs and to allocate host resources dynamically and transparently among the VMs so that their respective guest operating systems can run concurrently on a single physical machine.
Interrupts are used in modern computing systems for a variety of purposes including, by way of example, to notify processors of external events and to facilitate communication between processors of a multiprocessor system. Typically, an interrupt interrupts normal processing and temporarily diverts flow of control to an interrupt service routine ("ISR"). Various activities of a computing system can trigger interrupts. Some examples are reading or writing from a data storage device and receiving a network packet. Computing systems typically comprise one or more interrupt controllers that direct and arbitrate the flow of interrupts in a system. Interrupt controllers are responsible for prioritizing incoming interrupts and directing them to the appropriate processor in a multiprocessor system. An interrupt controller may be realized in hardware, either as a discrete component or integrated with a processor.
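The prioritize-and-direct behavior described above can be sketched in a few lines. The following is a minimal illustrative model, not any real controller's interface: pending interrupts are kept in a priority queue, and the round-robin-by-vector CPU selection is an assumption made purely for illustration (real controllers use programmable routing tables).

```python
import heapq

class InterruptController:
    """Minimal sketch of an interrupt controller: pending interrupts are
    prioritized and dispatched to a target processor."""

    def __init__(self, num_cpus):
        self.num_cpus = num_cpus
        self.pending = []  # min-heap of (priority, vector); lower = more urgent

    def raise_interrupt(self, priority, vector):
        heapq.heappush(self.pending, (priority, vector))

    def dispatch(self):
        """Pop the highest-priority pending interrupt and pick a target CPU.
        The vector-modulo-CPU routing here is illustrative only."""
        if not self.pending:
            return None
        priority, vector = heapq.heappop(self.pending)
        target_cpu = vector % self.num_cpus
        return (vector, target_cpu)
```

With two interrupts pending, the lower-numbered (more urgent) priority is dispatched first regardless of arrival order.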
Interrupt controllers also may be virtualized. This is typically accomplished through a combination of software and virtualization assists provided by hardware. The software may be a part of a virtual machine monitor that performs the same basic functions as a physical interrupt controller. Typically, a VMM accepts physical interrupts and redirects them to guest operating systems as virtual interrupts.
High input/output (IO) rate applications such as datacenter applications can issue hundreds of very small IO operations in parallel resulting in tens of thousands of IOs per second (IOPS). For high IO rates, the processor overhead for handling all the interrupts can become quite high and eventually can lead to lack of processor resources for the application itself. Processor overhead can be even more of a problem in virtualization scenarios where many virtual machines run on a single multi-processor system, for example.
Traditionally, interrupt coalescing or moderation has been used in IO storage controller cards to limit the number of times application execution is interrupted by the device to handle IO completions. Interrupt coalescing may involve dropping an interrupt so that it is never delivered or delaying delivery of an interrupt. Interrupt coalescing techniques generally balance an increase in IO latency with the improved execution efficiency that can be achieved through a reduction in the number of interrupts. In hardware controllers, fine-grained timers have been used in conjunction with interrupt coalescing to keep an upper bound on the latency of IO completion notifications. However, such timers can be inefficient to use in a hypervisor.
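The combination of batching with a timer-enforced latency bound can be sketched as follows. This is an illustrative model under assumed names and parameters (`batch_size`, `max_delay`), not code from any real controller or hypervisor: an interrupt fires either when enough completions accumulate or when the oldest held completion has waited too long.

```python
import time

class CoalescingTimer:
    """Sketch of interrupt coalescing with a latency upper bound: deliver an
    interrupt once `batch_size` completions accumulate, or once `max_delay`
    seconds elapse since the first held completion, whichever comes first."""

    def __init__(self, batch_size=8, max_delay=0.0002, clock=time.monotonic):
        self.batch_size = batch_size
        self.max_delay = max_delay
        self.clock = clock          # injectable for testing
        self.held = 0
        self.first_held_at = None

    def on_completion(self):
        """Record one IO completion; return True if an interrupt should fire."""
        if self.held == 0:
            self.first_held_at = self.clock()
        self.held += 1
        return self._should_fire()

    def on_timer_tick(self):
        """Periodic timer check that enforces the latency upper bound."""
        return self.held > 0 and self._should_fire()

    def _should_fire(self):
        if self.held >= self.batch_size:
            return True
        return (self.clock() - self.first_held_at) >= self.max_delay

    def fire(self):
        """Deliver one interrupt covering all held completions; reset state."""
        delivered = self.held
        self.held = 0
        self.first_held_at = None
        return delivered
```

The `on_timer_tick` path is exactly the fine-grained timer mentioned above; maintaining such a periodic tick cheaply is what is difficult inside a hypervisor.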
One proposed virtualized system using virtualized interrupts includes a hypervisor (e.g., VMKernel) and a virtual machine monitor (VMM). See I. Ahmad, A. Gulati, and A. Mashtizadeh, "Improving Performance with Interrupt Coalescing for Virtual Machine Disk I/O in VMware ESX Server," Second International Workshop on Virtualization Performance Analysis, Characterization, and Tools (VPACT '09), held with IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2009. The hypervisor controls access to physical resources among virtual machines (VMs) and provides isolation and resource allocation among virtual machines running on top of it. The VMM is responsible for correct and efficient virtualization of the processor instruction set architecture as well as common, high performance devices made available to the guest. The VMM is the conceptual equivalent of a "process" to the hypervisor. The VMM intercepts the privileged operations from a VM, including IO, and handles them in cooperation with the hypervisor.
FIG. 1 is an illustrative drawing showing the flow of an interrupt that services an IO completion in a computing system that includes a virtual machine (VM) running on a virtual machine monitor (VMM) and a hypervisor, which reside on a host machine coupled to a physical IO storage adapter. The hypervisor, executing storage stack code on the physical processor, is shown on the right, and an example VM (and guest operating system) running on top of its VMM on the processor is shown on the left. An interrupt is received from the IO storage adapter during a first stage 102 of the flow. During a second stage 104, appropriate code in the hypervisor is executed to handle the IO completion all the way up to a vSCSI subsystem, which narrows the IO to a specific VM. In a third stage 106, the hypervisor posts the IO completion in a queue in a memory area that each VMM shares with the hypervisor. During a fourth stage 108, the hypervisor may issue an inter-processor interrupt (IPI) to notify the VMM. During a fifth stage 110, the VMM can pick up the completions on its next execution and process them in a sixth stage 112. A virtual interrupt is fired in a seventh stage 114.
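The staged flow above can be sketched as a completion queue shared between the hypervisor and a VMM, with the IPI modeled as a flag. All class and method names here are illustrative assumptions, not the cited system's actual interfaces.

```python
from collections import deque

class SharedQueue:
    """Stands in for the memory area a VMM shares with the hypervisor."""
    def __init__(self):
        self.completions = deque()
        self.ipi_pending = False

class Hypervisor:
    def __init__(self):
        self.vm_queues = {}  # vm_id -> SharedQueue

    def register_vm(self, vm_id):
        self.vm_queues[vm_id] = SharedQueue()
        return self.vm_queues[vm_id]

    def on_physical_interrupt(self, vm_id, io_token):
        """Stages 1-4: handle the adapter interrupt, narrow the completion
        to a VM (the vSCSI step), post it in the shared queue, and notify
        the VMM (the flag stands in for an inter-processor interrupt)."""
        q = self.vm_queues[vm_id]
        q.completions.append(io_token)
        q.ipi_pending = True

class VMM:
    def __init__(self, queue):
        self.queue = queue
        self.virtual_interrupts_fired = 0

    def run(self):
        """Stages 5-7: on its next execution, drain posted completions and
        fire one virtual interrupt into the guest if any were processed."""
        processed = 0
        while self.queue.completions:
            self.queue.completions.popleft()
            processed += 1
        self.queue.ipi_pending = False
        if processed:
            self.virtual_interrupts_fired += 1
        return processed
```

Note that two posted completions are collapsed into a single virtual interrupt when the VMM drains the queue, which is the batching opportunity the coalescing techniques below exploit.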
In this proposed system, improved performance of high-IOPS workloads, in terms of higher throughput and lower CPU utilization, can be achieved through interrupt coalescing, which allows some batching of I/O completions at the guest level. In order to avoid the undesirable side-effect of increasing latency, this coalescing only occurs when I/O levels exceed a threshold. This threshold can be set high enough that performance does not degrade under trickle I/O or latency-bound conditions.
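The threshold-gated behavior can be sketched as follows. This is a minimal illustration under assumed names and values (`iops_threshold`, `batch_size`), not the coalescing policy of the cited system: below the threshold every completion is delivered immediately, above it deliveries are batched.

```python
class ThresholdCoalescer:
    """Sketch of threshold-gated interrupt coalescing: completions are
    delivered immediately while the measured IO rate is at or below
    `iops_threshold`; above it, deliveries are batched `batch_size` at a
    time so trickle or latency-bound workloads are unaffected."""

    def __init__(self, iops_threshold=20000, batch_size=8):
        self.iops_threshold = iops_threshold
        self.batch_size = batch_size
        self.held = 0

    def on_completion(self, current_iops):
        """Return the number of completions to deliver now (0 = held back)."""
        if current_iops <= self.iops_threshold:
            # Low IO rate: deliver immediately, plus anything still held
            # from a period of high IO rate.
            delivered = self.held + 1
            self.held = 0
            return delivered
        self.held += 1
        if self.held >= self.batch_size:
            delivered = self.held
            self.held = 0
            return delivered
        return 0
```

Flushing held completions as soon as the rate drops back below the threshold keeps the latency penalty confined to periods when the workload is throughput-bound anyway.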