In a virtualized computing system in which a single virtual machine CPU (VCPU) of a virtual machine (VM) runs on a single host CPU, a device such as a host device may generate an interrupt, or an inter-processor interrupt may be received from a remote CPU of another host. An interrupt is a signal to the CPU or VCPU, raised either by hardware or by an instruction in software, indicating that the source of the interrupt needs immediate attention. It signals a high-priority condition that requires interrupting the code the CPU or VCPU is currently executing. The CPU or VCPU typically responds by suspending its current activities, saving its state, and executing a small program called an interrupt handler (or interrupt service routine, ISR) to deal with the event. This interruption is temporary: after the interrupt handler finishes, the processor resumes execution of the previous thread.
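The suspend, save, handle, and resume sequence described above can be illustrated with a toy model. This is only a sketch: the `ToyCpu` class, the vector number 1, and the `keyboard_isr` handler are hypothetical names invented for illustration, and a real CPU saves far more state than a program counter.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyCpu:
    """Hypothetical CPU model: a program counter plus a log of what ran."""
    pc: int = 0
    log: List[str] = field(default_factory=list)
    handlers: Dict[int, Callable[["ToyCpu"], None]] = field(default_factory=dict)

    def step(self) -> None:
        # Execute one "instruction" of the interrupted thread.
        self.log.append(f"thread@{self.pc}")
        self.pc += 1

    def interrupt(self, vector: int) -> None:
        saved_pc = self.pc            # suspend and save state
        self.handlers[vector](self)   # run the interrupt handler (ISR)
        self.pc = saved_pc            # restore state; the thread resumes here


def keyboard_isr(cpu: ToyCpu) -> None:
    # The handler runs elsewhere and deliberately clobbers the program counter.
    cpu.pc = 9000
    cpu.log.append("isr:keyboard")


cpu = ToyCpu(handlers={1: keyboard_isr})
cpu.step()
cpu.interrupt(1)
cpu.step()
print(cpu.log)  # ['thread@0', 'isr:keyboard', 'thread@1']
```

The second `thread@` entry continues at address 1 rather than 9000, which is the point of saving and restoring state around the handler.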
There are generally two types of interrupt. A hardware interrupt is an electronic alerting signal to the CPU from an external device, either a part of the computer itself such as a disk controller or an external peripheral. For example, pressing a key on the keyboard or moving the mouse triggers hardware interrupts that cause the CPU to read the keystroke or mouse position. Unlike the software type, hardware interrupts are asynchronous and can occur in the middle of instruction execution, requiring additional care in programming. The act of initiating a hardware interrupt is referred to as an interrupt request (IRQ).
A software interrupt is usually caused either by an exceptional condition in the CPU itself or by a special instruction in the instruction set that causes an interrupt when it is executed. The former is often called a trap or exception and is used for errors or events occurring during program execution that are exceptional enough that they cannot be handled within the program itself. For example, if the processor's arithmetic logic unit is commanded to divide a number by zero, this impossible demand will cause a divide-by-zero exception, perhaps causing the computer to abandon the calculation or display an error message.
Each interrupt typically has its own interrupt handler. The number of hardware interrupts can be limited by the number of interrupt request (IRQ) lines to the CPU, but there may be hundreds of different software interrupts.
If implemented in hardware, an interrupt controller circuit such as a Programmable Interrupt Controller (PIC) may be connected between the interrupting device and the CPU's interrupt pin to multiplex several sources of interrupt onto the one or two CPU lines typically available.
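The multiplexing role of an interrupt controller can be sketched as follows. The `ToyPic` class is hypothetical, and the rule that the lowest-numbered line is serviced first is an assumption made for the sketch, not something stated above; real controllers have configurable priority schemes.

```python
class ToyPic:
    """Hypothetical interrupt controller: many IRQ lines in, one CPU line out."""

    def __init__(self, num_lines: int = 8) -> None:
        self.num_lines = num_lines
        self.pending = 0  # bit i set => IRQ line i is requesting service

    def raise_irq(self, line: int) -> None:
        if not 0 <= line < self.num_lines:
            raise ValueError(f"no such IRQ line: {line}")
        self.pending |= 1 << line

    @property
    def cpu_line_asserted(self) -> bool:
        # The single output to the CPU stays high while anything is pending.
        return self.pending != 0

    def acknowledge(self) -> int:
        """Return the lowest-numbered pending line and clear it (assumed priority rule)."""
        if not self.pending:
            raise RuntimeError("spurious acknowledge: nothing pending")
        lowest_bit = self.pending & -self.pending  # isolate the lowest set bit
        line = lowest_bit.bit_length() - 1
        self.pending &= ~lowest_bit
        return line


pic = ToyPic()
pic.raise_irq(5)
pic.raise_irq(2)
first = pic.acknowledge()
second = pic.acknowledge()
print(first, second, pic.cpu_line_asserted)  # 2 5 False
```

Two devices share one CPU line here; the CPU learns which device needs service only by asking the controller.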
A message-signaled interrupt usually does not use a physical interrupt line. Instead, a device can signal its request for service by sending a short message over some communications medium, typically a computer bus. Rather than using a special message type reserved for interrupts, message-signaled interrupts usually use a memory write message type. PCI computer buses (including serial PCI Express and the parallel PCI and PCI-X bus types) can use message-signaled interrupts.
PCI devices typically expose special configuration structures, called MSI or MSI-X capability structures, that allow operating system software to enable a device to assert an interrupt by means of a message-signaled interrupt. A message-signaled interrupt allows the device to write a small amount of data to a special address in memory space (e.g., in a message capability register of a PIC). The PIC can then deliver the corresponding interrupt to a CPU.
PCI defines two optional extensions to support message-signaled interrupts, MSI and MSI-X. While PCI Express remains compatible with legacy interrupts at the software level, it requires MSI or MSI-X. MSI (first defined in PCI 2.2) permits a device to allocate 1, 2, 4, 8, 16 or 32 interrupts. The device is programmed with an address to write to (e.g., the message address field/register of the message capability register of a PIC), and a 16-bit data word to identify the specific interrupt (e.g., the message data fields/registers of the message capability register of a PIC). The interrupt number is added to the data word to identify the interrupt. Some platforms such as Windows may not use all 32 interrupts but rather use up to 16 interrupts.
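The way an MSI message is formed from one address, one base data word, and an interrupt number can be sketched as below. The function name `msi_message` and the example values are hypothetical. The check that the low bits of the base data word are clear is an assumption of this sketch, chosen so that adding the interrupt number never carries into the upper bits.

```python
from typing import Tuple

MSI_ALLOWED_COUNTS = (1, 2, 4, 8, 16, 32)  # allocation sizes permitted by MSI


def msi_message(address: int, base_data: int, allocated: int,
                interrupt_number: int) -> Tuple[int, int]:
    """Return the (address, data) pair a device would write for one interrupt."""
    if allocated not in MSI_ALLOWED_COUNTS:
        raise ValueError(f"MSI allocation must be one of {MSI_ALLOWED_COUNTS}")
    if not 0 <= interrupt_number < allocated:
        raise ValueError("interrupt number outside the allocated range")
    if not 0 <= base_data <= 0xFFFF:
        raise ValueError("MSI data word is 16 bits")
    if base_data & (allocated - 1):
        # Assumption of this sketch: low bits are reserved for the interrupt number.
        raise ValueError("base data word must have its low bits clear")
    # Every interrupt of the device shares the same address; only the data differs.
    return address, base_data + interrupt_number


# Illustrative values only: one address, eight allocated interrupts.
msg = msi_message(0xFEE00000, 0x4040, allocated=8, interrupt_number=3)
print(hex(msg[0]), hex(msg[1]))  # 0xfee00000 0x4043
```

Note that all interrupts of the device target the same address, which is the restriction MSI-X later relaxes.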
MSI-X (first defined in PCI 3.0) permits a device to allocate up to 2048 interrupts. The single address used by original MSI was found to be restrictive for some architectures. MSI-X allows for a larger number of interrupts and gives each one a separate target address and data word. Devices with MSI-X may not necessarily support 2048 interrupts but typically support at least 64, which is double the maximum number of MSI interrupts.
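The difference from MSI, namely that each MSI-X interrupt carries its own target address and data word, can be sketched with a small table model. The `MsixTable` class and the example addresses and data values are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

MSIX_MAX_VECTORS = 2048  # upper bound on MSI-X interrupts per device

Message = Tuple[int, int]  # (target address, data word)


class MsixTable:
    """Hypothetical model of a per-device table with one entry per interrupt."""

    def __init__(self, size: int) -> None:
        if not 1 <= size <= MSIX_MAX_VECTORS:
            raise ValueError(f"table size must be 1..{MSIX_MAX_VECTORS}")
        # None marks an entry that has not been programmed yet.
        self.entries: List[Optional[Message]] = [None] * size

    def program(self, index: int, address: int, data: int) -> None:
        # Each entry gets its own address and data, unlike MSI's shared address.
        self.entries[index] = (address, data)

    def fire(self, index: int) -> Message:
        entry = self.entries[index]
        if entry is None:
            raise RuntimeError(f"vector {index} has not been programmed")
        return entry


table = MsixTable(64)
table.program(0, 0xFEE00000, 0x41)
table.program(1, 0xFEE01000, 0x52)
print(table.fire(0), table.fire(1))
```

Because the two entries point at different addresses, the two interrupts could be steered to different destinations independently.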
In a virtualized environment, a virtual machine does not have direct access to the PIC of the host. Events signaled from a physical device to a guest of a virtual machine are handled by an intervening software layer providing the virtualization, commonly referred to as a hypervisor (also known as a virtual machine monitor (VMM)).
The hypervisor emulates a virtual programmable interrupt controller (VPIC). When the guest makes a request to the VPIC to program an emulated MSI/MSI-X virtual device, the hypervisor traps the request and programs the PIC of the host with fields from an MSI/MSI-X capability table to be written to the MSI/MSI-X capability register of a corresponding hardware device of the host.
When the hardware device of the host writes an MSI/MSI-X message containing an MSI/MSI-X capability table to the specified address in the PIC to raise an MSI/MSI-X interrupt, the hypervisor, in turn, traps the message and forwards the MSI/MSI-X capability table to the specified address in the VPIC, where it is then handled by the guest.
Unfortunately, the guest is not aware of the configurations stored in the MSI/MSI-X capability registers of a PIC or VPIC, or of what addresses the PIC of the host has selected for a specific device. Accordingly, both the indirect programming of a real device of a host by a guest and the subsequent servicing of an MSI/MSI-X event raised by the real device of the host by a VPIC of a guest typically result in significant overhead and processing time.
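The two trapped paths described above, programming and delivery, can be sketched with a toy hypervisor. Everything here is hypothetical: the `ToyHypervisor` class, the choice of host address and data values, and the use of a simple counter to stand in for the cost of each trap. The sketch only shows that the guest's values and the host's values differ and that each direction costs one hypervisor intervention.

```python
from typing import Dict, List, Tuple

Message = Tuple[int, int]  # (address, data)


class ToyHypervisor:
    """Hypothetical hypervisor that remaps guest MSI settings to host settings."""

    HOST_ADDRESS = 0xFEE00000  # illustrative host-side target address

    def __init__(self) -> None:
        self.routes: Dict[Message, Message] = {}  # host message -> guest message
        self.traps = 0                            # counts hypervisor interventions
        self._next_host_data = 0x30

    def guest_programs_msi(self, guest_msg: Message) -> Message:
        """Programming path: the guest's write to the emulated device is trapped."""
        self.traps += 1
        host_msg = (self.HOST_ADDRESS, self._next_host_data)
        self._next_host_data += 1
        self.routes[host_msg] = guest_msg
        return host_msg  # what is actually programmed into the real device

    def device_raises(self, host_msg: Message, vpic_inbox: List[Message]) -> None:
        """Delivery path: the device's message is trapped and forwarded to the VPIC."""
        self.traps += 1
        vpic_inbox.append(self.routes[host_msg])


hv = ToyHypervisor()
vpic_inbox: List[Message] = []
guest_msg = (0xFEE00000, 0x41)
host_msg = hv.guest_programs_msi(guest_msg)  # guest never sees host_msg
hv.device_raises(host_msg, vpic_inbox)
print(host_msg != guest_msg, vpic_inbox == [guest_msg], hv.traps)  # True True 2
```

In this model the guest only ever sees its own values, while every programming request and every delivered interrupt passes through the hypervisor, which is the source of the overhead noted above.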