In a traditional operating system (OS) such as Linux, user-level programs request system services by making system calls. Similarly, in a hypervisor, also known as a virtual machine manager or monitor (VMM), such as Xen, a guest operating system uses hypercalls to request services from the hypervisor. To simplify the design of the hypervisor, Xen places all of its device drivers and designated system daemons in a special privileged domain, called domain 0. Because there is no thread support in the Xen hypervisor itself, domain 0 is the only place where such system daemons may run.
This is different from traditional Linux, which may run system daemons as kernel threads; for example, the network file system (NFS) daemon handles network packets and file system structures inside the Linux kernel. This has two significant implications for both developers and system performance. First, a kernel thread may easily access kernel data structures. Second, a kernel thread has its own process address space and may be scheduled and context-switched like a normal process. Unlike a Linux kernel thread, domain 0 may not easily access or modify data structures in the hypervisor, but must request services from the hypervisor to do the job. For domain 0, or an operating system in a virtual machine (VM), referred to as a guest OS, to request services from the Xen hypervisor, the hypercall application program interface (API) in Xen provides functionality similar to the system call in a typical OS kernel. Such services include retrieving important hypervisor data structures, allocating resources for non-privileged VMs, performing I/O requests, and so on. Nevertheless, this interface does not scale well when a system daemon requests a large number of services, i.e., issues numerous hypercalls. Because each hypercall incurs extra overhead from the switch between the guest OS and the hypervisor, the daemon or overall system performance may suffer if the daemon issues hypercalls one by one.
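The analogy between a hypercall interface and a system-call table may be sketched as follows. This is an illustrative model only; the operation numbers, handler names, and return values are hypothetical and do not reflect the real Xen ABI.

```python
# Hypothetical stand-ins for hypercall handlers; real Xen defines
# its own operations and argument conventions.
def hv_get_version(_args):
    # Models a hypercall that retrieves hypervisor information.
    return 0x040011  # made-up version value

def hv_resource_op(args):
    # Models a hypercall that performs work on behalf of the guest.
    return sum(args)  # placeholder work

# Dispatch table mapping operation numbers to handlers, much like
# a system-call table in a conventional kernel (numbers are made up).
HYPERCALL_TABLE = {
    0: hv_get_version,
    1: hv_resource_op,
}

def hypercall(op, args=()):
    """Model a single trap into the hypervisor: dispatch one operation."""
    handler = HYPERCALL_TABLE.get(op)
    if handler is None:
        return -38  # Unix-style -ENOSYS for an unknown operation
    return handler(args)
```

Each call to `hypercall` models one guest-to-hypervisor boundary crossing, which is the per-call overhead the passage above refers to.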
Nowadays, a guest operating system may choose either to issue hypercalls one by one, or to send them as a batch and block until all of them are completed. A system daemon that wants to request a service from the hypervisor has to use the hypercall API provided by the hypervisor. The multicall API is designed to enable a guest OS to submit a sequence of hypercalls in one shot, thereby reducing the number of context switches between the guest OS and the hypervisor. The multicall API can thus reduce the overall hypercall overhead. However, each multicall is synchronous, which means that the caller and the related virtual central processing unit (VCPU), referred to as the VCPUh, block until all hypercalls inside the multicall are finished. As shown in FIG. 1, a large number of hypercalls in a virtual machine VM-X will block other VMs, such as VM-Y, from running, because the hypervisor does not context-switch away from the hypercalls, for example H2-H8, issued by VM-X during a given time slice. In addition, the interface is designed to run all calls serially, so a multicall may only utilize the physical CPU (PCPU) on which the VCPUh is scheduled, even when the guest domain is assigned multiple VCPUs that could run on multiple PCPUs.
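The trade-off described above can be modeled with a small sketch: a batch pays one guest-to-hypervisor crossing for the whole sequence, but the entries still execute serially and the caller blocks until the batch completes. The cost constants here are assumptions chosen for illustration, not measured Xen overheads.

```python
SWITCH_COST = 5   # assumed cost of one guest<->hypervisor crossing
WORK_COST = 1     # assumed cost of one hypercall's actual work

def one_by_one(n):
    """Each hypercall traps separately: n boundary crossings."""
    return n * (SWITCH_COST + WORK_COST)

def multicall(entries):
    """Model of a synchronous multicall: one crossing for the whole
    batch, but the entries still run serially on a single (V)CPU and
    the caller blocks until every result is available."""
    results = [entry() for entry in entries]          # serial execution
    cost = SWITCH_COST + WORK_COST * len(entries)     # single crossing
    return results, cost
```

For a batch of n calls, the batched cost is SWITCH_COST + n * WORK_COST versus n * (SWITCH_COST + WORK_COST) for one-by-one issuing, which is why batching helps; the serial, blocking execution inside `multicall` is what prevents a long batch from yielding the CPU to other VMs.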
Some schemes may issue a deferrable function call to defer the work consisting of hypercall routines. Deferring the work may be implemented in several ways, such as the approach adopted in Linux interrupt handlers and device drivers, asynchronous execution, or execution when Xen is idle.
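A minimal sketch of such deferral, in the spirit of the deferrable functions mentioned above: routines are queued immediately and drained later, for example at an idle point. The queue class and its method names are hypothetical, not an API from Xen or Linux.

```python
from collections import deque

class DeferredWork:
    """Toy model of a deferred-work queue for hypercall routines."""

    def __init__(self):
        self._queue = deque()

    def defer(self, routine, *args):
        """Record a routine to run later and return immediately,
        so the caller is not blocked by the work itself."""
        self._queue.append((routine, args))

    def drain(self):
        """Run all deferred routines, e.g., when the system is idle,
        and return their results in submission order."""
        results = []
        while self._queue:
            routine, args = self._queue.popleft()
            results.append(routine(*args))
        return results
```

The key property is that `defer` is cheap and non-blocking, while the actual hypercall work is paid inside `drain` at a moment of the scheduler's choosing.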
Some reference publications address these issues or provide methods to improve system performance in a virtual machine environment. For example, one reference publication disclosed a method for attenuating spin waiting of virtual processors in a virtual machine environment, so that the virtual processors may obtain extra time-slice extensions when accessing a synchronization section. This method addresses a scheduling issue in a virtualization environment. Another reference publication disclosed a message receiving method for a message passing interface (MPI) library in a virtual machine over-allocation environment. The message receiving method is independent of the virtual machine layer's dispatching mechanism. By modifying the message receiving mechanism of the MPI library, the method may improve system performance by coordinating the two dispatching mechanisms in the virtual environment, i.e., the guest operating system dispatching processes to virtual processors, and the virtual machine dispatching manager dispatching virtual processors to physical processors.