In virtual machine technology, a user can create and run multiple operating environments on a server at the same time. Each operating environment, or virtual machine, requires its own operating system (OS) and can run software applications independently from the other virtual machines. Virtual machine technology provides many benefits, as it can lower information technology (IT) costs through increased efficiency, flexibility and responsiveness. Each virtual machine acts as a separate environment that reduces risks and allows developers to quickly recreate different OS configurations or compare versions of applications designed for different OSs. At the same time, when many virtual machines are running on a single server, it is important to keep the virtual machines in an idle state for as much time as possible to decrease energy use and the consumption of other computing system resources.
Generally, every modern (and most legacy) operating system has a central processing unit (“CPU”) scheduler, or process scheduler. During operation, the scheduler chooses a process and the active threads from that process that are in a “ready to execute” state. When a new process is selected for execution, the scheduler initiates a process and thread context switch to save the state of the previously executed thread and then load the new thread. After the context switch, the “ready to execute” thread begins execution. At this point, the CPU executes the code of the running thread until there is either a synchronous call to wait for some resource (or synthetic synchronization object) to be released or an asynchronous hardware interrupt. For example, a synchronous call may be the Win32 API WaitForMultipleObjects( ), the POSIX pthread_cond_wait( ), or another similar call that causes the calling thread to enter a blocked state while it waits for some resource to be released or some conditional event to occur. Hardware interrupt events, in turn, can occur as a result of some embedded device operation (e.g., a local APIC timer event, an IPI sent from one CPU to another CPU, or the like) or the functioning of an external peripheral device (e.g., completion of a read operation from a hard disk).
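The blocking behavior of such a synchronous call can be illustrated with a minimal sketch. It uses Python's threading.Condition as a stand-in for a call such as pthread_cond_wait( ); the class ResourceGate and the function demo are hypothetical names introduced only for this illustration and are not part of any OS API.

```python
import threading


class ResourceGate:
    """Toy resource guarded by a condition variable (hypothetical example)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._available = False

    def acquire(self, timeout=None):
        # The calling thread blocks here until release() signals the
        # condition or the optional timeout (in seconds) elapses.
        with self._cond:
            ok = self._cond.wait_for(lambda: self._available, timeout=timeout)
            if ok:
                self._available = False  # consume the resource
            return ok

    def release(self):
        # Make the resource available and wake one waiting thread.
        with self._cond:
            self._available = True
            self._cond.notify()


def demo():
    """Start a worker that waits for the resource, then release it."""
    gate = ResourceGate()
    results = []
    worker = threading.Thread(
        target=lambda: results.append(gate.acquire(timeout=5.0))
    )
    worker.start()
    gate.release()
    worker.join()
    return results
```

While the worker is inside acquire( ) and the resource is unavailable, it consumes no CPU time; an OS scheduler would treat such a thread as blocked rather than “ready to execute”.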
In existing operating systems, when a process thread is waiting for some resource or condition, the scheduler moves the thread to a “wait” state, and the thread remains blocked until the requested resource becomes available. For a hardware interrupt, an OS interrupt handler will start execution and call the interrupt handler of the specific device driver (e.g., a kernel module extension) to perform the necessary interrupt processing for the device. The scheduler will then check whether the time quantum dedicated to the current process has expired. If the quantum has not expired, the scheduler will pass control to the OS to return from the interrupt routine and restore the interrupted thread context. However, if the time quantum has expired during the hardware interrupt event, the scheduler will move the interrupted thread from the “running” state to the “ready to execute” state at the back of the queue, select the next “ready to execute” thread, and then switch the context to the new thread.
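The quantum check described above can be sketched as a small round-robin simulation. This is an illustrative model under simplifying assumptions (one tick per timer interrupt, no blocked threads), not the scheduler of any particular OS; SimThread, ToyScheduler and quantum_ticks are hypothetical names.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class SimThread:
    name: str
    state: str = "ready"
    used_ticks: int = 0


class ToyScheduler:
    """Round-robin scheduler driven by timer interrupts (hypothetical model)."""

    def __init__(self, threads, quantum_ticks):
        self.quantum_ticks = quantum_ticks
        self.ready = deque(threads)
        self.running = None
        self._dispatch()

    def _dispatch(self):
        # Context switch: take the thread at the front of the ready queue.
        self.running = self.ready.popleft() if self.ready else None
        if self.running is not None:
            self.running.state = "running"
            self.running.used_ticks = 0

    def on_timer_interrupt(self):
        """Return the name of the thread that runs after the interrupt."""
        current = self.running
        if current is None:
            return None
        current.used_ticks += 1
        if current.used_ticks < self.quantum_ticks:
            # Quantum not expired: resume the interrupted thread.
            return current.name
        # Quantum expired: requeue at the back and switch to the next thread.
        current.state = "ready"
        self.ready.append(current)
        self._dispatch()
        return self.running.name
```

For example, with two threads A and B and a quantum of two ticks, five successive timer interrupts resume A, switch to B, resume B, switch back to A, and resume A.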
To save energy and computing resources, the OS will normally enter an idle state when all (or most) of the processes and threads are in one of the blocked states and waiting for some condition. In hardware, this is usually implemented as a special processor state into which the OS moves the CPU by executing a special instruction or series of instructions. For example, on the Intel IA-32, Intel 64 and AMD64 platforms, a HLT (“halt”) instruction is usually used by the guest OS to move the processor to the idle state. This instruction puts the processor into a low energy consumption state, where it waits until a subsequent hardware interrupt occurs before leaving the idle state. Sometimes this is done with the MONITOR/MWAIT instructions instead. Moreover, on ARM platforms, the guest OS moves the processor to the idle state using the WFE or WFI instructions.
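The idle behavior can be modeled, in a deliberately simplified way, as a loop that halts while every thread is blocked and wakes once per interrupt. The sketch below is a simulation only: it does not execute HLT, MWAIT, WFI or WFE, and the function run_idle_loop and its arguments are hypothetical.

```python
from collections import deque

BLOCKED = "blocked"
READY = "ready"


def run_idle_loop(thread_states, interrupts):
    """Simulate an OS idle loop.

    thread_states: dict mapping thread name -> state (mutated in place).
    interrupts: iterable of (irq_name, thread_to_unblock_or_None).
    Returns the number of simulated halts executed.
    """
    pending = deque(interrupts)
    halts = 0
    while all(state == BLOCKED for state in thread_states.values()):
        if not pending:
            # Real hardware would stay halted; the simulation just stops.
            break
        halts += 1  # models one halt instruction, ended by an interrupt
        _irq, wakes = pending.popleft()
        if wakes is not None:
            thread_states[wakes] = READY  # the interrupt unblocked a thread
    return halts
```

In this model, an interrupt that unblocks no thread (for example, a periodic timer tick) wakes the processor only for it to halt again on the next loop iteration.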
In a virtualized environment, the virtual machine monitor (“VMM”) or hypervisor will emulate the corresponding instructions and place the virtual processor in the blocked state to wait for an emulated asynchronous event or for a timeout set by some time-based event (e.g., an emulated timer device interrupt). One technical issue with this configuration is that the timeout for the sleep/blocked state of a virtual processor is calculated in accordance with the nearest emulated hardware time event (e.g., an interrupt of the emulated PIT, local APIC timer, CMOS, or the like). Moreover, each virtual processor sleep and wake-up transition takes time to execute additional state transition code. Furthermore, when the guest OS is in a deep idle state (especially for conventional non-tickless OSs), the timer interrupt is raised only to increment certain counters, check for any unblocked processes and threads, and then return the guest OS to the sleep state, for example by executing the HLT or MONITOR/MWAIT instructions again. Each transition into and out of the sleep state therefore wastes time (i.e., in long sleep states without active threads) executing guest interrupt handlers and driver code, switching processor states and, in the case of virtualization, emulating the excess behavior.
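The timeout calculation and the cost of periodic ticks can be illustrated with a short sketch. It assumes that each emulated timer device exposes its next deadline in nanoseconds (or None when not armed); the function names, the dictionary layout and the nanosecond unit are assumptions made for this example and do not describe any particular hypervisor.

```python
def next_wake_timeout(now_ns, timer_deadlines_ns):
    """Sleep timeout for a halted virtual processor.

    timer_deadlines_ns: dict mapping an emulated timer name to its next
    deadline in nanoseconds, or None if that timer is not armed.
    Returns the time until the nearest deadline, 0 if one is already due,
    or None if no timer is armed (sleep until an asynchronous event).
    """
    pending = [d for d in timer_deadlines_ns.values() if d is not None]
    if not pending:
        return None
    return max(0, min(pending) - now_ns)


def idle_wakeups(idle_ns, tick_period_ns):
    """Number of periodic tick interrupts delivered during an idle interval."""
    return idle_ns // tick_period_ns
```

As a usage example, next_wake_timeout(1_000, {"pit": 5_000, "lapic": 3_000, "cmos": None}) yields 2_000, since the local APIC timer fires first. Under the same assumptions, a guest with a 4 ms periodic tick that is idle for one second is woken idle_wakeups(1_000_000_000, 4_000_000) = 250 times, each time incurring the sleep and wake-up transition cost described above.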