Hardware virtualization is the process of creating a virtual machine that acts like a computer with an operating system. Software executed on these virtual machines is typically separated from the underlying hardware resources. A hypervisor is a program that allows guest virtual machines to run concurrently on a host computer. The hypervisor presents a virtual operating platform to the guests and manages the execution of the guest operating systems. Thus, multiple instances of a variety of operating systems may share the virtualized hardware resources.
The prior art describes virtualization architectures having a hypervisor that are further extended to expose the hardware of the system to upper layers. Such extensions involve the use of, for example, nested virtualization, in which an additional level of virtualization takes place above a virtual platform. A typical nested virtualization environment includes three layers of virtualization over the hardware infrastructure: a host hypervisor, guest hypervisors, and VMs. Each of the guest hypervisors controls the execution of a plurality of VMs. In this architecture, each VM can execute one or more guest operating systems (although VMs can also execute without any guests). The problem with such a virtualization architecture is that it is very slow, as many software components are involved in the execution of a guest OS or of any application executed by the VM.
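The layering described above can be pictured with a toy model. The following Python sketch is an illustration only, not part of any described architecture: the `Layer` class and the `path_to_hardware` function are hypothetical names, and the model merely lists the software layers that sit between a VM and the hardware.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Layer:
    name: str
    below: Optional["Layer"] = None  # the layer this one runs on top of


def path_to_hardware(layer: Optional[Layer]) -> List[str]:
    """Walk downward from a layer and collect every layer name on the way."""
    path = []
    while layer is not None:
        path.append(layer.name)
        layer = layer.below
    return path


# Three virtualization layers over the hardware infrastructure.
hardware = Layer("hardware")
host_hypervisor = Layer("host hypervisor", below=hardware)
guest_hypervisor = Layer("guest hypervisor", below=host_hypervisor)
vm = Layer("VM", below=guest_hypervisor)

print(path_to_hardware(vm))
```

In this simplified picture, every additional entry in the returned path stands for another software component that may take part in executing a guest OS or an application.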
As the guest OS runs in a limited memory address space, there is not enough space to access the full hardware; thus, hardware emulation is required, resulting in significantly slower execution. For example, when the hypervisor needs to respond to a system call by a guest, execution must move from one address space to another; traps are utilized for this purpose, which results in duplication of execution environments. This happens because the move from one address space to another also involves a multitude of traps that require additional processing and hinder performance. Moreover, as hardware emulation in software is required, the overall performance is further reduced.
Typically, a trap initiates a full operation that relinquishes control from the guest OS and transfers the control to the hypervisor. This involves, for example, switching from execution in Ring 0 to execution in Ring 3, which entails significant overhead. The execution takes place at the hypervisor level, and control must then be relinquished to the guest, which again involves overhead to return to Ring 0 execution. Rings, or protection rings, are hierarchical protection domains utilized to protect data and functionality from faults and malicious actions. Each protection ring provides a different level of access to hardware/software resources. In a typical operating system, the most privileged ring is the kernel ring, or Ring 0, which interacts directly with the physical hardware (e.g., the CPU and memory), while the least privileged is Ring 3.
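The ring hierarchy and the cost of a trap round trip can be illustrated with a toy model. The following Python sketch is a simplification under stated assumptions: `can_access`, `Cpu`, and `trap_round_trip` are hypothetical names, the starting ring and the handler ring are parameters chosen by the caller, and each ring switch is simply counted rather than timed.

```python
from dataclasses import dataclass


def can_access(current_ring: int, required_ring: int) -> bool:
    """Lower ring numbers are more privileged; Ring 0 may access everything."""
    return current_ring <= required_ring


@dataclass
class Cpu:
    ring: int             # ring the processor is currently executing in
    transitions: int = 0  # ring switches performed so far

    def switch_to(self, ring: int) -> None:
        # Count only real changes of ring; each one stands for costly overhead.
        if ring != self.ring:
            self.ring = ring
            self.transitions += 1


def trap_round_trip(cpu: Cpu, handler_ring: int) -> None:
    """Model one trap: leave the current ring, handle, then return."""
    origin = cpu.ring
    cpu.switch_to(handler_ring)  # control is relinquished to the handler
    cpu.switch_to(origin)        # control is handed back afterwards


# Example values (assumptions): code in Ring 3 trapping to a handler in Ring 0.
cpu = Cpu(ring=3)
for _ in range(5):
    trap_round_trip(cpu, handler_ring=0)
print(cpu.transitions)
```

In this model, five traps cost ten ring transitions, since every trap pays for the switch in both directions.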
To further appreciate the complexity of handling the move from one level to another, one may also consider the case of a page fault at the guest. A page fault typically results in an exception to the firmware of the guest and, from there, an exception to the kernel, which involves moving to a different ring. Each such operation is very costly in terms of performance. One of the problems in handling page faults this way is the fact that there is no data of the guest OS in the kernel (Ring 0), a potentially risky proposition that is solved at times by using segmentation limits. That way, the user cannot see the data that is in the kernel.
However, such support is not generally or otherwise efficiently available in modern 64-bit processors, and hence workarounds are needed. To this end, a limited number of instructions are available (typically some thirteen instructions for an X86® architecture); however, the need to monitor the workarounds when they occur results in significant overhead.
Typical prior art solutions first check for all places in the code where it will be necessary to move between the guest and the hypervisor; such code is typically replaced by a jump command. This is necessary because prior art solutions specifically refrain from executing the kernel of the guest in the same security ring as the hypervisor. Therefore, prior art solutions typically execute both the kernel and the application of the guest in the same security ring, for example, Ring 3, while the hypervisor is executed, for example, in Ring 0. An exemplary case of a long jump between the hypervisor and the kernel, as well as the application, of the guest is shown in FIG. 1.
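The scan-and-replace step described above can be sketched in a few lines. The following Python example is a toy illustration, not the method of any particular prior art system: guest code is modeled as a list of mnemonic strings, the `SENSITIVE` set holds placeholder mnemonics chosen only for the example, and `patch_guest_code` and the `hv_stub_` labels are hypothetical names.

```python
# Placeholder mnemonics chosen for illustration only; not an authoritative list.
SENSITIVE = {"popf", "sgdt", "smsw"}


def patch_guest_code(code):
    """Return (patched_code, patched_sites) for a list of mnemonic strings."""
    patched = []
    sites = []
    for index, instruction in enumerate(code):
        if instruction in SENSITIVE:
            # Replace the instruction with a jump to a hypothetical handler stub.
            patched.append(f"jmp hv_stub_{index}")
            sites.append(index)
        else:
            # Ordinary instructions are left untouched.
            patched.append(instruction)
    return patched, sites


guest = ["mov", "popf", "add", "sgdt", "ret"]
patched, sites = patch_guest_code(guest)
print(patched)
print(sites)
```

Every entry in `sites` marks a place where execution would leave the guest and jump toward the hypervisor, which is where the transition overhead discussed above would be incurred.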
It would therefore be advantageous to provide a solution that overcomes the deficiencies of the prior art. It would be further advantageous if such a solution maintained the security requirements of the various rings of the operating system.