1. Field of the Invention
This invention relates to a computer architecture, including a virtual machine monitor, and a related operating method that allow virtualization of the resources of a modern computer system.
2. Description of the Related Art
The operating system plays a special role in today's personal computers and engineering workstations. Indeed, it is the only piece of software that is typically ordered at the same time the hardware itself is purchased. Of course, the customer can later change operating systems, upgrade to a newer version of the operating system, or even re-partition the hard drive to support multiple boots. In all cases, however, a single operating system runs at any given time on the computer. As a result, applications written for different operating systems cannot run concurrently on the system.
Various solutions have been proposed to solve this problem and eliminate this restriction. These include virtual machine monitors, machine simulators, application emulators, operating system emulators, embedded operating systems, legacy virtual machine monitors, and boot managers. Each of these systems is described briefly below.
One solution that was the subject of intense research in the late 1960s and 1970s came to be known as the "virtual machine monitor" (VMM). See, for example, R. P. Goldberg, "Survey of virtual machine research," IEEE Computer, Vol. 7, No. 6, 1974. During that time, moreover, IBM Corp. adopted a virtual machine monitor for use in its VM/370 system.
A virtual machine monitor is a thin piece of software that runs directly on top of the hardware and virtualizes all the resources of the machine. Since the exported interface is the same as the hardware interface of the machine, the operating system cannot determine the presence of the VMM. Consequently, when the exported interface is compatible with the underlying hardware, the same operating system can run either on top of the virtual machine monitor or on top of the raw hardware.
Virtual machine monitors were popular at a time when hardware was scarce and operating systems were primitive. By virtualizing all the resources of the system, multiple independent operating systems could coexist on the same machine. For example, each user could have her own virtual machine running a single-user operating system.
The research in virtual machine monitors also led to the design of processor architectures that were particularly suitable for virtualization. These architectures allowed virtual machine monitors to use a technique known as "direct execution," which simplifies the implementation of the monitor and improves performance. With direct execution, the VMM sets up the processor in a mode with reduced privileges so that the operating system cannot directly execute its privileged instructions. The execution with reduced privileges generates traps, for example when the operating system attempts to issue a privileged instruction. The VMM thus needs only to correctly emulate the traps to allow the correct execution of the operating system in the virtual machine.
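The trap-and-emulate behavior of direct execution described above can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not the implementation of any particular monitor: the instruction names (ADD, CLI, STI, SET_PTBR), the VirtualCPU fields, and the function names are all hypothetical, and a Python exception stands in for a hardware trap.

```python
class PrivilegedInstructionTrap(Exception):
    """Stands in for the hardware trap raised at reduced privilege."""

    def __init__(self, opcode, operand):
        super().__init__(opcode)
        self.opcode = opcode
        self.operand = operand


class VirtualCPU:
    """Virtual privileged state the monitor keeps on behalf of the guest."""

    def __init__(self):
        self.interrupts_enabled = True
        self.page_table_base = 0
        self.accumulator = 0


# Hypothetical privileged instructions of the toy architecture.
PRIVILEGED = {"CLI", "STI", "SET_PTBR"}


def execute_directly(vcpu, opcode, operand):
    """Model the hardware running one guest instruction with reduced privileges."""
    if opcode in PRIVILEGED:
        raise PrivilegedInstructionTrap(opcode, operand)
    if opcode == "ADD":
        vcpu.accumulator += operand
    else:
        raise ValueError("unknown instruction: " + opcode)


def emulate_trap(vcpu, trap):
    """Monitor side: apply the trapped instruction to the virtual state only."""
    if trap.opcode == "CLI":
        vcpu.interrupts_enabled = False
    elif trap.opcode == "STI":
        vcpu.interrupts_enabled = True
    elif trap.opcode == "SET_PTBR":
        vcpu.page_table_base = trap.operand


def run_guest(program):
    """Run a list of (opcode, operand) pairs and count the emulated traps."""
    vcpu = VirtualCPU()
    traps = 0
    for opcode, operand in program:
        try:
            execute_directly(vcpu, opcode, operand)
        except PrivilegedInstructionTrap as trap:
            emulate_trap(vcpu, trap)
            traps += 1
    return vcpu, traps
```

In this sketch, unprivileged instructions never reach the monitor, which is the source of the performance benefit attributed to direct execution; only the privileged ones incur the cost of emulation.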
As hardware became cheaper and operating systems more sophisticated, VMMs based on direct execution began to lose their appeal. Recently, however, they have been proposed to solve specific problems. For example, the Hypervisor system provides fault-tolerance, as is described by T. C. Bressoud and F. B. Schneider, in "Hypervisor-based fault tolerance," ACM Transactions on Computer Systems (TOCS), Vol. 14, No. 1, February 1996; and in U.S. Pat. No. 5,488,716 "Fault tolerant computer system with shadow virtual processor," (Schneider, et al.). As another example, the Disco system runs commodity operating systems on scalable multiprocessors. See "Disco: Running Commodity Operating Systems on Scalable Multiprocessors," E. Bugnion, S. Devine, K. Govil and M. Rosenblum, ACM Transactions on Computer Systems (TOCS), Vol. 15, No. 4, November 1997, pp. 412-447.
Virtual machine monitors can also provide architectural compatibility between different processor architectures by using a technique known as either "binary emulation" or "binary translation." In these systems, the VMM cannot use direct execution because the virtual and underlying architectures differ; rather, it must emulate the virtual architecture on top of the underlying one. This allows entire virtual machines (operating systems and applications) written for one processor architecture to run on top of another. For example, the IBM DAISY system has recently been proposed to run PowerPC and x86 systems on top of a VLIW architecture. See, for example, K. Ebcioglu and E. R. Altman, "DAISY: Compilation for 100% Architectural Compatibility," Proceedings of the 24th International Symposium on Computer Architecture, 1997.
Machine simulators, also known as machine emulators, run as application programs on top of an existing operating system. They emulate all the components of a given computer system with enough accuracy to run an operating system and its applications. Machine simulators are often used in research to study the performance of multiprocessors. See, for example, M. Rosenblum, et al., "Using the SimOS machine simulator to study complex computer systems," ACM Transactions on Modeling and Computer Simulation, Vol. 7, No. 1, January 1997. They have also been used to simulate an Intel x86 machine as the "VirtualPC" or "RealPC" products on a PowerPC-based Apple Macintosh system.
Machine simulators share binary emulation techniques with some VMMs such as DAISY. They differentiate themselves from VMMs, however, in that they run on top of a host operating system. This has a number of advantages as they can use the services provided by the operating system. On the other hand, these systems can also be somewhat constrained by the host operating system. For example, an operating system that provides protection never allows application programs to issue privileged instructions or to change its address space directly. These constraints typically lead to significant overheads, especially when running on top of operating systems that are protected from applications.
Like machine simulators, application emulators also run as an application program in order to provide compatibility across different processor architectures. Unlike machine simulators, however, they emulate application-level software and convert the application's system calls into direct calls into the host operating system. These systems have been used in research for architectural studies, as well as to run legacy binaries written for the 68000 architecture on newer PowerPC-based Macintosh systems. They have also been used to run x86 applications written for Microsoft NT on Alpha workstations running Microsoft NT. In all cases, the expected operating system matches the underlying one, which simplifies the implementation. Other systems such as Insignia's SoftWindows use binary emulation to run Windows applications and a modified version of the Windows operating system on platforms other than PCs. At least two known systems allow Macintosh applications to run on other systems: the Executer runs them on Intel processors running Linux or Next and MAE runs them on top of the Unix operating system.
Operating system (OS) emulators allow applications written for one given operating system application binary interface (ABI) to run on another operating system. They translate all system calls made by the application for the original operating system into a sequence of system calls to the underlying operating system. ABI emulators are currently used to allow Unix applications to run on Windows NT (the Softway OpenNT emulator) and to run applications written for Microsoft's operating systems on public-domain operating systems (the Linux WINE project).
Unlike virtual machine monitors and machine simulators, which are essentially independent of the operating system, ABI emulators are intimately tied to the operating system that they are emulating. Operating system emulators differ from application emulators in that the applications are already compiled for the instruction set architecture of the target processor. The OS emulator does not need to concern itself with the execution of the applications, but only with the calls that they make to the underlying operating system.
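The translation performed by an OS emulator, in which one call of the original ABI becomes a sequence of calls to the underlying operating system, can be sketched as follows. This is an illustrative toy only: the HostOS class, its handle-based calls, and the guest calls guest_creat and guest_write are hypothetical names chosen for the example and do not correspond to the interfaces of OpenNT, WINE, or any real operating system.

```python
class HostOS:
    """Hypothetical underlying operating system with handle-based file calls."""

    def __init__(self):
        self.files = {}        # file name -> contents
        self.handles = {}      # handle -> file name
        self.next_handle = 1
        self.log = []          # records which host calls were issued

    def create_file(self, name):
        self.log.append("create_file")
        self.files.setdefault(name, bytearray())

    def open_handle(self, name):
        self.log.append("open_handle")
        handle = self.next_handle
        self.next_handle += 1
        self.handles[handle] = name
        return handle

    def write_handle(self, handle, data):
        self.log.append("write_handle")
        self.files[self.handles[handle]].extend(data)
        return len(data)


class AbiEmulator:
    """Translates calls of a hypothetical guest ABI into host OS calls."""

    def __init__(self, host):
        self.host = host

    def guest_creat(self, path):
        # One guest call maps to a sequence of two host calls.
        self.host.create_file(path)
        return self.host.open_handle(path)

    def guest_write(self, descriptor, data):
        # One guest call maps to exactly one host call.
        return self.host.write_handle(descriptor, data)
```

The sketch contains no instruction emulation at all, which reflects the point made above: the application's code runs natively, and only its calls into the operating system are intercepted and translated.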
Emulating an ABI at the user level is not an option if the goal is to provide additional guarantees to the applications that are not provided by the host operating system. For example, the VenturCom RTX Real-Time subsystem embeds a real-time kernel within the Microsoft NT operating system. This effectively allows real-time processes to co-exist with traditional NT processes within the same system.
This co-existence requires the modification of the lowest levels of the operating system, that is, its Hardware Abstraction Layer (HAL). This allows the RTX system to first handle all I/O interrupts. This solution is tightly coupled with Windows NT, since both environments share the same address space and interrupt entry points.
Certain processors, most notably those with the Intel architecture, contain a special execution mode that is specifically designed to virtualize a given legacy architecture. This mode is designed to support the strict virtualization of the legacy architecture, but not of the existing architecture.
A legacy virtual machine monitor consists of the appropriate software support that allows running the legacy operating system using the special mode of the processor. Specifically, Microsoft's DOS virtual machine runs DOS in a virtual machine on top of Microsoft Windows and NT. As another example, the freeware DOSEMU system runs DOS on top of Linux.
Although these systems are commonly referred to as a form of virtual machine monitor, they run either on top of an existing operating system, such as DOSEMU, or as part of an existing operating system such as Microsoft Windows and Microsoft NT. In this respect, they are quite different from the true virtual machine monitors described above, and from the definition of the term "virtual machine monitor" applied to the invention described below.
Finally, boot managers such as the public-domain LILO and the commercial System Commander facilitate changing operating systems by managing multiple partitions on the hard drive. The user must, however, reboot the computer to change operating systems. Boot managers therefore do not allow applications written for different operating systems to coexist. Rather, they simply allow the user to reboot another operating system without having to reinstall it, that is, without having to remove the previous operating system.
All of these known systems fail to meet one or more of the following goals:
1) to provide the advantages offered by traditional virtual machine monitors, such as the ability to run multiple arbitrary operating systems concurrently;
2) to provide portability of the virtual machine monitor across a wide range of platforms; and
3) to maximize performance by using the underlying hardware as much as possible.
This invention provides a system that achieves these goals, as well as a related operating method.
The invention virtualizes a computer that includes a host processor, memory, and physical system devices. A conventional operating system (referred to below as the "host operating system" or "HOS") is installed on the hardware. The computer is operationally divided into a system level and a user level and the computer accepts and carries out a pre-determined set of privileged instruction calls only from sub-systems at the system level. The invention provides at least one virtual machine monitor (VMM) that is installed to be co-resident with the host operating system at the system level.
In the preferred embodiment of the invention, a driver is downloaded into and is resident in the host operating system. A host operating system (HOS) context is then saved in the driver. A corresponding virtual machine monitor (VMM) context is saved in the virtual machine monitor. Switching from the HOS context to the VMM context is then carried out in the driver, whereas switching from the VMM context to the HOS context is done in the virtual machine monitor. The driver issues, in the HOS context, commands previously specified by the VMM.
A predetermined physical address space is assigned to the system memory, and virtual addresses are converted into physical addresses of the memory's physical address space in a memory management unit (MMU). HOS context parameters are stored in a first virtual address space in the host operating system and VMM context parameters are stored in the virtual machine monitor in a second address space, which is disjoint from the first address space. The HOS and VMM context parameters also represent different interrupt and exception entry points for the host operating system and the virtual machine monitor, respectively. Switching between the HOS and VMM contexts is then carried out by selectively setting internal registers of the host processor to correspond to the HOS and VMM context parameters, respectively.
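The switching between the HOS and VMM contexts described above can be modeled by a short sketch. It is a simplified illustration under stated assumptions: a context is reduced to two hypothetical registers (a page table base, standing for the address space, and an interrupt table base, standing for the interrupt and exception entry points), and the class and method names are invented for the example. A real processor context would contain considerably more state.

```python
from dataclasses import dataclass


@dataclass
class Context:
    """Hypothetical subset of the processor state that defines a context."""
    page_table_base: int        # selects the virtual address space
    interrupt_table_base: int   # selects interrupt and exception entry points


class Processor:
    """Toy host processor whose internal registers hold the active context."""

    def __init__(self, context):
        self.load(context)

    def load(self, context):
        self.page_table_base = context.page_table_base
        self.interrupt_table_base = context.interrupt_table_base

    def snapshot(self):
        return Context(self.page_table_base, self.interrupt_table_base)


class Driver:
    """Resident in the host operating system; performs the HOS-to-VMM switch."""

    def __init__(self):
        self.saved_hos = None

    def switch_to_vmm(self, cpu, monitor):
        self.saved_hos = cpu.snapshot()    # HOS context is saved in the driver
        cpu.load(monitor.saved_context)    # registers now reflect the VMM context


class Monitor:
    """Virtual machine monitor; performs the VMM-to-HOS switch."""

    def __init__(self, context):
        self.saved_context = context

    def switch_to_hos(self, cpu, driver):
        self.saved_context = cpu.snapshot()  # VMM context is saved in the monitor
        cpu.load(driver.saved_hos)           # registers now reflect the HOS context
```

Because each side saves its own context and restores the other's, neither the driver nor the monitor needs to run in the other's address space; in this sketch that separation is represented only by the two distinct page table base values.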
As a further feature of the invention, a device emulator is preferably installed to be operatively connected to the host operating system. The device emulator accepts commands stored in memory by the VMM via the driver and processes these commands. The emulator also issues host operating system calls and thereby accesses the physical system devices via the host operating system.
At least one virtual machine is preferably installed at user level. The virtual machine includes at least one associated application and a virtual operating system, and is operatively connected to the virtual machine monitor. An entire programmable virtual address space of the computer is thereby made available to both the virtual machine and the virtual machine monitor.
The emulator is preferably operatively connected as an application to the host operating system. In a preferred embodiment of the invention, the driver passes first instruction calls from the virtual machine monitor directly to the host operating system, the system devices, and the processor, and passes second instruction calls from the virtual machine monitor, via the emulator, to the host operating system. The driver also passes data to the virtual machine monitor in response to the first and second instruction calls received both via the emulator and directly from the system devices, the processor and the host operating system. The second instruction calls are thereby executed directly in the host operating system.
In the preferred embodiment of the invention, the commands processed in the driver in the HOS context are processed as remote procedure calls. Similarly, the commands processed in the device emulator, via the driver, are also preferably processed as remote procedure calls.
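The remote-procedure-call style of command processing described above can be sketched as follows. This is a toy model under stated assumptions, not the claimed implementation: the mailbox dictionary stands in for memory shared between the monitor and the driver, a direct method call stands in for the switch to the HOS context, and the command names lock_page and disk_read, along with all class and method names, are hypothetical.

```python
class DeviceEmulator:
    """User-level application that reaches devices through host OS calls."""

    def __init__(self, host_read):
        # host_read is a hypothetical stand-in for a host operating system call.
        self.host_read = host_read

    def handle(self, command):
        if command["op"] == "disk_read":
            return self.host_read(command["sector"])
        raise ValueError("unsupported command: " + command["op"])


class Driver:
    """Resident in the host operating system; services monitor commands."""

    def __init__(self, emulator):
        self.emulator = emulator
        self.mailbox = None          # models memory shared with the monitor
        self.locked_pages = set()

    def service(self):
        command = self.mailbox
        if command["op"] == "lock_page":
            # Handled by the driver itself, in the HOS context.
            self.locked_pages.add(command["page"])
            result = True
        else:
            # Forwarded to the device emulator at user level.
            result = self.emulator.handle(command)
        self.mailbox = {"result": result}


class Monitor:
    """Virtual machine monitor issuing commands as remote procedure calls."""

    def __init__(self, driver):
        self.driver = driver

    def remote_call(self, op, **arguments):
        self.driver.mailbox = {"op": op, **arguments}  # store the command
        self.driver.service()       # stands in for the switch to the HOS context
        return self.driver.mailbox["result"]           # read back the reply
```

From the monitor's point of view both kinds of command look like an ordinary call that returns a value, which is the property that makes the remote procedure call a convenient model for this interaction.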