1. Field of the Invention
The present invention relates generally to Virtual Machine (VM) technology and, more particularly, to methods and systems for emulating the memory segment model of a microprocessor.
2. Background Art
With VM technology, a user can create and run multiple operating environments on a computer at the same time. Each operating environment, or Virtual Machine, requires its own operating system (OS) and can run applications independently. The VM software provides a layer between the hardware of the computing system and the software that runs on it.
Frequently, the problem arises of simultaneously running different operating systems on the same hardware system. For example, with one version of MICROSOFT WINDOWS running on the computing system, it can be necessary to start another instance or another version of WINDOWS or another operating system on the same hardware system.
A typical Virtual Machine Monitor (VMM) enables a single physical machine or processor to act as if it were several physical machines. A VMM, typically jointly with a high-ranking OS (although there are VMMs that can be executed on bare hardware, without a high-ranking OS), can run a number of different operating systems simultaneously, such that each of the different operating systems has its own VM. In other words, a typical VMM can handle a number of VMs, each of which represents its own OS, and each of which can run its own application software and control or use its own hardware, including certain types of processors, I/O and data storage devices, and so on, as if they were running on a single processor. The high-ranking OS is typically referred to as a “host OS” (HOS). The multiple operating systems that are running as VMs are typically referred to as “guest operating systems” (“guest OSs”) running “guest code.”
A conventional approach for implementing VMs includes a VMM approach developed by IBM and implemented on mainframes, which support virtualization. Another approach includes implementing a VMM on modern processors that do not support hardware virtualization, using techniques such as full step-by-step or page-by-page interpretation of the original code, full binary translation of the original code, or a combination of binary translation of some portions of the original code and direct execution of other portions of the original code.
One of the common problems in Virtual Machine technology is the overhead that results from processing of privileged but unsafe instructions, also known as processing of “code under control.” In particular, in the context of some Virtual Machine implementations, such as those described, for example, in U.S. patent application Ser. No. 11/139,787, entitled METHODS AND SYSTEMS FOR SAFE EXECUTION OF GUEST CODE IN VIRTUAL MACHINE CONTEXT, filed on May 31, 2005, which is incorporated herein by reference in its entirety, the high “cost” of the exceptions needed to handle the privileged but unsafe instructions is of particular concern. In other words, triggering, or raising, the exceptions is one of the major sources of overhead.
Memory management in modern microprocessors is a fairly complex subject. Many modern microprocessors have a large address space, for example, an address space defined by 32-bit addresses (i.e., a 2^32-byte, or four-gigabyte, address space), or 64-bit addresses (which corresponds to 2^64 possible addresses). Most practical computers do not have that much physical memory. Typically, only a fraction of the total address space that is theoretically possible is actually physically available in a particular computer. Therefore, complex schemes need to be implemented to ensure that an address specified in the instruction being executed is directed to an address that actually physically exists. To solve this problem, various translation mechanisms exist to convert specified nominal (linear) addresses to actual physical addresses, and to ensure that units of memory (pages, segments, etc.) are swapped in and out of physical memory, as appropriate.
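By way of illustration only, the conversion of a linear address to a physical address can be sketched as follows. This is a deliberately simplified model, assuming 4 KB pages and a single-level page table represented as a dictionary; the names PAGE_SIZE, PageFault and linear_to_physical are hypothetical and are not taken from any particular processor or implementation.

```python
PAGE_SIZE = 4096  # assumed page size (4 KB); real processors may support other sizes


class PageFault(Exception):
    """Raised when the referenced page is not resident in physical memory."""


def linear_to_physical(page_table, linear):
    """Map a linear address to a physical address.

    page_table is a hypothetical single-level mapping of
    page number -> physical frame number; a missing key models
    a page that is currently swapped out or unmapped.
    """
    page_number, offset = divmod(linear, PAGE_SIZE)
    frame = page_table.get(page_number)
    if frame is None:
        # A real system would swap the page in and retry the access.
        raise PageFault(f"page {page_number:#x} not present")
    return frame * PAGE_SIZE + offset


if __name__ == "__main__":
    table = {0: 5, 1: 2}  # pages 0 and 1 are resident in frames 5 and 2
    print(hex(linear_to_physical(table, 0x1234)))
```

In this sketch, an access to a page that has no entry in the table raises an exception, which corresponds to the point at which a real system would bring the page into physical memory.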
In the context of Virtual Machine technology, the issue arises in ensuring that the emulation correctly takes place, and that the emulation is transparent to the guest code of the Virtual Machine, notwithstanding the need to reconcile the architectural and segment issues of memory addressing. To achieve this, it is desirable to utilize the various useful capabilities of the processor's segment model in the virtualization of the processor.
The second issue that a Virtual Machine designer needs to address is how to utilize the segment architecture of the processor for both the implementation of the Virtual Machine itself, and for various tasks that the Virtual Machine Monitor may need to perform.
In the INTEL architecture, any memory access requires segment translation. To execute an instruction, the CPU uses code segment translation. To access data, the CPU uses data segment translation. Any instruction utilizes at least one segment register to address memory (the code segment). For example, in the INTEL processor, CS is the code segment register used to execute instructions, SS is the stack segment register used by default to store data on the stack, DS is the data segment register, and ES, FS, and GS are segment registers for other data accesses. The processor checks the code segment register (CS) prior to execution of any code. Code (instructions) can be grouped into different code segments, and for each segment there is a corresponding descriptor value in the CS register (which defines the start (base) address of the segment, the privilege level, and the limit, or size, of the segment, such that the entire segment in memory is linear and is uninterrupted in terms of its addresses). Similarly, the data segment registers (SS, DS, ES, FS, GS) are usually used to access data in memory. Other items that comprise descriptors include the segment granularity, the Present/Not Present bit, the descriptor privilege level (DPL), the type of the descriptor, etc.
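As a non-limiting illustration of the descriptor fields listed above, the following sketch decodes an 8-byte protected-mode segment descriptor, given as a 64-bit integer, into its base, limit, type, DPL, Present bit and granularity, and then applies the limit check when forming a linear address. The names SegmentDescriptor, decode_descriptor and segment_to_linear are hypothetical, and the sketch assumes an ordinary (expand-up) segment; expand-down segments and the remaining descriptor flags are ignored.

```python
from dataclasses import dataclass


@dataclass
class SegmentDescriptor:
    base: int             # 32-bit start address of the segment
    limit: int            # last valid byte offset within the segment
    seg_type: int         # 4-bit type field
    dpl: int              # descriptor privilege level, 0..3
    present: bool         # Present / Not Present bit
    granularity_4k: bool  # True if the raw limit is counted in 4 KB units


def decode_descriptor(raw):
    """Split a 64-bit descriptor value into its architectural fields."""
    limit = (raw & 0xFFFF) | (((raw >> 48) & 0xF) << 16)
    base = ((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24)
    granularity_4k = bool((raw >> 55) & 1)
    if granularity_4k:
        # With 4 KB granularity the low 12 bits of the limit are all ones.
        limit = (limit << 12) | 0xFFF
    return SegmentDescriptor(
        base=base,
        limit=limit,
        seg_type=(raw >> 40) & 0xF,
        dpl=(raw >> 45) & 0x3,
        present=bool((raw >> 47) & 1),
        granularity_4k=granularity_4k,
    )


def segment_to_linear(desc, offset):
    """Form a linear address, checking presence and the segment limit."""
    if not desc.present:
        raise ValueError("segment not present")
    if offset > desc.limit:
        raise ValueError("offset exceeds segment limit")
    return (desc.base + offset) & 0xFFFFFFFF


if __name__ == "__main__":
    flat_code = decode_descriptor(0x00CF9A000000FFFF)  # common flat 4 GB code segment
    print(flat_code)
    print(hex(segment_to_linear(flat_code, 0x1000)))
```

An emulator of the segment model would have to reproduce checks of this kind, among others, for the guest code to observe the same behavior as on a real processor.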
Global and Local Descriptor Tables (GDT and LDT) are tables in the physical memory that store segment descriptors for each segment. The value in the segment register is usually referred to as a “selector” and points to an entry in the GDT or the LDT.
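For illustration, a 16-bit selector can be split into a table index (bits 3 through 15), a table indicator bit (bit 2, which selects the GDT when clear and the LDT when set) and a requested privilege level (bits 0 and 1). The following sketch shows this decoding and the resulting table lookup; the names Selector, decode_selector and lookup_descriptor are hypothetical, and the descriptor tables are modeled simply as Python lists of raw descriptor values.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int      # entry number within the selected descriptor table
    uses_ldt: bool  # table indicator: False selects the GDT, True the LDT
    rpl: int        # requested privilege level, 0..3


def decode_selector(sel):
    """Split a 16-bit selector into index, table indicator and RPL."""
    sel &= 0xFFFF
    return Selector(index=sel >> 3, uses_ldt=bool(sel & 0x4), rpl=sel & 0x3)


def lookup_descriptor(sel, gdt, ldt):
    """Return the raw descriptor that a selector refers to.

    gdt and ldt are hypothetical stand-ins for the in-memory tables,
    modeled here as lists of 64-bit descriptor values.
    """
    s = decode_selector(sel)
    if not s.uses_ldt and s.index == 0:
        # Entry 0 of the GDT is the null descriptor and cannot be used.
        raise ValueError("null selector")
    table = ldt if s.uses_ldt else gdt
    if s.index >= len(table):
        raise IndexError("selector index beyond table limit")
    return table[s.index]


if __name__ == "__main__":
    gdt = [0, 0x00CF9A000000FFFF]
    ldt = [0x11, 0x22]  # placeholder values, not meaningful descriptors
    print(decode_selector(0x23))
    print(hex(lookup_descriptor(0x08, gdt, ldt)))
```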
The challenge, therefore, is to ensure that the behavior of the virtualized code matches what it would have been, were it not virtualized, given attempts by the guest code to manipulate the segment registers, and the GDT and LDT. Another issue that arises is how to efficiently virtualize the processor's segment model for the purposes of both the VM and the VMM.
Accordingly, what is needed are methods and systems for efficient emulation of the segment model when running a Virtual Machine.