Virtual computing allows multiple virtual machines, each having its own operating system, to run on a host computer. The host computer has a virtualizer program that can virtualize the hardware resources of the host machine for virtual machine use. The virtual machine that is requesting hardware resources such as CPU, memory, I/O and disk space is called a guest with respect to the host computer. In a virtual machine, the guest computer system may exist in the host computer system only as a pure software representation of the operation of one specific hardware architecture. In some instances, a virtual machine may have direct access to specific hardware such as an I/O device. A virtualizer program executing on the operating system software and hardware architecture of the host computer mimics the operation of the appropriate portions of the guest computer system.
When a guest accesses virtual hardware, the virtualizer program intercedes and provides emulation or translation of the virtualized device. In one virtual machine environment embodiment, the emulated environment may include a virtual machine monitor (VMM), which is a software layer that runs directly above the host hardware, perhaps running side-by-side and working in conjunction with the host operating system. The VMM can virtualize all the resources of the host machine (as well as certain virtual resources) by exposing interfaces that are the same as those of the hardware the VMM is virtualizing. In a virtual machine environment, the multiple virtual machines impose functional requirements on the hardware resources of the host machine. For example, it is desirable to keep each virtual machine separated from the other virtual machines as well as from the host. Separation or isolation of one virtual machine from another is useful to segregate errors and faults such that a fault in one virtual machine does not affect another virtual machine.
In one configuration, a guest virtual machine may attempt to access graphics hardware that is virtualized. Typical graphics support in the host computer includes a graphics processing unit (GPU) that has its own memory separate from the CPU or main memory. Often, the GPU is connected to the host CPU via an interface, such as PCI, that interconnects the CPU board and the separate graphics processor board.
In many existing systems, the GPU accesses memory through a graphics address remapping table (GART). The GART allows the GPU to access disjoint physical memory as if it were linearly contiguous. This gives the GPU a mechanism similar to the address translation that the memory management unit (MMU) provides to the CPU. A GART implementation provides only one set of translation tables, so there is a single graphics context that is shared among all applications that use graphics. A user-mode application submits its graphics commands to a kernel-mode driver, which then translates the addresses used in the commands from the user-mode space to the single GPU context defined by the GART. Newer systems provide a more flexible addressing scheme that allows multiple graphics contexts to be defined. A user-mode application is assigned a graphics context. The application creates graphics commands within that context and submits them to the GPU for execution. When the GPU executes the commands associated with a specific context, it ensures that it is using the address translation table associated with that user-mode application. This avoids the time-consuming step of command editing by the kernel-mode graphics driver.
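The difference between a single GART and per-context translation tables can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the 4 KiB page size, the flat single-level table layout, the frame numbers, and the names `translate`, `gart`, `contexts` and `gpu_translate` do not come from any particular hardware or driver.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def translate(table, addr):
    """Map a linear graphics address to a physical address.

    `table[i]` holds the physical frame number backing graphics page i,
    so scattered physical frames look contiguous to the GPU.
    """
    page, offset = divmod(addr, PAGE_SIZE)
    return table[page] * PAGE_SIZE + offset


# Single GART: one table shared by every application that uses graphics.
gart = [7, 2, 9]  # hypothetical frame numbers

# Multi-context scheme: one translation table per user-mode application.
contexts = {
    "app_a": [7, 2],
    "app_b": [5, 11],
}


def gpu_translate(context_id, addr):
    """Translate using the table that belongs to the given graphics context."""
    return translate(contexts[context_id], addr)
```

With these hypothetical tables, the same graphics address 4100 resolves to frame 2 for `app_a` and frame 11 for `app_b`, so neither application's commands need to be rewritten by the driver before the GPU executes them.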
The multi-context translation hardware (GMMU) that is used by the GPU is normally included as part of the graphics subsystem, and programming of the GMMU is normally a function of the kernel-mode graphics driver. In systems that have multiple virtual machines, it should not be possible for the kernel-mode graphics driver in one virtual machine to have unrestricted access to memory in other virtual machines, which it would have if the GMMU were able to reference main system memory directly. To prevent unlimited access to system memory by graphics (or any other I/O device), systems include hardware that filters accesses to main system memory by I/O devices. This filtering is performed by the I/O memory management unit (IOMMU) and may be as simple as a one-bit access check or as complex as an address translation. Where the IOMMU performs a full address translation, a GPU access may first be translated by the GMMU and then translated again by the IOMMU before it reaches main system memory.
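The two forms of IOMMU filtering, and the two-stage path through the GMMU and then the IOMMU, might be sketched as follows. This is a simplified model under stated assumptions: the page size, the dictionary-based tables, the `AccessDenied` exception and all function names are hypothetical and do not describe a real IOMMU interface.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class AccessDenied(Exception):
    """Raised when a device access is not permitted (hypothetical)."""


def iommu_check(allowed_frames, phys_addr):
    """Simple filter: one permission bit per frame, address passes unchanged."""
    if phys_addr // PAGE_SIZE not in allowed_frames:
        raise AccessDenied(hex(phys_addr))
    return phys_addr


def iommu_translate(io_table, io_addr):
    """Full filter: remap the device-visible page to a system frame."""
    page, offset = divmod(io_addr, PAGE_SIZE)
    if page not in io_table:
        raise AccessDenied(hex(io_addr))
    return io_table[page] * PAGE_SIZE + offset


def gpu_access(gmmu_table, io_table, gfx_addr):
    """Two-stage path: the GMMU translates first, then the IOMMU."""
    page, offset = divmod(gfx_addr, PAGE_SIZE)
    io_addr = gmmu_table[page] * PAGE_SIZE + offset  # stage 1: GMMU
    return iommu_translate(io_table, io_addr)        # stage 2: IOMMU
```

In this model a kernel-mode driver in one virtual machine can put any value it likes in `gmmu_table`, but the access still fails unless the resulting page appears in the `io_table` that the host controls for that virtual machine.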
There has been a recent move to more integrated host CPU chipsets that incorporate the graphics hardware and have no memory dedicated to the GPU. In this case, the tables used by the GMMU are in main system memory. If the approach to graphics were not changed, then the accesses to main system memory by the GMMU in order to fetch entries in the graphics translation table would have to be translated by the IOMMU, and then the resulting address computed by the GMMU would have to be translated again by the IOMMU. If each GMMU access required two accesses (a two-level table) and each IOMMU access required three accesses, this would result in a total of 9 memory cycles to resolve an address and access the graphics data. This is far too much overhead for the high-performance demands of graphics.
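One way to arrive at the figure of 9 is sketched below. The counting model is an assumption, not something the text spells out: it supposes that each of the two GMMU table fetches and the final data address must each pass through the IOMMU, that each pass costs three memory accesses, and that the table-entry reads and the data read themselves are left out of the count. The function names are hypothetical.

```python
def iommu_translation_accesses(gmmu_levels, iommu_levels):
    """Memory accesses spent on IOMMU translation for one GPU access.

    Assumed model: every GMMU table fetch, plus the final data address,
    is an address that the IOMMU has to translate.
    """
    translated_addresses = gmmu_levels + 1
    return translated_addresses * iommu_levels


def accesses_including_reads(gmmu_levels, iommu_levels):
    """Same model, but also counting the table-entry reads and the data read."""
    reads = gmmu_levels + 1
    return iommu_translation_accesses(gmmu_levels, iommu_levels) + reads


nested_cost = iommu_translation_accesses(gmmu_levels=2, iommu_levels=3)
```

Under these assumptions `nested_cost` is (2 + 1) × 3 = 9. If the reads of the GMMU entries and of the graphics data were counted as well, the same model would give 12, so the exact total depends on what is counted as a cycle.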
Thus, there is a need for an improved method of providing address translations for graphics processes in an integrated graphics chipset environment. Preferably, this improvement may be applicable to virtual machine environments as well as non-virtual machine environments.