Rapid advances have recently taken place in graphics processor unit (GPU) virtualization. Virtualized graphics processing environments are used, for example, in the media cloud, remote workstations/desktops, Interchangeable Virtual Instrumentation (IVI), and rich client virtualization, to name a few. Certain architectures perform full GPU virtualization through trap-and-emulation to emulate a full-featured virtual GPU (vGPU) while still providing near-native performance by passing through performance-critical graphics memory resources.
With the increasing importance of GPUs in servers to support 3D, media and GPGPU workloads, GPU virtualization is becoming more widespread. How to virtualize GPU memory access from a virtual machine (VM) is one of the key design factors. The GPU has its own graphics memory: either dedicated video memory or shared system memory. When system memory is used for graphics, guest physical addresses (GPAs) need to be translated to host physical addresses (HPAs) before being accessed by hardware.
There are various approaches for performing translation for GPUs. Some implementations perform translation with hardware support, but the GPU can then be passed through to only one VM. Another solution is a software approach which constructs shadow structures for the translation. For instance, shadow page tables are implemented in some architectures, such as the full GPU virtualization solution mentioned above, which allows multiple VMs to share a physical GPU.
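The shadow-structure approach can be illustrated with a minimal sketch. It assumes a single-level guest graphics page table and a flat guest-to-host frame map; the names `build_shadow_table`, `translate`, and the 4 KiB `PAGE_SIZE` are hypothetical and chosen only for illustration, not taken from any particular architecture.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def build_shadow_table(guest_table, gpa_to_hpa):
    """Compose the guest's graphics page table with the GPA-to-HPA map.

    guest_table: graphics page number -> guest physical frame number
    gpa_to_hpa:  guest physical frame number -> host physical frame number
    Returns a shadow table: graphics page number -> host physical frame number.
    """
    return {gfx_page: gpa_to_hpa[guest_frame]
            for gfx_page, guest_frame in guest_table.items()}


def translate(shadow_table, gfx_addr):
    """Translate a graphics address to a host physical address."""
    page, offset = divmod(gfx_addr, PAGE_SIZE)
    return shadow_table[page] * PAGE_SIZE + offset


# Hypothetical mappings for one VM.
guest_table = {0: 7, 1: 3}      # written by the guest driver
gpa_to_hpa = {7: 100, 3: 42}    # maintained by the VMM
shadow = build_shadow_table(guest_table, gpa_to_hpa)
host_addr = translate(shadow, PAGE_SIZE + 16)
```

In this sketch the hardware would be pointed at `shadow` rather than at `guest_table`, so each VM can keep its own guest table while the VMM maintains one shadow table per VM.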
In some implementations, the guest/VM memory pages are backed by host memory pages. A virtual machine monitor (VMM) (sometimes called a “Hypervisor”) uses extended page tables (EPT), for example, to map from a guest physical address (PA) to a host PA. Various memory sharing technologies may be used, such as Kernel Samepage Merging (KSM).
KSM combines pages with identical content from multiple VMs into a single write-protected page. That is to say, if a memory page in VM1 (mapping from guest PA1 to host PA1) has the same contents as a memory page in VM2 (mapping from guest PA2 to host PA2), the system may use only one host page (say HPA_SH) to back both guest pages. That is, both guest PA1 of VM1 and guest PA2 of VM2 are mapped to HPA_SH with write protection. This reduces system memory usage and is particularly useful for read-only guest memory pages such as code pages and zero pages. With KSM, copy-on-write (COW) is used to remove the sharing once a VM modifies the page content.
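The merge and copy-on-write behavior described above can be sketched with a toy model. This is an illustrative simulation only, not the Linux KSM implementation; the classes `HostMemory` and `VM` and the function `merge_identical_pages` are hypothetical, and each EPT entry is modeled as a `[host_frame, writable]` pair.

```python
class HostMemory:
    """Toy host memory: frame number -> page content, plus a sharer count."""

    def __init__(self):
        self.pages = {}
        self.refcount = {}
        self._next_frame = 0

    def alloc(self, content):
        frame = self._next_frame
        self._next_frame += 1
        self.pages[frame] = content
        self.refcount[frame] = 1
        return frame

    def release(self, frame):
        self.refcount[frame] -= 1
        if self.refcount[frame] == 0:
            del self.pages[frame]
            del self.refcount[frame]


class VM:
    def __init__(self, host):
        self.host = host
        self.ept = {}  # guest frame -> [host frame, writable]

    def map_page(self, guest_frame, content):
        self.ept[guest_frame] = [self.host.alloc(content), True]

    def read(self, guest_frame):
        return self.host.pages[self.ept[guest_frame][0]]

    def write(self, guest_frame, content):
        entry = self.ept[guest_frame]
        frame, writable = entry
        if not writable and self.host.refcount[frame] > 1:
            # Write fault on a shared page: break sharing with a private copy.
            self.host.release(frame)
            entry[0] = self.host.alloc(content)
        else:
            self.host.pages[frame] = content
        entry[1] = True


def merge_identical_pages(host, vms):
    """Back pages with identical content by one write-protected host frame."""
    shared_frame_for = {}  # page content -> host frame chosen to back it
    for vm in vms:
        for entry in vm.ept.values():
            frame = entry[0]
            shared = shared_frame_for.setdefault(host.pages[frame], frame)
            if shared != frame:
                host.release(frame)
                host.refcount[shared] += 1
                entry[0] = shared
    for vm in vms:
        for entry in vm.ept.values():
            if host.refcount[entry[0]] > 1:
                entry[1] = False  # write-protect every shared mapping


host = HostMemory()
vm1, vm2 = VM(host), VM(host)
vm1.map_page(1, bytes(4096))   # guest PA1 of VM1, a zero page
vm2.map_page(2, bytes(4096))   # guest PA2 of VM2, same content
merge_identical_pages(host, [vm1, vm2])
shared_after_merge = vm1.ept[1][0] == vm2.ept[2][0]
pages_after_merge = len(host.pages)
vm2.write(2, b"\x01" * 4096)   # triggers copy-on-write for VM2 only
```

After the merge both guests map to one host frame, which plays the role of HPA_SH; the write from VM2 then gives VM2 a private frame while VM1 keeps reading the original content.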
Mediated pass-through is used in virtualization systems for device performance and sharing, where a single physical GPU is presented as multiple virtual GPUs to multiple guests with direct DMA, while privileged resource accesses from guests are still trapped and emulated. In some implementations, each guest can run the native GPU driver, and device DMA goes directly to memory without hypervisor intervention.
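The split between trapped and passed-through accesses can be sketched as follows. This is a simplified model under stated assumptions: the register names in `PRIVILEGED_REGS`, the classes `PhysicalGPU` and `VirtualGPU`, and the `trap_count` counter are all hypothetical and stand in for whatever a real mediated device would expose.

```python
class PhysicalGPU:
    """Toy physical device: a dictionary standing in for graphics memory."""

    def __init__(self):
        self.memory = {}


class VirtualGPU:
    # Hypothetical privileged resources that the hypervisor must mediate.
    PRIVILEGED_REGS = ("ring_head", "ring_tail", "page_table_base")

    def __init__(self, vm_id, gpu):
        self.vm_id = vm_id
        self.gpu = gpu
        # Per-guest emulated register state kept by the hypervisor.
        self.regs = {name: 0 for name in self.PRIVILEGED_REGS}
        self.trap_count = 0

    def write_register(self, name, value):
        """Privileged access: trapped and emulated against per-guest state."""
        if name not in self.regs:
            raise KeyError(f"unknown register: {name}")
        self.trap_count += 1
        self.regs[name] = value

    def write_memory(self, host_addr, value):
        """Performance-critical access: goes to the device without a trap."""
        self.gpu.memory[host_addr] = value


gpu = PhysicalGPU()
vgpu_a = VirtualGPU("vm-a", gpu)
vgpu_b = VirtualGPU("vm-b", gpu)

vgpu_a.write_register("ring_tail", 8)   # trapped, affects only vm-a state
vgpu_a.write_memory(0x1000, 0xAB)       # passed through
vgpu_b.write_memory(0x2000, 0xCD)       # passed through
```

Each guest sees its own register state, so one physical GPU can be shared, while memory writes bypass the emulation path entirely.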