A computing system (e.g., a server) may include a Graphics Processing Unit (GPU). A hypervisor on the computing system may consolidate Virtual Machines (VMs) on a computing platform of the computing system that includes the GPU. The VMs may share resources associated with the computing platform. The GPU may be a Peripheral Component Interconnect Express (PCIe)-based device that supports Single Root Input/Output Virtualization (SR-IOV). SR-IOV may be designed to deliver interrupts generated in the computing platform to multiple operating system driver stacks. However, such delivery may be associated with high implementation cost and/or complexity for devices such as GPUs, where high performance is tied to application state being closely coupled to the hardware.
In the case of a non-SR-IOV-based GPU, a single driver stack may execute on the hypervisor, and the VMs may be multiplexed on top of the single driver stack. This may allow for interrupt delivery to the hypervisor; however, performance may be reduced because applications executing in the VMs are no longer closely coupled to the GPU hardware.
Performance may be improved by executing a GPU driver stack in each VM; however, for non-SR-IOV-based GPU hardware, multiplexing the hardware for interrupt delivery may prove to be a challenge. One approach to address the challenge may be to service GPU interrupts in the hypervisor and, in turn, steer virtual interrupts generated at the hypervisor to the VMs. However, without the contextual information normally held within the GPU driver stack, the hypervisor may be unable to determine which VM should receive a given virtual interrupt.
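The steering problem described above may be illustrated with a minimal sketch. It is a toy model under stated assumptions, not an implementation of any particular hypervisor or GPU: the names `VirtualMachine`, `Hypervisor`, `record_context_owner`, `handle_gpu_interrupt`, and the notion of a numeric GPU context identifier are all hypothetical and introduced only for illustration. The sketch shows that a hypervisor servicing a GPU interrupt may be able to steer a virtual interrupt only when it holds some mapping from GPU context to owning VM.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VirtualMachine:
    """Hypothetical VM model that only records injected virtual interrupts."""
    name: str
    pending_virtual_interrupts: List[int] = field(default_factory=list)

    def inject_virtual_interrupt(self, vector: int) -> None:
        self.pending_virtual_interrupts.append(vector)


class Hypervisor:
    """Hypothetical hypervisor that services GPU interrupts on behalf of VMs."""

    def __init__(self, vms: List[VirtualMachine]) -> None:
        self.vms: Dict[str, VirtualMachine] = {vm.name: vm for vm in vms}
        # Assumption: contextual information (GPU context id -> owning VM).
        # Per the text, this normally lives in the GPU driver stack, so the
        # table starts empty here.
        self.context_owner: Dict[int, str] = {}

    def record_context_owner(self, gpu_context_id: int, vm_name: str) -> None:
        """Make the ownership of a GPU context known to the hypervisor."""
        if vm_name not in self.vms:
            raise KeyError(f"unknown VM: {vm_name}")
        self.context_owner[gpu_context_id] = vm_name

    def handle_gpu_interrupt(self, vector: int, gpu_context_id: int) -> Optional[str]:
        """Service a physical GPU interrupt and try to steer a virtual one.

        Returns the name of the VM that received the virtual interrupt, or
        None when the hypervisor lacks the context needed to pick a VM.
        """
        owner = self.context_owner.get(gpu_context_id)
        if owner is None:
            return None  # no contextual information: steering is not possible
        self.vms[owner].inject_virtual_interrupt(vector)
        return owner
```

In this sketch, a call such as `handle_gpu_interrupt(32, gpu_context_id=7)` returns `None` until `record_context_owner(7, ...)` has been called, which mirrors the difficulty noted above when the contextual information remains inside the per-VM driver stacks.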