Visually intensive computer graphics applications, such as three-dimensional (3D) computer games, flight simulators and other 3D imaging applications, may involve user interaction, scene management and rendering, physics modeling, artificial intelligence and other relatively complex functions. While certain game applications may leverage the capabilities of a local graphics processing unit (GPU) by offloading graphical and non-graphical computation to the GPU in order to maintain interactive frame rates, there remains considerable room for improvement.
For example, conventional approaches may send 3D workloads to a 3D pipeline of the GPU, wherein the 3D workloads perform functions such as rendering 3D images and scenes using processing functions that act upon 3D primitive shapes (e.g., rectangles, triangles). Other computational workloads, however, may be sent to the GPU as a general purpose GPU (GPGPU) workload (e.g., a compute kernel), wherein the compute kernel may facilitate native execution of the GPGPU workload on processing units of the GPU. Native execution of the GPGPU workload may result in relatively low efficiency from both a power and a performance perspective.
More particularly, efficiency may be negatively impacted by overhead associated with switching between the 3D and GPGPU pipelines, as well as by hardware restrictions such as the memory hierarchies supporting the workload execution, cache flushes, execution lane (e.g., single instruction multiple data/SIMD) restrictions on resource accesses (e.g., typed/untyped unordered access views/UAVs, shader resource views, raw access), and so forth.
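By way of illustration only, the pipeline switching overhead described above may be approximated with a toy cost model. The following sketch is written under stated assumptions: the workload names, the per-workload execution times and the fixed per-switch cost (SWITCH_COST_US) are hypothetical values chosen for the example, not measurements of any GPU, and the model ignores the other restrictions noted above (e.g., memory hierarchy and execution lane restrictions).

```python
from dataclasses import dataclass

# Assumed, illustrative cost of one switch between the 3D and GPGPU
# pipelines (e.g., including a cache flush). Not a measured value.
SWITCH_COST_US = 50.0


@dataclass
class Workload:
    name: str         # hypothetical workload label
    pipeline: str     # "3D" or "GPGPU"
    exec_us: float    # assumed execution time in microseconds


def count_pipeline_switches(workloads):
    """Count adjacent workload pairs that target different pipelines."""
    return sum(
        1
        for prev, cur in zip(workloads, workloads[1:])
        if prev.pipeline != cur.pipeline
    )


def total_time_us(workloads, switch_cost_us=SWITCH_COST_US):
    """Execution time plus a fixed penalty for each pipeline switch."""
    exec_time = sum(w.exec_us for w in workloads)
    return exec_time + count_pipeline_switches(workloads) * switch_cost_us


# A hypothetical frame that interleaves 3D and GPGPU workloads.
frame = [
    Workload("shadow_pass", "3D", 400.0),
    Workload("physics", "GPGPU", 100.0),
    Workload("main_pass", "3D", 900.0),
    Workload("particles", "GPGPU", 100.0),
    Workload("post_fx", "3D", 300.0),
]

switches = count_pipeline_switches(frame)
overhead_us = switches * SWITCH_COST_US
print(f"switches={switches}, overhead_us={overhead_us}, "
      f"total_us={total_time_us(frame)}")
```

Under these assumed numbers, the interleaved frame incurs four pipeline switches, so that the switching penalty alone may account for 200 of 2000 microseconds. In practice the cost of a switch may vary with the hardware and the state that is flushed.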