Various types of special-purpose processors, such as graphics processing units (GPUs) for general-purpose computing, have been developed to accelerate the processing of specific types of workloads. A GPU has a massively parallel architecture which typically comprises hundreds or thousands of cores that are configured to concurrently execute hundreds or thousands of threads. This is in contrast to a standard central processing unit (CPU) architecture, which typically comprises a few cores and associated cache memory that are optimized for sequential processing and for handling a few software threads at a given time.
The processing capabilities of GPU resources are currently being utilized in various applications to accelerate the processing of highly parallelized computational workloads in various technical fields. In particular, general-purpose computing on GPU (GPGPU) is utilized for high-throughput, accelerated processing of compute kernels for workloads (e.g., vector-based computations, matrix-based computations, etc.) that exhibit data-parallelism. For example, GPUs are used to accelerate data processing in high-performance computing (HPC) and embedded computing systems, for various applications such as financial modeling, scientific research, machine learning, data mining, video data transcoding, image analysis, image recognition, virus pattern matching, augmented reality, encryption/decryption, weather forecasting, big data comparisons, and other applications with computational workloads that have an inherently parallel nature. Due to the high throughput and low energy consumption per operation exhibited by GPUs, it is anticipated that GPU-as-a-Service (GPUaaS) will become mainstream in the near future, wherein cloud-based systems will implement GPU-powered blades for various types of processing.
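By way of illustration only, the data-parallelism referred to above can be sketched with a vector-based computation in which each output element depends solely on its own index. The following is a minimal CPU-side sketch in Python, not GPU code; the names `saxpy_kernel`, `saxpy_serial`, and `saxpy_parallel`, as well as the `workers` parameter, are hypothetical and chosen for this example. A thread pool stands in for the many GPU threads that would each execute the kernel for one element.

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy_kernel(i, a, x, y):
    """Compute one output element; the result depends only on index i."""
    return a * x[i] + y[i]


def saxpy_serial(a, x, y):
    """Sequential (CPU-style) evaluation: one element after another."""
    return [saxpy_kernel(i, a, x, y) for i in range(len(x))]


def saxpy_parallel(a, x, y, workers=4):
    """Data-parallel evaluation: the same kernel is applied to every index.

    Assumption for illustration: a small thread pool stands in for the
    hundreds or thousands of threads a GPU would run concurrently.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in index order.
        return list(pool.map(lambda i: saxpy_kernel(i, a, x, y), range(len(x))))


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    y = [10.0, 20.0, 30.0, 40.0]
    print(saxpy_parallel(2.0, x, y))  # [12.0, 24.0, 36.0, 48.0]
```

Because no element depends on any other, the serial and parallel evaluations produce the same result, which is the property that allows such workloads to be distributed across a large number of GPU cores.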