As GPUs continue to evolve into high-performance parallel computing devices, more and more applications are written to perform data-parallel computations on GPUs, much as they would on general-purpose computing devices. Today, these applications are designed to run on specific GPUs using vendor-specific interfaces. Thus, these applications are not able to leverage the processing resources of CPUs even when both GPUs and CPUs are available in a data processing system. Nor can such an application leverage processing resources across GPUs from different vendors in the system on which it is running.
However, as more and more CPUs embrace multiple cores to perform data-parallel computations, more and more processing tasks can be supported by CPUs, GPUs, or both, whichever are available. Traditionally, GPUs and CPUs are configured through separate programming environments that are not compatible with each other. Most GPUs require dedicated programs that are vendor-specific. As a result, it is very difficult for an application to leverage the processing resources of both CPUs and GPUs, for example, the processing resources of GPUs with data-parallel computing capabilities together with those of multi-core CPUs.
In addition, CPUs and GPUs use separate memory address spaces. Before a GPU can process data, a memory buffer must be allocated in GPU memory and the data copied into it. If an application wants the CPU and one or more GPUs to operate on regions of a data buffer, the application must itself manage the allocation and copying of data for the appropriate regions of the buffer that is to be shared between the CPU and a GPU or across GPUs. Therefore, there is a need in modern data processing systems for a heterogeneous mix of CPUs and GPUs to share a buffer.
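The bookkeeping burden described above can be illustrated with a minimal sketch. It is a simulation only: no GPU interface is called, a separate `bytearray` stands in for GPU memory, and every name in it (`split_regions`, `run_on_device`, `run_on_host`, `process_shared`) is hypothetical and chosen for illustration. It shows the allocate, copy-in, compute, copy-out sequence that an application would have to repeat for each device region, while the CPU works on its own region in place.

```python
# Simulation only: "device memory" is a separate bytearray, not real GPU memory.
# All function names here are hypothetical and exist only for illustration.

def split_regions(length, parts):
    """Return (offset, size) pairs covering [0, length) in `parts` contiguous regions."""
    base, extra = divmod(length, parts)
    regions, offset = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # spread any remainder over the first regions
        regions.append((offset, size))
        offset += size
    return regions


def run_on_device(host, offset, size, kernel):
    """Mimic a device with its own address space: allocate, copy in, compute, copy back."""
    device = bytearray(size)                  # allocate in the simulated device memory
    device[:] = host[offset:offset + size]    # copy host -> device
    for i in range(size):                     # the kernel touches only the device copy
        device[i] = kernel(device[i])
    host[offset:offset + size] = device       # copy device -> host


def run_on_host(host, offset, size, kernel):
    """The CPU works on its region in place; no allocation or copy is needed."""
    for i in range(offset, offset + size):
        host[i] = kernel(host[i])


def process_shared(host, num_devices, kernel):
    """Give the CPU the first region and each simulated device one of the remaining regions."""
    regions = split_regions(len(host), num_devices + 1)
    run_on_host(host, *regions[0], kernel)
    for offset, size in regions[1:]:
        run_on_device(host, offset, size, kernel)
    return regions
```

For example, calling `process_shared(bytearray(range(10)), 2, lambda x: (x + 1) % 256)` splits a ten-byte buffer among the CPU and two simulated devices. Every offset, size, and copy in this sketch is tracked by the application itself, which is the burden the paragraph above describes.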