1. Field of the Invention
The present invention relates generally to heterogeneous computer systems.
2. Background Art
Computers and other such data processing devices have at least one control processor that is generally known as a central processing unit (CPU). Such computers and processing devices can also have other processors such as graphics processing units (GPU) that are used for specialized processing of various types. For example, GPUs are designed to be particularly suited for graphics processing operations. GPUs generally comprise multiple processing elements that are ideally suited for executing the same instruction in parallel on different data streams, such as in data-parallel processing. A GPU can comprise, for example, a graphics processor unit, a graphics processor, a graphics processing core, a graphics processing device, or the like. In general, a CPU functions as the host or controlling processor and hands-off specialized functions such as graphics processing to other processors such as GPUs.
With the availability of multi-core CPUs where each CPU has multiple processing cores, substantial processing capabilities that can also be used for specialized functions are available in CPUs. One or more of the computation cores of multi-core CPUs and GPUs can be part of the same die (e.g., AMD Fusion™) or on different dies (e.g., Intel Xeon™ with NVIDIA GPU). Recently, programming systems (e.g., OpenCL) have been introduced for General Purpose GPU (GPGPU) style computing to execute non-graphics applications on GPUs. The GPGPU style of computing advocates using the CPU to primarily execute control code and to offload performance critical data-parallel code to the GPU. The GPU is primarily used as an accelerator. However, some GPGPU programming systems allow the use of both CPU cores and GPU cores as accelerator targets.
Several frameworks have been developed for heterogeneous computing platforms that have CPUs and GPUs. These frameworks include BrookGPU by Stanford University, CUDA by NVIDIA, and OpenCL by an industry consortium named Khronos Group. The OpenCL framework offers a C-like development environment using which users can create applications for GPU. OpenCL enables the user, for example, to specify instructions for offloading some computations, such as data-parallel computations, to a GPU. OpenCL also provides a compiler and a runtime environment in which code can be compiled and executed within a heterogeneous computing system.
Frameworks such as OpenCL, at present, require the programmer to determine what parts of the application(s) are executed on the CPUs and the GPUs of the heterogeneous system. Determining this split, however, is not trivial as GPUs and CPUs spanning a broad spectrum of performance characteristics are available on the market and can be mixed and matched in a given system. In addition, the available resources on the system at runtime may vary depending on other applications executing on the same system. Therefore, application programmers are faced with implementing elaborate, complex, dynamic schemes for allocating multiple kernels to processors within their applications or settling for sub-optimal performance.