1. Field of the Invention
Embodiments of the present invention relate generally to compiler programs and more specifically to an application program that is written for execution by a multi-core graphics processor and retargeted for execution by a general purpose processor with shared memory.
2. Description of the Related Art
Modern graphics processing systems typically include a multi-core graphics processing unit (GPU) configured to execute applications in a multi-threaded manner. The graphics processing systems also include memory with portions that shared between the execution threads and dedicated to each thread.
NVIDIA's CUDA™ (Compute Unified Device Architecture) technology provides a C language environment that enables programmers and developers to write software applications to solve complex computational problems such as video and audio encoding, modeling for oil and gas exploration, and medical imaging. The applications are configured for parallel execution by a multi-core GPU and typically rely on specific features of the multi-core GPU. Since the same specific features are not available in a general purpose central processing unit (CPU), a software application written using CUDA may not be portable to run on a general purpose CPU.
As the foregoing illustrates, what is needed in the art is a technique for enabling application programs written using a parallel programming model for execution on multi-core GPUs to run on general purpose CPUs without requiring the programmer to modify the application program.