1. Field of the Invention
Embodiments of the present invention relate generally to computer graphics and more specifically to a system and method for representing and managing a multi-architecture co-processor application program.
2. Description of the Related Art
Modern computer systems typically include a central processing unit (CPU) and at least one co-processing unit, such as a graphics processing unit (GPU). The CPU executes instructions associated with software modules, including, without limitation, an operating system and drivers that control and manage the operation of the GPU. The CPU and GPU may cooperatively execute a co-processor enabled application program, which includes a first set of instructions executed by the CPU and a second set of instructions executed by the GPU.
Early generations of GPU architectures provide limited programmability, which is predominantly directed to executing functions for graphics shading and rendering. Source code for these functions is conventionally stored and managed by the first set of instructions, which executes on the CPU and is associated with the co-processor enabled application program. The co-processor enabled application program submits the source code to a GPU driver executing on the CPU. The GPU driver is configured to compile and link the source code into GPU-specific program fragments for execution on an attached GPU, using a just-in-time (JIT) regime. Because the GPU driver targets the currently attached GPU in each new compilation of the source code, new GPU architectures are usually accommodated by a new GPU driver that is developed and distributed in conjunction with the new GPU devices. Importantly, this JIT compilation strategy enables existing application programs to operate transparently with new GPU devices, thereby preserving the investment of both the co-processor enabled application program developer and the co-processor enabled application program customer.
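The JIT regime described above can be sketched as follows. This is only an illustration under stated assumptions, not an actual driver interface: the `GpuDriver` class, its `jit_compile` method, the architecture names `arch_a` and `arch_b`, and the shader text are all hypothetical, and "compilation" is reduced to tagging the source with the target architecture.

```python
from dataclasses import dataclass


@dataclass
class GpuDriver:
    """Hypothetical stand-in for a GPU driver; not a real driver API."""

    attached_arch: str  # architecture of the currently attached GPU

    def jit_compile(self, source: str) -> str:
        # Stand-in for compile-and-link: tag the source with the target
        # architecture to represent a GPU-specific program fragment.
        return f"[{self.attached_arch}] {source}"


# The application stores and manages only source code (hypothetical text).
SHADER_SOURCE = "shade(pixel)"

# The same unmodified application source, compiled by two driver versions
# that target different attached GPUs.
old_fragment = GpuDriver("arch_a").jit_compile(SHADER_SOURCE)
new_fragment = GpuDriver("arch_b").jit_compile(SHADER_SOURCE)
```

Because the application carries only source, the sketch needs no application change to target `arch_b`; only the driver differs.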
Recent generations of GPU devices have increased computational throughput, programmability and storage capacity relative to previous generations of devices. With these increased capabilities, GPUs are being used to execute substantially larger, more complex functions within co-processor enabled application programs. These larger functions frequently require long compile times that are inappropriate for JIT compilation. With a long compile time, for example, users may experience an unacceptably protracted start-up time when launching a co-processor enabled application program.
One approach to avoiding long compilation times is to incorporate pre-compiled GPU machine code within the co-processor enabled application program. In this approach, pre-compiled GPU code fragments may be incorporated into the application program as a code bundle that covers every GPU known at compile time. However, as new GPU generations become available, such a pre-compiled code bundle is likely to encounter new GPU devices and underlying architectures that were not anticipated at compile time. Thus, this approach does not provide forward compatibility for co-processor enabled application programs and, more importantly, does not preserve developer or customer investment in these types of applications.
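The forward-compatibility problem of a pre-compiled code bundle can be illustrated with a short sketch. The names here (`build_bundle`, `load_fragment`, `UnsupportedArchitecture`, and the architecture identifiers) are hypothetical and chosen only for illustration; real machine code is replaced by placeholder strings.

```python
class UnsupportedArchitecture(Exception):
    """Raised when the bundle holds no code for the attached GPU."""


def build_bundle(known_archs):
    # At application compile time: one pre-compiled fragment per GPU
    # architecture known at that moment (placeholder strings here).
    return {arch: f"machine-code-for-{arch}" for arch in known_archs}


def load_fragment(bundle, attached_arch):
    # At start-up: a fast lookup instead of a long JIT compilation.
    try:
        return bundle[attached_arch]
    except KeyError:
        raise UnsupportedArchitecture(attached_arch) from None


# Bundle built when only two (hypothetical) architectures existed.
BUNDLE = build_bundle(["arch_a", "arch_b"])
```

Loading for `arch_a` succeeds immediately, whereas a later device reporting an architecture such as `arch_c` raises `UnsupportedArchitecture`, since no fragment for it could have been included when the bundle was built.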
As the foregoing illustrates, what is needed in the art is a technique for providing fast application program start-up as well as forward GPU compatibility for co-processor enabled application programs.