1. Field of the Invention
The present invention relates generally to the field of computer processing and more specifically to a system and method for enabling interoperability between application programming interfaces (APIs).
2. Description of the Related Art
A typical computing system includes a host, such as a central processing unit (CPU), and a compute device, such as a graphics processing unit (GPU). Some compute devices are capable of very high performance using a relatively large number of small, parallel execution threads on dedicated programmable hardware processing units. The specialized design of such compute devices allows them to perform certain tasks, such as rendering 3-D scenes and performing tessellation, much faster than a host. However, this specialized design also limits the types of tasks that the compute devices can perform. The host is typically a more general-purpose processing unit and therefore can perform most tasks. Consequently, the host usually executes the overall structure of software application programs and configures the compute device to perform specific data-parallel, compute-intensive tasks.
To fully realize the processing capabilities of advanced compute devices, compute device functionality may be exposed to application developers through one or more application programming interfaces (APIs), each consisting of calls and libraries. Among other things, doing so enables application developers to tailor their application programs to optimize the way compute devices function. Typically, each API is designed to expose a particular set of hardware features and is suitable for a specific set of problems. For example, in some compute devices, a graphics API enables application developers to tailor their application programs to optimize the way those compute devices process graphics scenes and images. Similarly, in some compute devices, a compute API enables application developers to tailor their application programs to optimize the way those compute devices execute operations of high arithmetic intensity on many data elements in parallel. Some application programs include algorithms that are most efficiently implemented by using a graphics API to perform some tasks and a compute API to perform other tasks.
In one approach to developing such an application program, the application developer implements a computation algorithm using the compute API and implements subsequent graphics operations that utilize the output of the computation algorithm using the graphics API. To allow the graphics API to consume the data written via the compute API, the application developer copies the data from the memory associated with the compute API to the host memory. The application developer then submits this data via the graphics API, thereby copying the data from the host memory into graphics objects associated with the graphics API. One drawback to this approach is that the application program allocates three buffers (one in the memory associated with the compute API, one in the host memory, and one in the graphics objects) and makes two copies of the data that is accessed by both the compute API and the graphics API. Allocating and copying buffers in this fashion may reduce the speed with which the host and compute device execute the application program and, consequently, may hinder overall system performance.
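The staged transfer described above can be illustrated with a minimal sketch. The sketch is a simulation only: plain Python byte arrays stand in for the three buffers, and the names run_compute and copy_via_host are hypothetical and do not belong to any actual compute API or graphics API.

```python
# Illustrative simulation only. Plain bytearrays stand in for device and host
# memory; run_compute and copy_via_host are hypothetical names, not real API calls.

def run_compute(n):
    """Stand-in for a compute-API algorithm that writes its output to compute memory."""
    compute_buffer = bytearray(n)            # buffer 1: memory associated with the compute API
    for i in range(n):
        compute_buffer[i] = (i * i) % 256    # placeholder computation
    return compute_buffer


def copy_via_host(compute_buffer):
    """Move data from compute memory to a graphics object by staging it in host memory."""
    copies = 0
    host_buffer = bytearray(compute_buffer)  # buffer 2: host memory (first copy)
    copies += 1
    graphics_buffer = bytearray(host_buffer) # buffer 3: graphics object (second copy)
    copies += 1
    return graphics_buffer, copies


compute_buffer = run_compute(8)
graphics_buffer, copies = copy_via_host(compute_buffer)
print("buffers allocated: 3, copies made:", copies)
```

In this sketch the graphics buffer ends up holding the same bytes as the compute buffer, but only after two full copies and three separate allocations, which is the overhead the preceding paragraph identifies.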
As the foregoing illustrates, what is needed in the art is a more efficient and flexible technique for enabling APIs to interoperate.