This disclosure relates generally to the field of computer programming. More particularly, but not by way of limitation, it relates to techniques for programming graphics and general-purpose parallel computational applications that can be compiled into a common intermediate representation that can be further compiled to execute on a variety of graphical and computational processors.
Computers and other computational devices typically have at least one programmable processing element that is generally known as a central processing unit (CPU). They frequently also have other programmable processors that are used for specialized processing of various types, such as graphics processing operations, and hence are typically called graphics processing units (GPUs). GPUs generally comprise multiple cores or processing elements designed for executing the same instruction on parallel data streams, making them more effective than general-purpose CPUs for algorithms in which processing of large blocks of data is done in parallel. In general, a CPU functions as the host and hands off specialized parallel tasks to the GPUs.
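By way of illustration only, the host and kernel roles described above can be sketched in a few lines of Python. The names `scale_and_add` and `dispatch` are hypothetical and do not belong to any framework, and the thread pool is merely a stand-in for a GPU's processing elements, which would typically execute many invocations in lockstep.

```python
from concurrent.futures import ThreadPoolExecutor


def scale_and_add(i, a, x, y, out):
    """Kernel body: every invocation runs the same instructions on element i."""
    out[i] = a * x[i] + y[i]


def dispatch(kernel, count, *args, workers=4):
    """Host side: launch one kernel invocation per data element.

    Assumption: a thread pool stands in for the GPU's processing elements.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all invocations and surfaces any worker exception.
        list(pool.map(lambda i: kernel(i, *args), range(count)))


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)

# The host hands the data-parallel task off to the "device".
dispatch(scale_and_add, len(x), 2.0, x, y, out)
```

Each invocation writes only its own element of `out`, so the invocations are independent of one another; that independence is what makes such a task suitable for parallel hardware.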
Although GPUs were originally developed for rendering graphics and remain heavily used for that purpose, current GPUs support a programming paradigm that allows using the GPUs as general-purpose parallel processing units in addition to being used as graphics processors. This paradigm allows implementation of algorithms unrelated to rendering graphics by giving access to GPU computing hardware in a more generic, non-graphics-oriented way.
Several frameworks have been developed for heterogeneous computing platforms that have CPUs and GPUs. These frameworks include the Metal framework from Apple Inc., although other frameworks are in use in the industry. Some frameworks focus on using the GPU for general computing tasks, allowing any application to use the GPU's parallel processing functionality for more than graphics applications. Other frameworks focus on using the GPU for graphics processing and provide APIs for rendering two-dimensional (2D) and three-dimensional (3D) graphics. The Metal framework supports GPU-accelerated advanced 3D graphics rendering and data-parallel computation workloads.
Metal and other frameworks offer a C-like development environment in which users can create applications to run on various types of CPUs, GPUs, digital signal processors (DSPs), and other processors. Some frameworks also provide a compiler and a runtime environment in which code can be compiled and executed within a heterogeneous computing system. When using some frameworks, developers can use a single, unified language to target all of the processors currently in use. This is done by presenting the developer with an abstract platform model and application programming interface (API) that conceptualizes all of these architectures in a similar way, as well as an execution model supporting data and task parallelism across heterogeneous architectures. Metal has a corresponding shading language to describe both graphics shader and compute functions, which can be compiled during build time and then loaded at runtime. Metal also supports runtime compilation of Metal shading language code.
Tasks may be offloaded from a host (e.g., CPU) to any available GPU in the computer system. Using Metal or other frameworks, programmers can write programs that will run on any GPU for which a vendor has provided corresponding framework-specific drivers. When a Metal or other framework program is executed, a series of API calls configures the system for execution, an embedded compiler compiles the Metal or other framework code, and the runtime asynchronously coordinates execution between parallel kernels and shaders. Applications may use functionality of multiple frameworks, sharing data between the framework-specific portions of the application. Because the developer may not know the actual CPU or GPU that will execute the application, the developer cannot compile the source code into a pure binary for the end-user device.
A typical framework-based system takes source code and runs it through an embedded compiler on the end-user system to generate executable code for a target GPU available on that system. Then, the executable code, or portions of the executable code, are sent to the target GPU and are executed. However, developers would prefer not to ship their shaders and kernels as source code, and would instead prefer to compile their shaders and kernels offline. In addition, each source code language requires an embedded compiler on the end-user device that can compile that source code language. Therefore, there is a need in the art for an approach for providing software to a source code language-independent runtime environment without exposing the source code used to generate the shader or kernel code.
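The conventional flow described above, and the two drawbacks it carries, can be modeled with a minimal Python sketch. Everything in it is hypothetical and for illustration only: `EndUserDevice`, `toy_compiler`, and the labels `gpu-A`, `lang-1`, and `lang-2` are invented names, and the "compiler" merely tags its input rather than generating code.

```python
from dataclasses import dataclass, field


def toy_compiler(source, gpu):
    """Placeholder for a real compiler: tags the source with its target GPU."""
    return f"<{gpu} binary of {len(source)} source bytes>"


@dataclass
class EndUserDevice:
    gpu: str
    # Assumption: one embedded compiler must be installed per source language.
    embedded_compilers: dict = field(default_factory=dict)

    def run(self, language, source):
        compiler = self.embedded_compilers.get(language)
        if compiler is None:
            raise LookupError(f"no embedded compiler for {language!r}")
        # Compilation happens on the end-user device, for its local GPU.
        binary = compiler(source, self.gpu)
        return f"executed {binary}"


# Drawback 1: the application must ship its shader or kernel source as text.
shipped_source = "kernel void add(...) { ... }"

device = EndUserDevice(gpu="gpu-A", embedded_compilers={"lang-1": toy_compiler})
result = device.run("lang-1", shipped_source)

# Drawback 2: a second language fails unless its own compiler is also embedded.
# device.run("lang-2", shipped_source) would raise LookupError here.
```

In this model the device can only run a program after it has seen the source text and only if it holds a compiler for that particular language, which mirrors the two limitations identified above.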