1. Field of the Invention
One or more aspects of the present invention relate generally to instruction sets, and more particularly to a unified instruction set for vertex, fragment, or geometry programs.
2. Description of the Related Art
Over the past decade, the cost of adding on-chip logic to processors has substantially decreased. Consequently, certain types of processors, such as advanced graphics processing units (GPUs), now include functionality that was not available in earlier GPU designs. For example, the newest GPUs are now able to perform geometry processing operations, which traditionally had been left to the central processing unit (CPU). One benefit of this shift in responsibilities is that more graphics processing may now be performed on the GPU instead of the CPU, thereby reducing performance bottlenecks in the graphics pipeline.
To fully realize the additional processing capabilities of advanced GPUs, as much GPU functionality as possible needs to be exposed to graphics application developers. Among other things, doing so enables graphics application developers to tailor their shader programs to optimize the way GPUs process graphics scenes and images. Exposing new GPU processing capabilities, such as geometry processing, to graphics application developers requires that the application programming interface (API) be configured with new calls and libraries that make the new features and functionality directly accessible to developers.
Some graphics APIs expose an interface to graphics application developers that consists of a set of calls written in a high-level programming language. To access the API, graphics application developers have to write their shader programs in the same high-level programming language or have their program code translated into that language. One drawback of this approach is that shader programs written in or translated into the high-level programming language of the API must first be compiled within the API layer into microcode that can then be executed on the GPU. This compilation is typically performed by the CPU while the application is running, and the processing overhead it requires can reduce the application's frame rate. When the compilation is instead performed off-line, the shader program is compiled into microcode for a specific GPU, which limits a user's ability to execute that microcode on another GPU. Another drawback is that the set of calls to which graphics application developers have access may not reflect the full functionality of the GPU. In a sense, developers are held hostage to the whims of the API architect. For example, if the API architect chooses not to write an API call that exposes one of the salient features of the GPU, then the graphics application developer has no way to access that GPU feature.
FIG. 1 is a conceptual diagram illustrating the relationships between instruction set architectures, shader programs, microcode assemblers, and processing units in a prior art system. A conventional graphics processor 150 includes a vertex processing unit 155 and a fragment processing unit 160. The vertex processing unit 155 is configured to execute compiled vertex shader programs and the fragment processing unit 160 is configured to execute compiled fragment shader programs. A vertex shader program 115 is constructed using program instructions from a vertex instruction set architecture (ISA) 105. Likewise, a fragment shader program 120 is constructed using program instructions from a fragment ISA 110.
Program instructions included in fragment ISA 110 are designed for execution in the fragment domain and generally may not be executed in the vertex domain. Likewise, program instructions included in vertex ISA 105 are intended for execution in the vertex domain and generally may not be executed in the fragment domain. Due to these differences between fragment ISA 110 and vertex ISA 105, application developers cannot be assured that code developed using an ISA for one processing domain can be used in a different processing domain without substantial modifications. Therefore, dedicated microcode assemblers are used to translate the shader programs for each domain. Specifically, a GPU vertex microcode assembler 125 compiles vertex shader program 115 into microcode for execution by vertex processing unit 155. Similarly, a GPU fragment microcode assembler 130 compiles fragment shader program 120 into microcode for execution by fragment processing unit 160. Upon execution of the microcode, graphics processor 150 outputs processed graphics data 170.
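The per-domain arrangement of FIG. 1 can be pictured with a minimal sketch. The Python below is illustrative only: the opcode names (`MOV`, `DP4`, `TEX`, `KIL`), their numeric encodings, and the function names are hypothetical assumptions and are not taken from vertex ISA 105, fragment ISA 110, or any actual microcode assembler.

```python
# Hypothetical per-domain instruction sets. Opcode names and encodings are
# illustrative assumptions, not the contents of any real vertex or fragment ISA.
VERTEX_ISA = {"MOV": 0x01, "DP4": 0x02}
FRAGMENT_ISA = {"MOV": 0x11, "TEX": 0x12, "KIL": 0x13}


def assemble(program, isa, domain):
    """Translate a list of opcode names into a list of microcode words."""
    microcode = []
    for opcode in program:
        if opcode not in isa:
            # An instruction from the other domain cannot be translated here.
            raise ValueError(f"{opcode} is not available in the {domain} domain")
        microcode.append(isa[opcode])
    return microcode


def assemble_vertex(program):
    # Stand-in for a dedicated vertex microcode assembler.
    return assemble(program, VERTEX_ISA, "vertex")


def assemble_fragment(program):
    # Stand-in for a dedicated fragment microcode assembler.
    return assemble(program, FRAGMENT_ISA, "fragment")
```

Under these assumptions, a program containing the hypothetical `TEX` opcode is accepted by `assemble_fragment` but rejected by `assemble_vertex`, which mirrors the lack of portability between the two domains described above.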
As the processing capabilities of graphics processor 150 evolve, instructions are added to vertex ISA 105 and fragment ISA 110 as needed to expose the new processing capabilities. Processing capabilities that are available to both vertex and fragment shaders must be added to both vertex ISA 105 and fragment ISA 110. Additionally, both GPU vertex microcode assembler 125 and GPU fragment microcode assembler 130 must be updated to translate any new instructions into microcode.
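The duplicated maintenance effort can be sketched as follows. This is a hedged illustration, not a description of any actual toolchain: the opcode `SIN`, its encodings, and the helper names `make_isas`, `add_instruction`, and `domains_supporting` are all hypothetical.

```python
def make_isas():
    """Return hypothetical, separately maintained opcode tables, one per domain."""
    return {
        "vertex": {"MOV": 0x01},
        "fragment": {"MOV": 0x11},
    }


def add_instruction(isas, domain, name, encoding):
    """Add one instruction to a single domain's table only."""
    isas[domain][name] = encoding


def domains_supporting(name, isas):
    """List, in sorted order, the domains whose table contains the instruction."""
    return sorted(domain for domain, table in isas.items() if name in table)


isas = make_isas()
# A new capability shared by both shader types still needs two separate updates.
add_instruction(isas, "vertex", "SIN", 0x03)
after_first_update = domains_supporting("SIN", isas)
add_instruction(isas, "fragment", "SIN", 0x14)
after_second_update = domains_supporting("SIN", isas)
```

In this sketch, the hypothetical `SIN` instruction is visible only to the vertex domain after the first update and becomes available to the fragment domain only after a second, separate update.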
As the foregoing illustrates, what is needed in the art is an application programming interface that exposes new processing capabilities of GPUs, while requiring minimal changes to the programming architecture.