1. Field of the Invention
The present invention generally relates to computer graphics and more particularly to a method and system for connecting multiple shaders.
2. Description of the Related Art
Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
Over the past decade, graphics hardware has gone from a simple memory device to a configurable device and, relatively recently, to a programmable graphics processing unit (GPU). To fully realize the parallel processing capabilities of a GPU, as much GPU functionality as possible needs to be programmable and exposed to developers. Among other things, doing so enables developers to tailor their shader programs to optimize the way a GPU processes graphics scenes and images. In a prior art approach, a GPU includes a series of processing units, each of which is configured to carry out a different and often dedicated function of a graphics pipeline, where the output of one processing unit is the input to the next processing unit in the chain. Some of these processing units in the graphics pipeline are programmable, such as a vertex processing unit and a fragment processing unit, but other processing units perform fixed functions, such as a primitive assembler, a geometry processor, and a rasterizer.
The aforementioned prior art approach has some shortcomings. First, without full programmability, the graphics pipeline is unable to efficiently respond to changes in Application Programming Interfaces (APIs), such as OpenGL and DirectX, or address any bugs identified in the pipeline. Second, because many functions of the graphics pipeline and the sequence of performing such functions are fixed, a graphics application utilizing the graphics pipeline does not have the full flexibility to maneuver various shader programs, such as invoking shader programs in a different sequence than the sequence of the pipeline stages (e.g., invoking a geometry shader ahead of a vertex shader) or repeating a particular shader program multiple times (e.g., invoking a vertex shader six times). Even with workaround approaches capable of emulating the maneuvering of various shader programs on the prior art system, these approaches are cumbersome to implement and inefficient to operate. For example, one workaround approach is to configure a graphics pipeline to execute a particular shader program, stream the output of the shader program into a frame buffer, reconfigure the graphics pipeline to execute another shader program, re-inject the stored data from the frame buffer back into the reconfigured pipeline for processing, and repeat these steps until all the shader programs are processed in a specific sequence. The repeated configurations of the graphics pipeline and the accesses of the frame buffer consume significant processing and memory resources and introduce undesirable delays. Another workaround approach involves merging the multiple shader programs and recompiling the merged program to generate a single all-encompassing shader program for the graphics pipeline to process. However, this approach is inefficient, because if any of the shader programs or the sequence of executing the shader programs needs to be altered, then these extra steps of merging and compiling also need to be repeated.
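By way of illustration only, the first workaround described above (configure, stream out to a frame buffer, reconfigure, re-inject) may be sketched as the following simplified model. The names FixedPipeline and run_multipass, and the representation of a shader program as a plain function over a list of values, are hypothetical and are not taken from any actual GPU or API. The sketch merely counts how often the pipeline is reconfigured and how often the frame buffer is accessed.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Assumption: a "shader program" is modeled as a function over a list of values.
Shader = Callable[[List[float]], List[float]]


class FixedPipeline:
    """Hypothetical pipeline that can hold only one shader per configuration."""

    def __init__(self) -> None:
        self.shader: Shader = lambda values: values
        self.reconfigurations = 0

    def configure(self, shader: Shader) -> None:
        # Every change of shader program costs one pipeline reconfiguration.
        self.shader = shader
        self.reconfigurations += 1

    def run(self, values: List[float]) -> List[float]:
        return self.shader(values)


def run_multipass(
    shaders: Sequence[Shader], data: List[float]
) -> Tuple[List[float], Dict[str, int]]:
    """Run shaders in order, round-tripping through a frame buffer each pass."""
    pipeline = FixedPipeline()
    frame_buffer: List[float] = []
    writes = 0
    reads = 0
    current = list(data)
    for index, shader in enumerate(shaders):
        pipeline.configure(shader)
        if index > 0:
            # Re-inject the stored data from the frame buffer into the pipeline.
            current = list(frame_buffer)
            reads += 1
        # Stream the shader output into the frame buffer.
        frame_buffer = pipeline.run(current)
        writes += 1
    stats = {
        "reconfigurations": pipeline.reconfigurations,
        "frame_buffer_writes": writes,
        "frame_buffer_reads": reads,
    }
    return frame_buffer, stats


def double(values: List[float]) -> List[float]:
    return [2.0 * v for v in values]


def add_one(values: List[float]) -> List[float]:
    return [v + 1.0 for v in values]


if __name__ == "__main__":
    result, stats = run_multipass([double, add_one, double], [1.0, 2.0])
    print(result, stats)
```

In this model, a sequence of N shader programs incurs N reconfigurations, N frame buffer writes, and N-1 frame buffer reads, which is the overhead the foregoing paragraph attributes to this workaround.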
Lastly, the prior art approach does not support a mechanism that reconciles different input and output requirements of multiple shader programs. To illustrate, suppose a first shader program to be executed by a prior art GPU requests 40 outputs, but a second shader program, coupled to the first shader program, requests only 6 inputs. In other words, the second shader program is designed to read only 6 of the 40 outputs from the first shader program. Without considering the requirements of the second shader program, the GPU still allocates resources for all 40 outputs of the first shader program.
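The mismatch in the foregoing example can be expressed as simple arithmetic. The following sketch is illustrative only; the function name allocation_report and its fields are hypothetical, and the sketch assumes that each output occupies one equally sized storage slot, which the example above does not state.

```python
from typing import Dict, Union


def allocation_report(
    declared_outputs: int, consumed_inputs: int
) -> Dict[str, Union[int, float]]:
    """Compare the outputs one shader declares with the inputs the next one reads.

    Assumption: one output equals one equally sized storage slot.
    """
    if declared_outputs <= 0:
        raise ValueError("declared_outputs must be positive")
    if not 0 <= consumed_inputs <= declared_outputs:
        raise ValueError("consumed_inputs must be between 0 and declared_outputs")
    unused = declared_outputs - consumed_inputs
    return {
        "allocated": declared_outputs,  # slots reserved without regard to the consumer
        "consumed": consumed_inputs,    # slots the second shader actually reads
        "unused": unused,               # slots allocated but never read
        "unused_fraction": unused / declared_outputs,
    }


if __name__ == "__main__":
    # The figures from the example: 40 declared outputs, 6 consumed inputs.
    print(allocation_report(40, 6))
```

With the figures from the example, 34 of the 40 allocated outputs are never read by the second shader program.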
As the foregoing illustrates, what is needed in the art is a method and system for supporting a user-configurable graphics pipeline capable of efficiently managing storage of inputs and outputs between shader programs.