1. Field of the Invention
Embodiments of the present invention generally relate to computer programming using graphics hardware. More specifically, embodiments of the invention relate to techniques for feeding back and recording transformed vertices in a graphics library.
2. Description of the Related Art
Graphics processing units (GPUs) are configured to execute instructions that generate images that may then be displayed on a display device. GPUs typically implement a pipelined architecture, and the different processing units within the pipeline execute shader language programs on the streams of graphics data as that data passes through the different parts of the graphics pipeline. For example, common shader language programs include vertex and fragment programs, and new geometry programs have been added recently. The vertex processing unit executes a vertex shader program on data passing through the vertex shader portion of the graphics pipeline, the geometry processing unit executes a geometry shader program on data passing through the geometry shader portion of the graphics pipeline, and the fragment processing unit executes a fragment shader program on data passing through the fragment shader portion of the graphics pipeline. Generally, the output of the vertex shader is the input to the geometry shader, and the output of the geometry shader is the input to the fragment shader.
Over the past decade, the cost of adding on-chip logic to processors has substantially decreased. Consequently, certain types of processors, such as advanced GPUs, now include functionality not available in earlier GPU designs. For example, the newest GPUs are now able to perform geometry processing operations, whereas such operations traditionally had been left to the central processing unit (CPU).
Given the increased computing power available on advanced GPUs, graphics developers are using the graphics pipeline for more than just generating images for display on a display screen. For example, a vertex shader program operating on a set of input vertices may take each vertex, process it, and output a transformed set of vertices. Such a shader program could perform physics calculations on the positions of a set of vertices at an initial point in time and output a subsequent position for each vertex at a second point in time. Another example is in the field of molecular modeling. In such cases, a shader program may calculate a future position and a net charge of each atom in a molecule, based on a current position and charge of each atom. By repeating such a calculation millions of times, the shader program can calculate a theoretical steady-state configuration of the overall molecule. In such applications, the graphics pipeline is not limited to the conventional process of computing or determining color and intensity values for each pixel of a display.
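The per-vertex physics calculation described above can be illustrated with a minimal CPU-side sketch in Python. It is not shader code and is not taken from any particular implementation; the `Vertex` type, the function names, and the choice of an explicit Euler integration step are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vertex:
    """Hypothetical vertex record: a position and a velocity, each (x, y, z)."""
    position: tuple
    velocity: tuple


def step_vertex(vertex, acceleration, dt):
    """Advance one vertex by one time step (explicit Euler, an assumed scheme).

    The new position uses the velocity at the start of the step; the new
    velocity adds the acceleration applied over the step.
    """
    new_position = tuple(p + v * dt for p, v in zip(vertex.position, vertex.velocity))
    new_velocity = tuple(v + a * dt for v, a in zip(vertex.velocity, acceleration))
    return Vertex(new_position, new_velocity)


def transform_vertices(vertices, acceleration, dt):
    """Apply the same per-vertex step to every input vertex, as a vertex
    shader would, and return the transformed set."""
    return [step_vertex(v, acceleration, dt) for v in vertices]
```

Feeding the output of `transform_vertices` back in as the next input repeats the calculation for successive points in time, which is the usage pattern that makes access to the transformed vertices valuable.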
However, getting transformed vertex data out of the graphics rendering pipeline has proven to be somewhat difficult. One approach is to write the transformed data from the frame buffer to a buffer object once the data has been processed by the graphics pipeline and written to the frame buffer. Since the fragment shader is typically the only part of the graphics pipeline able to compute values written to the frame buffer, this approach requires that the transformed vertex data be passed completely through the pipeline, even though no downstream processing may be performed on the transformed vertex data. Moreover, the fragment shader is typically configured to write pixel data to the frame buffer. Therefore, if other data representations are desired, such as an array of vertex attributes, then the application developer has to map the transformed vertex data output from either the vertex shader or the geometry shader into a pixel format so that the data can then be written to the frame buffer. Once the transformed graphics data has been written into the frame buffer, the data has to be mapped back from a pixel format to the desired format (i.e., as an array of vertex attributes). Thus, this approach may require substantial overhead, especially if the pixel data needs to be passed back to the CPU to be reformatted into a vertex data representation, which is often the case.
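The mapping between vertex attributes and a pixel format described above can be sketched as follows. This is a hedged illustration only: it assumes each attribute is a 32-bit float stored in one four-channel, 8-bit-per-channel pixel, and the function names `vertices_to_pixels` and `pixels_to_vertices` are hypothetical. Actual frame buffer formats and packing schemes vary.

```python
import struct


def vertices_to_pixels(vertices):
    """Encode every float attribute as one RGBA pixel of four 8-bit channels.

    Assumption: attributes are stored as little-endian 32-bit floats, so the
    four bytes of each float become the (R, G, B, A) channel values.
    """
    pixels = []
    for vertex in vertices:
        for value in vertex:
            pixels.append(tuple(struct.pack("<f", value)))
    return pixels


def pixels_to_vertices(pixels, attrs_per_vertex):
    """Map pixel data back to an array of vertex attributes.

    This is the reformatting step that typically has to run on the CPU
    after the pixels are read back from the frame buffer.
    """
    values = [struct.unpack("<f", bytes(pixel))[0] for pixel in pixels]
    return [
        tuple(values[i:i + attrs_per_vertex])
        for i in range(0, len(values), attrs_per_vertex)
    ]
```

Even in this simplified form, every attribute is converted twice, once on the way into the frame buffer and once on the way out, which is the overhead the paragraph above describes.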
Another approach is to configure the graphics API to allow a graphics developer to insert tokens into a stream of data passed to the graphics pipeline to allow portions of the data to be written to a buffer object after being processed by a particular processing unit in the graphics pipeline. This approach is used by the OpenGL feedback mode and allows developers to insert tokens into the graphics rendering pipeline along with a point, line, or triangle. For example, a token may specify that the set of vertices following the token be written into a buffer as a triangle or other graphics primitive once the graphics processing unit has processed the vertices. Different tokens are typically provided for different graphics primitives, resulting in a buffer format of:
<token_triangle> <triangle data>, <token_point> <point data>
One drawback of this approach is that the results of the graphics processing pipeline are written to a CPU system memory buffer. The data written to the system memory buffer may include both the results of the graphics rendering pipeline and the tokens that were passed through the pipeline with the data. Thus, the graphics pipeline cannot directly process the results of the graphics rendering pipeline. More specifically, before the graphics pipeline can receive any of the transformed data stored in a buffer object using this approach, the transformed data must first be copied from the buffer object into system memory, parsed by the CPU, formatted into an appropriate form, and then passed back to the graphics pipeline for further processing. Thus, this approach incurs a substantial performance penalty.
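The CPU-side parsing step described above can be sketched as follows. The sketch mirrors the simplified buffer format shown earlier, in which each token is followed by the data for one primitive. The token names, the `parse_feedback` function, and the assumption of three values per vertex are illustrative placeholders rather than the actual encoding used by any graphics API.

```python
# Hypothetical tokens mirroring the simplified buffer format above.
TOKEN_POINT = "token_point"
TOKEN_TRIANGLE = "token_triangle"

# Number of vertices that follow each token (assumed for illustration).
VERTICES_PER_TOKEN = {TOKEN_POINT: 1, TOKEN_TRIANGLE: 3}


def parse_feedback(buffer, values_per_vertex=3):
    """Walk a token-delimited buffer and return (token, vertices) pairs.

    The buffer is a flat sequence: a token, then the vertex values for that
    primitive, then the next token, and so on.
    """
    primitives = []
    i = 0
    while i < len(buffer):
        token = buffer[i]
        if token not in VERTICES_PER_TOKEN:
            raise ValueError(f"unexpected token at index {i}: {token!r}")
        i += 1
        count = VERTICES_PER_TOKEN[token] * values_per_vertex
        data = buffer[i:i + count]
        if len(data) != count:
            raise ValueError("buffer ends in the middle of a primitive")
        vertices = [
            tuple(data[j:j + values_per_vertex])
            for j in range(0, count, values_per_vertex)
        ]
        primitives.append((token, vertices))
        i += count
    return primitives
```

Because tokens and vertex data are interleaved, the data cannot be consumed as a plain vertex array; a pass of this kind must separate the two before the vertices can be handed back to the graphics pipeline.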
In addition, some graphics APIs have allowed graphics developers to compose a shader program that enables certain attributes of transformed vertices to be written to a buffer object as part of the shader program. However, this approach directly ties writing certain vertex attributes to the buffer object to a particular shader program. If the graphics developer desires to change which attributes are written to the buffer object, then the shader program currently bound to the relevant processing unit must be unbound from that processing unit and a new shader program that includes instructions for writing the desired transformed vertex attributes to the buffer object must then be bound to the processing unit. This process must be followed, even when the only difference between the two shader programs lies in which transformed vertex attributes each program writes to the buffer object. The delay created by unbinding a shader program from a processing unit just to change which transformed vertex attributes are written to the buffer object may, in some cases, cause an unacceptable performance bottleneck for data processing on the GPU, thereby limiting the usefulness of this API feature.
As the foregoing illustrates, what is needed in the art is a way to access transformed vertex data in a graphics processing pipeline that avoids one or more of the problems set forth above.