Field of the Invention
The present invention generally relates to three-dimensional (3D) graphics processing, and, more particularly, to efficient super-sampling with per-pixel shader threads.
Description of the Related Art
Computer generated images that include 3D graphics objects are typically rendered using a graphics processing unit (GPU) with one or more multistage graphics processing pipelines. Such graphics pipelines include various programmable and fixed function stages. Programmable stages include various processing units that execute shader programs to render graphics objects and to generate various visual effects associated with graphics objects. One example of a programmable stage is a fragment processing unit that includes a pixel shader program. Pixel shader programs receive the geometry fragments, such as line segments and triangles, and compute color information, depth information, and other attributes of each individual pixel. The resulting pixel information is stored in output registers. The output registers are subsequently read by a fixed function stage known as the raster operations unit or ROP. The ROP receives pixel color, depth, and other information from the pixel shader program, blends this pixel information with corresponding pixel information stored in one or more render targets, and stores the blended pixel information back into the one or more render targets. Typically, the blending operations in the ROP are limited to a set of fixed function operations.
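As a rough illustration of the fixed function blending described above, the following Python sketch models a ROP that can apply only a small, predefined set of blend operations. The mode names, the `FIXED_BLEND_OPS` table, and the `rop_blend` function are hypothetical and chosen only for illustration; an actual ROP exposes its own set of operations.

```python
# Hypothetical fixed set of blend operations, applied per color channel.
# Channel values are assumed to lie in [0, 1].
FIXED_BLEND_OPS = {
    "replace": lambda src, dst: src,
    "add": lambda src, dst: min(src + dst, 1.0),  # saturating add
    "multiply": lambda src, dst: src * dst,
}


def rop_blend(mode, src_rgb, dst_rgb):
    """Blend shader output (src) with the render target value (dst)."""
    if mode not in FIXED_BLEND_OPS:
        # Any effect outside the fixed set cannot be expressed here.
        raise ValueError(f"unsupported fixed function blend mode: {mode}")
    op = FIXED_BLEND_OPS[mode]
    return tuple(op(s, d) for s, d in zip(src_rgb, dst_rgb))


# A one-pixel render target and the color written by a pixel shader.
render_target = {(0, 0): (0.25, 0.5, 0.75)}
shader_output = (0.5, 0.5, 0.5)

# Read the stored value, blend, and write the result back.
render_target[(0, 0)] = rop_blend("add", shader_output, render_target[(0, 0)])
```

In this sketch, a blend mode that is absent from the table is rejected outright, which mirrors the limitation that the ROP cannot perform arbitrary blending.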
Certain blending effects are not achievable within the ROP, due to the fixed function nature of the ROP. To create such blending effects, the pixel shader program may include one or more programmable blending features, where the pixel shader program reads pixel information directly from the render targets (destination pixel information), blends the destination pixel information with the pixel information calculated by the pixel shader program (source pixel information), and stores the blended pixel information into the output registers. The pixel shader program flexibly performs a programmable blend on the pixel information. Accordingly, blending is not restricted to the fixed function blending operations included in the ROP. One drawback with this approach is that the processing order of pixel shaders is generally not guaranteed in a GPU with multiple instances of pixel shader programs running at the same time, or with multiple graphics processing pipelines. Certain sequential sets of blending operations perform properly only if graphics objects are blended in a specific order. In one example, two graphics objects could intersect with a given pixel. The first graphics object could be blended into a render target by a first graphics processing pipeline. The second graphics object could be blended into the render target by a second graphics processing pipeline. However, the result of the blending operation could be different depending on whether the first graphics object is blended into the render target before or after the second graphics object. As a result, the GPU would not consistently blend graphics objects correctly.
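The order dependence described above can be seen in a short Python sketch. It assumes a conventional "source over" alpha blend, which is only one example of an order-dependent operation; the function name `blend_over` and the color values are hypothetical.

```python
def blend_over(src_rgba, dst_rgb):
    """Blend a translucent source color over an opaque destination color.

    src_rgba is (r, g, b, a); dst_rgb is (r, g, b); all values in [0, 1].
    """
    r, g, b, a = src_rgba
    return tuple(s * a + d * (1.0 - a) for s, d in zip((r, g, b), dst_rgb))


background = (0.0, 0.0, 0.0)       # initial render target value
first_object = (1.0, 0.0, 0.0, 0.5)   # half-transparent red
second_object = (0.0, 0.0, 1.0, 0.5)  # half-transparent blue

# First object blended before the second object.
first_then_second = blend_over(second_object, blend_over(first_object, background))

# Second object blended before the first object.
second_then_first = blend_over(first_object, blend_over(second_object, background))

# The two orders leave different colors in the render target.
print(first_then_second)   # (0.25, 0.0, 0.5)
print(second_then_first)   # (0.5, 0.0, 0.25)
```

If two pipelines perform these blends without a guaranteed order, either result may end up in the render target.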
In some applications, image quality is improved by rendering multiple samples for each pixel, where each sample can represent a subset of the area covered by a corresponding pixel. Such a mode is called super-sampling mode. In a render target configured for super-sampling, each pixel is stored as multiple samples, where each sample can include color information, depth information, and other attributes. With super-sampling, a single instance of the pixel shader program calculates and stores the color, depth, and related information for only one sample. Once rendering completes, the samples for a given pixel are combined, resulting in the final pixel color for display on the display device. One drawback with this approach is that each instance of the pixel shader program consumes a separate processing element in the GPU. For pixels that include four samples, super-sampling therefore consumes four times the pixel shader resources that would be consumed if only one pixel shader instance executed per pixel.
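The cost of super-sampling can be sketched in Python as follows. The sketch assumes four samples per pixel on a regular 2x2 grid and combines the samples with a simple average; the sample offsets, the stand-in shader `shade_sample`, and the averaging step are assumptions made for illustration, and for brevity each pixel is resolved as soon as its samples are shaded.

```python
# Hypothetical sub-pixel sample positions: a regular 2x2 grid in each pixel.
SAMPLE_OFFSETS = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]

invocations = 0  # counts pixel shader instances


def shade_sample(x, y):
    """Stand-in pixel shader: white left of x = 1.5, black elsewhere."""
    global invocations
    invocations += 1
    return (1.0, 1.0, 1.0) if x < 1.5 else (0.0, 0.0, 0.0)


def resolve(samples):
    """Combine the samples of one pixel by averaging each color channel."""
    n = len(samples)
    return tuple(sum(s[i] for s in samples) / n for i in range(3))


def render_supersampled(width, height):
    image = []
    for py in range(height):
        row = []
        for px in range(width):
            # One shader instance per sample, not per pixel.
            samples = [shade_sample(px + ox, py + oy) for ox, oy in SAMPLE_OFFSETS]
            row.append(resolve(samples))
        image.append(row)
    return image


image = render_supersampled(2, 1)
print(invocations)  # 8 shader instances for 2 pixels
print(image[0][1])  # (0.5, 0.5, 0.5): the edge pixel is half covered
```

Two pixels require eight shader instances in this sketch, which is the fourfold increase noted above for four samples per pixel.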
As the foregoing illustrates, what is needed in the art is an improved technique for performing pixel shading operations in a graphics processing pipeline.