Rendering a computer-generated image on a screen of pixels entails several steps. Conventionally, the image is first decomposed into many primitive objects, most typically triangles. Each triangle is then transformed into a screen-aligned coordinate system. Thereafter, each triangle is divided into fragments in a rasterization process. Each fragment corresponds to one screen pixel covered by the triangle.
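By way of illustration only, the rasterization step may be sketched as follows. The sketch assumes a triangle already transformed into screen coordinates, samples each pixel at its center, and treats a center lying exactly on an edge as covered. The names `edge` and `rasterize` are hypothetical, and actual rasterizers typically apply a tie-breaking rule for shared edges that is omitted here.

```python
import math


def edge(ax, ay, bx, by, px, py):
    """Signed area term: positive when (px, py) lies to the left of edge a->b."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def rasterize(v0, v1, v2):
    """Return the (x, y) pixels whose centers are covered by the triangle."""
    area = edge(*v0, *v1, *v2)
    if area == 0:
        return []  # degenerate triangle covers no pixels

    xs = (v0[0], v1[0], v2[0])
    ys = (v0[1], v1[1], v2[1])
    fragments = []
    # Visit only the pixels inside the triangle's bounding box.
    for y in range(math.floor(min(ys)), math.ceil(max(ys))):
        for x in range(math.floor(min(xs)), math.ceil(max(xs))):
            cx, cy = x + 0.5, y + 0.5  # pixel center
            w0 = edge(*v1, *v2, cx, cy)
            w1 = edge(*v2, *v0, cx, cy)
            w2 = edge(*v0, *v1, cx, cy)
            # Covered when all three edge terms share the sign of the area
            # (zero counts as covered in this simplified sketch).
            if all(w * area >= 0 for w in (w0, w1, w2)):
                fragments.append((x, y))
    return fragments
```

Each entry of the returned list stands for one fragment, that is, one screen pixel covered by the triangle.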
Each fragment has associated with it a number of data items, including one or more color, depth and stencil values. The color values are used to establish the colors of the fragments. The depth values are used to determine which fragments will be visible on the screen (z-buffering). The stencil values are used to determine which fragments are to be rendered. (Since z-buffering and stenciling are closely related, depth and stencil values are typically stored in the same buffer.) The fragment color, depth and stencil values are written to the display memory if the fragment is determined to be properly rendered and visible.
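A minimal sketch of how the depth and stencil values may gate a write to the display memory is given below. It is an illustration under stated assumptions, not a description of any particular hardware: the stencil test is assumed to pass when the stored stencil value equals a reference value, the depth test is assumed to pass when the fragment is nearer than the stored depth, and the stored stencil value is left unchanged. The names `Fragment`, `FrameBuffer` and `stencil_ref` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    x: int
    y: int
    color: tuple
    depth: float  # smaller means nearer to the viewer (assumed convention)


class FrameBuffer:
    def __init__(self, width, height):
        self.color = [[(0, 0, 0)] * width for _ in range(height)]
        self.depth = [[1.0] * width for _ in range(height)]  # cleared to "far"
        self.stencil = [[0] * width for _ in range(height)]

    def write(self, frag, stencil_ref=0):
        """Write the fragment if it passes both tests; return True if written."""
        # Stencil test: is this pixel one that is to be rendered at all?
        if self.stencil[frag.y][frag.x] != stencil_ref:
            return False
        # Depth test (z-buffering): is the fragment in front of what is stored?
        if frag.depth >= self.depth[frag.y][frag.x]:
            return False
        self.color[frag.y][frag.x] = frag.color
        self.depth[frag.y][frag.x] = frag.depth
        return True
```

In this sketch a fragment that fails either test leaves the display memory untouched, whereas a passing fragment replaces both the stored color and the stored depth.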
A typical scene is composed of many triangles. As each triangle covers a number of pixels, the number of fragments to be written to the display memory can be large. For instance, a scene may be composed of 1,000,000 triangles, each of which may cover 50 pixels. If the scene is rendered 60 times a second, 3,000,000,000 fragments must be generated, processed and sent to the frame buffer every second.
If each such fragment carries about ten bytes of data, 30 Gbytes of data must be processed and stored every second. Further, many applications arithmetically blend newly rendered fragments with the contents of the frame buffer, doubling the data that must be transferred to and from the frame buffer.
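The arithmetic of the foregoing example may be checked with the short sketch below. The function name `fragment_load` and its parameter names are hypothetical, and the doubling applied for blending simply follows the statement above that blending doubles the data transferred.

```python
def fragment_load(triangles, pixels_per_triangle, frames_per_second,
                  bytes_per_fragment, blending=False):
    """Return (fragments per second, bytes per second) for a scene."""
    fragments_per_second = triangles * pixels_per_triangle * frames_per_second
    bytes_per_second = fragments_per_second * bytes_per_fragment
    if blending:
        # Each fragment is read from and written back to the frame buffer.
        bytes_per_second *= 2
    return fragments_per_second, bytes_per_second


# The example from the text: 1,000,000 triangles of 50 pixels at 60 frames/s.
fragments, data = fragment_load(1_000_000, 50, 60, 10)
print(fragments)  # 3000000000 fragments per second
print(data)       # 30000000000 bytes per second, i.e. 30 Gbytes
```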
The foregoing problem is exacerbated if anti-aliasing is performed. In the most common anti-aliasing algorithms, supersampling and multisampling, multiple fragments are computed and stored in the frame buffer for every screen pixel in order to reduce sampling artifacts in the rendered image (see U.S. Pat. No. 6,072,500). Anti-aliasing using these techniques therefore increases the load imposed on the fragment processing stage of a graphics system in proportion to the number of samples per pixel.
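The proportional growth may be illustrated with a brief sketch. It assumes a regular n-by-n grid of sample positions inside each pixel, which is only one possible pattern; other sample arrangements are equally possible. The names `sample_offsets` and `stored_samples` are hypothetical.

```python
def sample_offsets(n):
    """Sub-pixel sample positions on a regular n-by-n grid (assumed pattern)."""
    return [((i + 0.5) / n, (j + 0.5) / n) for j in range(n) for i in range(n)]


def stored_samples(covered_pixels, n):
    """Number of samples stored in the frame buffer for the covered pixels."""
    return covered_pixels * len(sample_offsets(n))


# A triangle covering 50 pixels with a 2-by-2 pattern stores 4 samples per pixel.
print(stored_samples(50, 2))  # 200
```

With one sample per pixel the same triangle stores 50 values, so the storage and processing load in this sketch scales directly with the sample count.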
Processing a large number of fragments is difficult for a variety of reasons. Frame buffer accesses for reading and writing pixel data require a large amount of frame buffer bandwidth. Therefore, in many systems, the available frame buffer bandwidth limits the fragment-processing rate. Similarly, transferring the fragments among the internal stages of a graphics system demands a high internal bandwidth in the fragment processing stage, which also tends to limit the fragment-processing rate. So too, processing the fragments as they travel through the graphics system consumes a large amount of processing power, e.g., for stenciling, z-buffering or alpha blending. Available processing power may also limit the fragment-processing rate.
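The three limits described above may be viewed as a simple bottleneck model, sketched below. The model is a deliberate simplification in which the achievable fragment rate is the smallest of the rates permitted by each resource. The function name, the parameter names and all numerical values are hypothetical and chosen only for illustration.

```python
def fragment_rate_limit(frame_buffer_bw, internal_bw, ops_per_second,
                        bytes_per_fragment, ops_per_fragment,
                        frame_buffer_accesses=1):
    """Return (limiting resource, fragments per second) for a simple model."""
    limits = {
        # Frame buffer traffic; use 2 accesses per fragment when blending.
        "frame buffer bandwidth":
            frame_buffer_bw / (bytes_per_fragment * frame_buffer_accesses),
        # Traffic between the internal stages of the graphics system.
        "internal bandwidth": internal_bw / bytes_per_fragment,
        # Stenciling, z-buffering, alpha blending and similar operations.
        "processing power": ops_per_second / ops_per_fragment,
    }
    name = min(limits, key=limits.get)
    return name, limits[name]


# Hypothetical system: 20 Gbytes/s frame buffer, 40 Gbytes/s internal,
# 100 G operations/s, 10 bytes and 20 operations per fragment.
print(fragment_rate_limit(20_000_000_000, 40_000_000_000,
                          100_000_000_000, 10, 20))
```

In this hypothetical configuration the frame buffer bandwidth is the limiting resource, and raising either of the other two capacities would not increase the fragment-processing rate.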