Computer graphics systems typically utilize instructions, implemented via a graphics program on a computer system, to specify calculations and operations needed to produce two-dimensional or three-dimensional displays. Exemplary graphics systems that include APIs that are commercially available for rendering three-dimensional graphs include Direct3D, available from Microsoft Corporation, of Redmond, Wash., and OpenGL by Silicon Graphics, Inc., of Mountain View, Calif.
Computer graphics systems can be envisioned as a pipeline through which data pass, where the data are used to define an image that is to be produced and displayed. At various points along the pipeline, various calculations and operations that are specified by a graphics designer are used to operate upon and modify the data.
In the initial stages of the pipeline, the desired image is described by the application using geometric shapes such as lines and polygons, referred to in the art as “geometric primitives.” The derivation of the vertices for an image and the manipulation of the vertices to provide animation entail performing numerous geometric calculations in order to eventually project the three-dimensional world being synthesized to a position in the two-dimensional world of the display screen.
Primitives are constructed out of “fragments.” These fragments have attributes calculated, such as color and depth. To enhance the quality of the image, effects such as lighting, fog, and shading are added, and anti-aliasing and blending functions are used to give the image a more realistic appearance. The processes pertaining to per-fragment calculation of colors, depth, texturing, lighting, and the like are collectively known as “rasterization.”
The fragments and their associated attributes are stored in a frame buffer. Once rasterization of the entire frame has been completed, pixel color values can then be read from the frame buffer and used to draw images on the computer screen.
To assist in understanding a typical computer graphics system, consider FIG. 1 which illustrates, generally at 100, a system that can implement a computer graphics process. System 100 comprises a graphics front end 102, a geometry engine 104, a rasterization engine 106, and a frame buffer 108. System 100 can typically be implemented in hardware, software, firmware or combinations thereof, and is also referred to as a “rendering pipeline”.
Graphics front end 102 comprises, in this example, an application, primitive data generation stage 102a and display list generation stage 102b. The graphics front end generates geometric primitive data consumed by the subsequent pipeline stage(s). Geometric primitive data is typically loaded from a computer system's memory and saved in a display list in display list generation stage 102b. All geometric primitives are eventually described by vertices or points.
Geometry engine 104 comprises, in this example, high order surface (HOS) tessellation 104a, and per-vertex operations stage 104b. In stage 104a, primitive data is converted into simple rasterizer-supported primitives (typically triangles) that represent the surfaces that are to be graphically displayed. Some vertex data (for example, spatial coordinates) are transformed by four-by-four floating point matrices to project the spatial coordinates from a position in the three-dimensional world to a position on the display screen. In addition, certain other advanced features can also be performed by this stage. Texture coordinates may be generated and transformed. Lighting calculations can be performed using the vertex, the surface normal, material properties, and other light information to produce a color value. Perspective division, which is used to make distant objects appear smaller than closer objects in the display, can also occur in per-vertex operations stage 104b.
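The matrix transform and perspective division described above can be sketched as follows. This is a hedged illustration, not an actual implementation: the function names and the simple perspective matrix (which merely copies z into w) are invented for demonstration.

```python
# Illustrative sketch of per-vertex projection: a 4x4 matrix maps a
# homogeneous (x, y, z, w) vertex into clip space, and perspective
# division by w makes distant objects project smaller on screen.

def transform_vertex(matrix, vertex):
    """Multiply a 4x4 row-major matrix by a 4-component vertex."""
    x, y, z, w = vertex
    return tuple(
        m[0] * x + m[1] * y + m[2] * z + m[3] * w
        for m in matrix
    )

def perspective_divide(clip):
    """Divide x, y, z by w; this shrinks distant points on screen."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)

# Toy perspective matrix: identity for x, y, z, with w_out = z_in.
perspective = [
    (1, 0, 0, 0),
    (0, 1, 0, 0),
    (0, 0, 1, 0),
    (0, 0, 1, 0),
]

near = transform_vertex(perspective, (1.0, 1.0, 2.0, 1.0))
far = transform_vertex(perspective, (1.0, 1.0, 4.0, 1.0))

# The farther point ends up closer to the screen origin after division.
print(perspective_divide(near))  # (0.5, 0.5, 1.0)
print(perspective_divide(far))   # (0.25, 0.25, 1.0)
```

The key property is visible in the output: two points with the same world-space x and y project to different screen positions because each is divided by its own depth-derived w.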
Rasterization engine 106 is configured to perform so-called rasterization of the re-assembled rasterizer-supported primitives. It comprises the following stages: triangle/point assembly 106a, setup 106b, parametric evaluation 106c, depth and stencil operations stage 106d, per-fragment operations stage 106e, and the blend and raster operations (ROP) stage 106f. 
Rasterization refers to the conversion of vertex data connected as rasterizer-supported primitives into “fragments.” Each fragment corresponds to a single element (e.g., a “pixel” or “sub-pixel”) in the graphics display, and typically includes data defining color, transparency, depth, and texture(s). Thus, for a single fragment, there are typically multiple pieces of data defining that fragment. To perform its functions, triangle/point assembly stage 106a fetches different vertex components, such as one or multiple texture component(s), a color component, a depth component, and an alpha blending component (which typically represents transparency).
Setup stage 106b converts the vertex data into parametric function coefficients that can then be evaluated on a fragment coordinate (either pixel or sub-pixel) by fragment coordinate basis. Parametric evaluation stage 106c evaluates the parametric functions for all the fragments which lie within the given rasterizable primitive, while conforming to rasterizable primitive inclusion rules and contained within the frame buffer extents.
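One common way such setup and evaluation are realized is with linear edge functions, where each triangle edge yields coefficients E(x, y) = A·x + B·y + C whose sign indicates which side of the edge a fragment coordinate lies on. The following sketch assumes that scheme; the names and inclusion rule (all signs agree) are illustrative, not a description of any particular hardware.

```python
# Setup derives (A, B, C) per edge; parametric evaluation then tests
# each fragment coordinate against all three edge functions.

def edge_coefficients(p0, p1):
    """Setup: linear edge function coefficients for the edge p0 -> p1."""
    (x0, y0), (x1, y1) = p0, p1
    return (y0 - y1, x1 - x0, x0 * y1 - x1 * y0)

def inside(triangle, x, y):
    """Parametric evaluation: is the fragment coordinate in the triangle?"""
    a, b, c = triangle
    edges = [edge_coefficients(a, b),
             edge_coefficients(b, c),
             edge_coefficients(c, a)]
    values = [A * x + B * y + C for (A, B, C) in edges]
    # Inside when all edge functions agree in sign (winding-independent).
    return all(v >= 0 for v in values) or all(v <= 0 for v in values)

tri = ((0, 0), (4, 0), (0, 4))
print(inside(tri, 1, 1))  # True
print(inside(tri, 5, 5))  # False
```

Because the edge functions are linear, real rasterizers can evaluate them incrementally across the frame buffer extents rather than recomputing them per fragment.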
Depth and stencil operations stage 106d performs depth operations on the projected fragment depth and application-specified fragment stencil operations. These operations determine the comparison functions applied to the depth and stencil values, how the depth and stencil values should be updated in the depth/stencil buffer, and whether the fragment should be terminated or continue processing. In the idealized rasterization pipeline these operations take place just before frame buffer write-back (after blend and ROP stage 106f), but in practice they can often be validly performed before per-fragment operations stage 106e, which enables early termination of many fragments and corresponding performance improvements.
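The benefit of performing the depth test early can be sketched by counting how many fragments pay the per-fragment shading cost. This is a toy model under invented names; real pipelines interleave these steps in hardware.

```python
# Sketch: with an early depth test, occluded fragments are terminated
# before the expensive per-fragment operations run.

def render(fragments, early_z):
    """Return how many fragments in a front-to-back stream get shaded."""
    depth = float("inf")
    shaded = 0
    for frag_depth in fragments:
        if early_z and frag_depth >= depth:
            continue              # terminated before shading
        shaded += 1               # expensive per-fragment work happens here
        if frag_depth < depth:    # late depth test before write-back
            depth = frag_depth
    return shaded

front_to_back = [1.0, 2.0, 3.0, 4.0]
print(render(front_to_back, early_z=False))  # 4 fragments shaded
print(render(front_to_back, early_z=True))   # 1 fragment shaded
```

The saving depends on draw order: the same stream submitted back to front ([4.0, 3.0, 2.0, 1.0]) shades all four fragments either way, since each fragment is nearer than the one before it.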
Per-fragment operations stage 106e typically performs additional operations that may be enabled to enhance the detail or lighting effects of the fragments, such as texturing, bump mapping, per-fragment lighting, fogging, and other like operations. Near the end of the rasterization pipeline is the blend and raster operation (ROP) stage 106f, which implements blending for transparency effects and traditional 2D blit raster operations. After completion of these operations, the processing of the fragment is complete and it is typically written to frame buffer 108 and potentially to an associated depth/stencil buffer. Thus, there are typically multiple pieces of data defining each pixel.
Now consider so-called “depth sorting” as it pertains to rendering 3D graphics. Depth sorting in computer graphics is typically accomplished using what is referred to as a “depth buffer”. A depth buffer, often implemented as a “z-buffer” or a “w-buffer”, is a 2D memory array used by a graphics device that stores depth information to be accessed by the graphics device while rendering a scene. Typically, when a graphics device renders a 3D scene to a render surface, it can use the memory in an associated depth buffer surface as a workspace to determine how the pixels or sub-pixels of rasterized polygons occlude one another. The render surface typically comprises the surface or buffer to which final color values are written. The depth buffer surface that is associated with the render target surface is used to store depth information that tells the graphics device how deep each visible pixel or sub-pixel is in the scene.
When a 3D scene is rasterized in a rasterization pipeline with depth buffering enabled, each point on the rendering surface is typically tested. A depth buffer that uses z values is often called a z-buffer, and one that uses w values is called a w-buffer. Implementations may alternatively use other depth values such as 1/z or 1/w. While these invert the sense of increasing values with increasing depth, this fact is typically hidden from the application, and such a buffer can therefore be thought of as a simple z or w buffer.
At the beginning of rendering a scene to a render target surface, the depth value in the depth buffer is typically set to the largest possible value for the scene. The color value on the rendering surface is set to either the background color value or the color values of the background texture at that point. Once a fragment has been generated at a given coordinate (x,y) on the rendering surface, the depth value—which will be, for example, the z coordinate in a z-buffer, and the w coordinate in a w-buffer—at the current coordinate is tested to see if it is smaller than the depth value stored in the depth buffer. If the depth value of the polygon is smaller, it is stored in the depth buffer and the color value from the polygon is written to the current coordinate on the rendering surface. If the depth value of the polygon at that coordinate is larger, the fragment is terminated, so the depth buffer retains the smallest value at the current coordinate. This process is shown for opaque polygons diagrammatically in FIG. 2.
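The test just described can be summarized in a minimal sketch. The buffer layout and names here are illustrative, not a description of any particular device; the essential logic is that a fragment replaces the stored depth and color only when it is nearer than what the depth buffer already holds.

```python
# Minimal z-buffer sketch: initialize depth to the largest value,
# then keep only the nearest fragment at each coordinate.

INFINITY = float("inf")

def make_buffers(width, height, background):
    depth = [[INFINITY] * width for _ in range(height)]
    color = [[background] * width for _ in range(height)]
    return depth, color

def depth_test_write(depth, color, x, y, frag_depth, frag_color):
    """Keep the fragment only if it is nearer than the stored depth."""
    if frag_depth < depth[y][x]:
        depth[y][x] = frag_depth
        color[y][x] = frag_color
        return True       # fragment survives
    return False          # fragment is terminated

depth, color = make_buffers(2, 2, background="sky")
depth_test_write(depth, color, 0, 0, 10.0, "far polygon")
depth_test_write(depth, color, 0, 0, 5.0, "near polygon")    # overwrites
depth_test_write(depth, color, 0, 0, 8.0, "middle polygon")  # rejected
print(color[0][0])  # near polygon
print(depth[0][0])  # 5.0
```

Note that the final result is independent of the order in which the three opaque fragments arrive, which is precisely the property that breaks down for transparent fragments, as discussed below.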
There, notice that two polygons 200, 202 overlap along a ray that is associated with a current coordinate 204 of interest. When the 3D scene is rasterized, each coordinate on the rendering surface is typically tested. Here, the corresponding location in depth buffer 206 corresponding to pixel (or sub-pixel) 204 is set to the largest possible value for the scene. The color value on the rendering surface for this location can be set to a background color. Polygons 200 and 202 are effectively tested during rasterization to ascertain whether they intersect with the current coordinate on the rendering surface. Since both polygons intersect with the current coordinate on the rendering surface, the depth value of polygon 200 at the current coordinate is effectively tested to see whether it is smaller than the value at the current coordinate in the depth buffer. Here, since the depth value of the polygon 200 for the associated coordinate is smaller than the current depth value, the depth value for polygon 200 at the current coordinate is written to the depth buffer and the color value for the polygon at the current coordinate is written to the corresponding location in the rendering surface (also referred to as the color buffer). Next, with the depth buffer holding the depth value for polygon 200 at the current coordinate, the depth value of polygon 202 at the current coordinate is tested against the current depth value in the depth buffer. Since the depth value of polygon 202 at the current coordinate is smaller than the depth value of polygon 200 at the current coordinate, the depth value of polygon 202 at the current coordinate is written to the corresponding depth buffer location and the color value for polygon 202 at the current coordinate is written to the corresponding location on the rendering surface.
In this manner, in the ultimately rendered image, overlapping portions of polygon 202 at the current coordinate will occlude underlying portions of polygon 200 at the current coordinate.
Now consider a fundamental problem in 3D graphics, referred to as transparent depth sorting.
To appreciate this problem, consider that there are typically two different types of pixels—opaque pixels and transparent pixels. Opaque pixels are those pixels that pass no light from behind. Transparent pixels are those pixels that do pass some degree of light from behind.
Consider now FIG. 3 which shows a viewer looking at a scene through one exemplary pixel 300 on a screen. When an object is rendered by a 3D graphics system, if the object is to appear as a realistic representation of what a viewer would see in the real world, then this pixel should represent all of the light contributions, reflected back towards the viewer, that lie along a ray R. In this example, ray R intersects three different objects—an opaque mountain 302, a first transparent object 304 and a second transparent object 306. The nearest opaque pixel to the viewer is pixel 302a which lies on the mountain. Because this pixel is opaque, no other pixels that might be disposed behind this pixel on the mountain will make a contribution to the ultimately rendered pixel.
In the real world, transparent objects 304, 306 cause the light that is reflected back towards the viewer to be affected in some way. That is, assume that objects 304, 306 are glass or windows that have some type of coloration and applied lighting and/or environmental effects. The effect of these windows is to slightly dim or otherwise attenuate the light that is associated with pixel 302a, and then apply lighting and/or environmental effects. In the real world, the viewer's eye effectively sums all of the light contributions to provide a realistic image of the distant mountain. In the 3D graphics world, this is not an easy task.
Specifically, assume that pixel 302a has associated color values that describe how that pixel is to be rendered without any transparency effects applied. The influence of the right side of object 304 at 304a, and the left side of object 304 at 304b, with applied lighting 308, will change the color values of the associated pixel. Similarly, the influence of the right side of object 306 at 306a, and the left side at 306b, with applied lighting 310, will further change the color values of the associated pixel.
Thus, if one wishes to accurately render pixel 302a, one should necessarily take into account the transparency and lighting effects of these lighted transparent objects, which requires back-to-front ordering of the transparent objects' contributions to the given pixel.
The traditional depth buffering techniques described above do nothing to alleviate the back-to-front rendering order problem. Specifically, the traditional depth buffering techniques essentially locate the closest pixel (i.e. the pixel with the smallest z value) and then write the pixel's color values to the color buffer. There is no back-to-front ordering with partial application of the overlying pixel's color values (and corresponding partial retention of the current color value). Thus, traditional depth buffering techniques do not take into account this transparency issue.
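The order dependence can be illustrated with standard “over” alpha blending, in which each transparent layer partially applies its own color and partially retains the color beneath it. The colors and blend factors below are invented for illustration.

```python
# Sketch of "over" compositing: result = alpha*src + (1 - alpha)*dst.
# Blending the same two transparent layers in the wrong order yields
# a visibly different color, which is why ordering matters.

def blend_over(src, src_alpha, dst):
    """Composite a semi-transparent source color over a destination."""
    return tuple(src_alpha * s + (1.0 - src_alpha) * d
                 for s, d in zip(src, dst))

background = (1.0, 1.0, 1.0)            # white background
blue_window = ((0.0, 0.0, 1.0), 0.5)    # farther transparent layer
red_window = ((1.0, 0.0, 0.0), 0.5)     # nearer transparent layer

# Correct: draw back to front (blue first, then red over it).
back_to_front = blend_over(*red_window, blend_over(*blue_window, background))

# Incorrect ordering produces a different final color.
front_to_back = blend_over(*blue_window, blend_over(*red_window, background))

print(back_to_front)  # (0.75, 0.25, 0.5)
print(front_to_back)  # (0.5, 0.25, 0.75)
```

Opaque rendering with a depth buffer is order independent, but because “over” blending is not commutative, a depth buffer alone cannot produce the correct summed contribution for transparent surfaces.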
There have been attempts in the past to solve this particular transparency depth sorting issue. One solution is to push the problem onto the application programmer. For example, the application programmer might resolve the issue by drawing all of the opaque objects first, then performing some type of inexpensive bounding box or bounding sphere processing, and presenting the resulting data to a graphics engine in back-to-front order. This can unnecessarily burden the application programmer.
Another general scheme to attempt to solve the transparency depth sorting problem is known as the “A-buffer” approach. This approach creates a per-pixel linked list of all of the pieces of per-pixel data as a frame is being drawn. For every pixel in the frame buffer, there is a linked list of fragments. These fragments embody the contributions of the various objects at the given coordinate that are in the scene. The A-buffer approach is a very general method that essentially collects all of the linked list data for each pixel and, after all of the linked list data is collected, resolves the back-to-front issues on a pixel-by-pixel basis after the scene is completely drawn. In the context of a resource-rich environment where time is not a factor, this approach is acceptable, as the software program simply operates on the data as the data is provided to it.
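A toy version of the A-buffer idea can be sketched as follows: collect every fragment that lands on a pixel in a per-pixel list, then, after the scene is drawn, sort each list back to front and blend. The class and names are assumptions for illustration, not the original A-buffer data structures.

```python
from collections import defaultdict

def blend_over(src, alpha, dst):
    """Composite a semi-transparent source color over a destination."""
    return tuple(alpha * s + (1.0 - alpha) * d for s, d in zip(src, dst))

class ABuffer:
    """Toy A-buffer: unbounded per-pixel fragment lists, resolved late."""

    def __init__(self, background):
        self.background = background
        # (x, y) -> list of (depth, color, alpha) fragments
        self.fragments = defaultdict(list)

    def add(self, x, y, depth, color, alpha):
        self.fragments[(x, y)].append((depth, color, alpha))

    def resolve(self, x, y):
        """Sort back to front (largest depth first) and composite."""
        result = self.background
        ordered = sorted(self.fragments[(x, y)],
                         key=lambda frag: frag[0], reverse=True)
        for depth, color, alpha in ordered:
            result = blend_over(color, alpha, result)
        return result

buf = ABuffer(background=(0.0, 0.0, 0.0))
buf.add(0, 0, 5.0, (1.0, 0.0, 0.0), 0.5)   # near, semi-transparent
buf.add(0, 0, 10.0, (0.0, 1.0, 0.0), 1.0)  # far, opaque
print(buf.resolve(0, 0))  # (0.5, 0.5, 0.0)
```

The sketch also makes the cost visible: every pixel's list grows with scene depth complexity, and nothing can be resolved until the whole frame has been submitted, which is the memory and latency problem discussed next.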
One problem with the A-buffer approach, however, is most easily appreciated in environments that are not necessarily resource-rich, and where time is, in fact, a factor, e.g. the gaming environment where it is desirable to render real-time 3D graphics. With the A-buffer approach, the size of the linked lists and all of the data in the linked lists can be quite large. Today, it is not economical to have a frame buffer that is large enough to support the size of the linked lists. While the results produced using the A-buffer approach are good, the costs associated with attaining such results are not appropriate for the real-time environment.
Accordingly, this invention arose out of concerns associated with providing improved graphics systems and methods.