The present invention relates generally to computer graphics, and more particularly to a system and method for reducing memory and processing bandwidth requirements of a computer graphics system by using a buffer in a graphics pipeline to merge selected image fragments before they reach a frame buffer.
Many computer graphics systems use pixels to define images. The pixels are arranged on a display screen, such as a raster display, as a rectangular array of points. Two-dimensional (2D) and three-dimensional (3D) scenes are drawn on the display by selecting the light intensity and the color of each of the display's pixels; such drawing is referred to as rendering.
Rendering a scene has many steps. One rendering step is rasterization. A scene is made up of objects. For example, in a scene of a kitchen, the objects include a refrigerator, counters, stove, etc. Rasterization is a process that, for each object in the scene, (1) identifies the subset of the display's pixels that are contained within the object, and then, for each pixel in this subset, (2) identifies the information that is later used to determine the color and intensity to assign to that pixel. Rasterization of an object generates a fragment for each pixel the object either fully or partially covers, and the information identified in (2) above is called fragment data.
A scene may be composed of arbitrarily complex objects. Before a computer system renders such a scene, a process called tessellation decomposes the complex objects into simpler (primitive), planar objects. Typically, systems decompose the complex objects into triangles. For example, polygons with four or more vertices are decomposed into two or more triangles. Curved surfaces, such as on a sphere, are also approximated by a set of triangles. These triangles are then rasterized. Though with minor modifications the invention could work with primitives with more sides, for example, quadrilaterals, hereafter we assume that all surfaces are tessellated into triangles. "Primitives" with more sides will only arise as a consequence of merging fragments from two or more triangles.
In FIG. 1, a tessellated surface 30 has three primitive objects: triangle one 32-1, triangle two 32-2 and triangle three 32-3. The edges of the tessellated surface 30 are depicted with wide lines. To illustrate the rasterization process, the tessellated surface 30 is superimposed on an exemplary pixel grid 40. Each pixel 42 of the pixel grid 40 is represented by a square. The rasterization process generates a fragment for each primitive object that is superimposed on a pixel 42.
In the rasterization process, a finite array of discrete points, each point representing the center of a pixel of the display device, is used to construct a regular grid, for example the pixel grid 40. To construct such a grid, a filter kernel is placed over each of the discrete points. The two-dimensional bounding shape of the portion of the filter that has non-zero weight is sometimes called the support in signal processing theory, but is commonly referred to as the footprint. In the general case, the filter footprints of neighboring pixels overlap each other and thus intersect. Typically, hardware-based rasterizers use filter footprints that are 1×1 pixel squares and thus do not overlap. Such a filter was used to create pixel grid 40. Each square in pixel grid 40 is the filter footprint of a 1×1 pixel square filter placed over the discrete pixel point at the center of the square. This pixel grid 40 is used to generate fragments.
The fragments of an object are obtained by projecting the object onto the pixel grid. A fragment is then generated for a given pixel if the footprint of the filter located over the pixel intersects the object. To illustrate the rasterization process, rasterization of the three triangles 32 yields a number of fragments for each triangle 32. Within each pixel 42, the number enclosed by a circle is the number of fragments that are generated for that pixel on behalf of one or more primitive objects. For example, since tessellated surface 30 does not cover pixel 42-1, no fragments are associated with pixel 42-1. Since triangle 32-2 partially covers pixel 42-2, one fragment 44 is associated with pixel 42-2. Since all three triangles 32-1, 32-2 and 32-3 partially cover pixel 42-3, three fragments 46 are generated for pixel 42-3. Because none of the three fragments 46-1, 46-2, 46-3 fully cover pixel 42-3, pixel 42-3 is displayed with a color that is a combination of the three fragments 46-1, 46-2, 46-3 and the background color.
The grid 40 depicts the filter footprints obtained by locating a filter with a 1×1 pixel square footprint over each pixel center in the pixel grid. For example, square 48 in grid 40 represents the footprint of the filter that is centered over the point in the pixel grid that corresponds to pixel 50. The color and intensity of a fragment are obtained by sampling the object's color and intensity at each point of intersection with the pixel's filter footprint, weighting each sample by the value of the filter at the corresponding point, and accumulating the results.
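The sample, weight, and accumulate steps described above can be illustrated with a minimal sketch. It assumes a finite set of sample points per pixel; the function name fragment_color and its parameters are hypothetical and are not taken from the specification.

```python
def fragment_color(samples, weights, covered):
    """Accumulate filter-weighted color over the covered sample points.

    samples: (r, g, b) color of the object at each sample point
    weights: filter value at each sample point (assumption: a 1x1 box
             filter gives every point the same weight)
    covered: True where the sample point lies inside the object
    """
    acc = [0.0, 0.0, 0.0]
    for color, weight, inside in zip(samples, weights, covered):
        if not inside:
            continue  # the object does not intersect the footprint here
        for channel in range(3):
            acc[channel] += weight * color[channel]
    return tuple(acc)


# Four equally weighted sample points, two of them inside the object.
result = fragment_color(
    samples=[(1.0, 0.5, 0.0)] * 4,
    weights=[0.25] * 4,
    covered=[True, True, False, False],
)
```

In this example the object covers half of the sample points, so the accumulated fragment color is half of the object's color.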
After rasterization, texture mapping is typically applied. Texture mapping is a technique for shading surfaces of objects with texture patterns, thereby increasing the realism of the scene being rendered. Texture mapping is applied to the fragments that correspond to objects for which texture mapping has been specified by the person who designed the scene. Texture mapping results in color information that is either combined with the existing color information for the fragment or replaces this data.
Once the color information is known for a fragment, the frame buffer is updated. In this step, each newly-generated fragment is either added to or blended with previously-generated fragments that correspond to the same pixel. The frame buffer stores up to N fragments per pixel, where N is greater than or equal to one. When a new fragment f is generated for a pixel P, the frame buffer replaces one of pixel P's existing fragments with the new fragment f, blends fragment f with one of the existing fragments, or stores fragment f with the existing fragments if fewer than N fragments are currently stored. In such systems, the displayed color of a pixel is obtained by blending together the new fragment f with up to N stored fragments.
Because rasterization of a scene typically yields many fragments for each pixel, the texture-mapping stage and frame buffer often process multiple fragments for the same pixel. In many cases, fragments from two or more adjoining triangles that cover the same pixel may have nearly identical color and depth values because the fragments belong to the same tessellated surface.
Artifacts are distortions in the displayed image. One source of artifacts is aliasing. Aliasing occurs because the pixels are sampled and therefore have a discrete nature. Artifacts can appear when an entire pixel is given a light intensity or color based upon an insufficient sample of points within that pixel. To reduce aliasing effects in images, the pixels can be sampled at subpixel locations within the pixel. Each of the subpixel sample locations contributes color data that can be used to generate the composite color of that pixel.
As shown in FIG. 2, the filter is typically evaluated at a predefined number of discrete points 56 within the footprint. Typically, from four to thirty-two sample points are used. In one approach to sampling, sparse supersampling, these points are "staggered" on a fine grid. For example, the filter for the pixel 50 is sampled at four points 56, labeled S1, S2, S3, and S4, chosen from a 4×4 array 60 aligned to the center 62 of the pixel 50. The term coverage mask refers to the data that records, for the sample points 56 associated with pixel 50, whether each sample point is inside or outside of the object being rendered. An object is said to fully cover a pixel if all of the sample points for the pixel are inside the object; otherwise the object is said to partially cover the pixel if at least one sample point is inside the object.
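A coverage mask of this kind can be sketched as one bit per sample point. The sample offsets below are an assumption chosen to be staggered on a 4×4 subgrid (one sample per row and per column); they are not the positions S1 through S4 of FIG. 2. The point-in-triangle test and all function names are likewise illustrative.

```python
# Hypothetical staggered sample offsets within a unit pixel, at the
# centers of cells of a 4x4 subgrid (one per row and one per column).
SAMPLE_OFFSETS = [(0.125, 0.375), (0.375, 0.875), (0.625, 0.125), (0.875, 0.625)]


def edge(a, b, p):
    """Signed area test: which side of the directed edge a->b is p on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def inside_triangle(tri, p):
    """True if p lies inside (or on the boundary of) the triangle."""
    a, b, c = tri
    e0, e1, e2 = edge(a, b, p), edge(b, c, p), edge(c, a, p)
    return (e0 >= 0 and e1 >= 0 and e2 >= 0) or (e0 <= 0 and e1 <= 0 and e2 <= 0)


def coverage_mask(tri, px, py):
    """Bit i is set when sample point i of pixel (px, py) is inside tri."""
    mask = 0
    for i, (dx, dy) in enumerate(SAMPLE_OFFSETS):
        if inside_triangle(tri, (px + dx, py + dy)):
            mask |= 1 << i
    return mask


def classify(mask):
    """Apply the full/partial coverage definitions to a mask."""
    full = (1 << len(SAMPLE_OFFSETS)) - 1
    if mask == full:
        return "full"
    return "partial" if mask else "none"


triangle = ((0, 0), (10, 0), (0, 10))
interior = coverage_mask(triangle, 1, 1)    # pixel well inside the triangle
boundary = coverage_mask(triangle, 4, 5)    # pixel crossed by the slanted edge
outside = coverage_mask(triangle, 20, 20)   # pixel the triangle does not touch
```

A fragment would be generated only for pixels whose mask is non-zero, which corresponds to the "full" and "partial" cases.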
Careful examination of a supersampled pixel reveals that the color and depth values at different sample points within a pixel usually differ little from each other, as long as the sample points belong to the same surface. For example, if a pixel is completely covered by a surface, then most of the color and depth values are likely to be fairly similar. This similarity usually holds true even when different sample points belong to different primitives (triangles) of the same tessellated surface.
If a graphics accelerator processes multiple sample points for a single fragment en masse, then it is inefficient to process multiple fragments for a single pixel when the fragments belong to a single surface that has been tessellated into multiple primitive objects. Therefore, to reduce the memory and processing bandwidth requirements of a graphics accelerator (or equivalently to reduce the amount of processing required to render an object), a method and apparatus are needed that merge fragments from adjoining primitive objects of a tessellated surface that cover the same pixel.
In a graphics pipeline, a rasterizer circuit generates fragments for an image having multiple surfaces that have been tessellated into primitive objects, such as triangles. First and second fragments are associated with the same pixel. A merge buffer merges the first fragment with the second fragment when the two fragments belong to the same tessellated surface, the first fragment's primitive is adjacent to the second fragment's primitive, both fragments face either toward or away from the viewer, and the first and second fragments are sufficiently similar that merging is unlikely to introduce visually objectionable artifacts. A frame buffer receives fragments from the merge buffer, stores the fragments, combines the fragments into pixels, and outputs the pixels to a display.
In a particular embodiment, in a graphics pipeline, a rasterizer circuit generates fragments for an image having a tessellated surface. First and second fragments are associated with the same pixel and are also associated with the tessellated surface. Each fragment has an associated depth value and color information. A merge buffer merges the first fragment with the second fragment when the following four criteria are met: (1) the first and second fragments are generated sufficiently close in time, (2) the first fragment's primitive is adjacent to (shares an edge with) the second fragment's primitive in 3D space, (3) the first and second fragments' primitives are oriented similarly in 3D space, and (4) the depth value and color of the first and second fragments are sufficiently similar. This merged fragment may then merge with subsequent fragments if these criteria are again met. A frame buffer receives fragments from the merge buffer, some of which may have been merged; performs a depth test; stores the resulting visible fragments; combines color, transparency, and depth information from all fragments associated with each pixel into a (red, green, blue, alpha transparency) quadruplet; and outputs the quadruplets to a display.
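The four merge criteria can be sketched as a single predicate. This is an illustration only: the Fragment fields, the use of a sequence number to stand in for "close in time", the front-facing flag to stand in for "oriented similarly", and the three threshold constants are all assumptions, not values given in the specification.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the specification does not give numeric values.
MAX_SEQ_GAP = 8        # "sufficiently close in time"
MAX_Z_DIFF = 0.01      # "sufficiently similar" depth
MAX_COLOR_DIFF = 0.05  # "sufficiently similar" color, per channel


@dataclass
class Fragment:
    x: int
    y: int
    seq: int             # order in which the rasterizer emitted the fragment
    edges: frozenset     # edges of the source triangle, as pairs of vertex ids
    front_facing: bool   # stand-in for the primitive's orientation
    z: float
    color: tuple         # (r, g, b), each in [0, 1]


def tri_edges(v0, v1, v2):
    """Undirected edge set of a triangle given its three vertex ids."""
    return frozenset({frozenset({v0, v1}), frozenset({v1, v2}), frozenset({v2, v0})})


def can_merge(a, b):
    """Return True when fragments a and b satisfy all four criteria."""
    if (a.x, a.y) != (b.x, b.y):
        return False  # fragments must belong to the same pixel
    close_in_time = abs(a.seq - b.seq) <= MAX_SEQ_GAP           # criterion 1
    adjacent = len(a.edges & b.edges) == 1                      # criterion 2
    same_orientation = a.front_facing == b.front_facing         # criterion 3
    similar = abs(a.z - b.z) <= MAX_Z_DIFF and all(             # criterion 4
        abs(ca - cb) <= MAX_COLOR_DIFF for ca, cb in zip(a.color, b.color)
    )
    return close_in_time and adjacent and same_orientation and similar


# Two triangles sharing the edge between vertices 1 and 2.
first = Fragment(5, 7, 0, tri_edges(0, 1, 2), True, 0.500, (0.20, 0.40, 0.60))
second = Fragment(5, 7, 1, tri_edges(1, 3, 2), True, 0.505, (0.21, 0.41, 0.60))
mergeable = can_merge(first, second)
```

Because the merged fragment may itself merge with later fragments, a fuller model would also need to track the union of primitive edges for a merged fragment; that bookkeeping is omitted here.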
In another aspect of the invention, the merge buffer has a fragment storage storing up to a predetermined number of fragment tuples. Each stored fragment tuple is associated with a fragment. It should be noted that when a fragment is in the merge buffer, the graphics accelerator does not yet know if the fragment will be visible. Each fragment tuple includes a coverage mask, color value, depth (Z) value, and a pair of depth gradient (Z gradient) values. The fragment tuples are also associated with an x-y position tag. A merge pipeline processing circuit processes a new fragment tuple representing a fragment to be added to the pixel. The pipeline processing circuit includes a sequence of pipeline stage circuits. A comparison stage compares an x-y position tag of a new fragment tuple with the x-y position tags of the fragment tuples in the fragment storage and identifies a potentially mergeable existing fragment tuple based on a result of the comparison. An evaluation stage compares coverage masks, primitive edges, surface normal vectors, Z values, and color, or a subset thereof, to determine if the new fragment tuple should actually be merged with the potentially mergeable fragment tuple. A fragment merging stage merges the color value, the Z value and the pair of Z gradient values of the new fragment tuple and the potentially mergeable fragment tuple to generate a merged fragment tuple based on the outcomes of the evaluation stage. An update fragment storage stage stores the merged fragment in the fragment storage.
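The four pipeline stages can be modeled in software as a small tagged store. The sketch below rests on several assumptions that the text above does not state: the evaluation stage checks only a subset of the listed quantities (coverage masks, Z, and color); merged values are averages weighted by the number of covered sample points; a fragment that fails evaluation displaces the stored one; and the oldest entry is forwarded when storage is full. The class name MergeBuffer and its fields are hypothetical.

```python
from collections import OrderedDict


class MergeBuffer:
    def __init__(self, capacity, max_z_diff=0.01, max_color_diff=0.05):
        self.capacity = capacity          # predetermined number of tuples
        self.max_z_diff = max_z_diff      # hypothetical threshold
        self.max_color_diff = max_color_diff
        self.store = OrderedDict()        # (x, y) tag -> fragment tuple
        self.emitted = []                 # tuples forwarded to the frame buffer

    def _evaluate(self, old, new):
        """Evaluation stage (subset): disjoint masks, similar Z and color."""
        disjoint = (old["mask"] & new["mask"]) == 0
        z_ok = abs(old["z"] - new["z"]) <= self.max_z_diff
        color_ok = all(abs(p - q) <= self.max_color_diff
                       for p, q in zip(old["color"], new["color"]))
        return disjoint and z_ok and color_ok

    @staticmethod
    def _merge(old, new):
        """Merging stage: assumed average weighted by covered sample count."""
        w_old = bin(old["mask"]).count("1")
        w_new = bin(new["mask"]).count("1")

        def blend(p, q):
            return (p * w_old + q * w_new) / (w_old + w_new)

        return {
            "mask": old["mask"] | new["mask"],
            "color": tuple(blend(p, q) for p, q in zip(old["color"], new["color"])),
            "z": blend(old["z"], new["z"]),
            "zgrad": tuple(blend(p, q) for p, q in zip(old["zgrad"], new["zgrad"])),
        }

    def add(self, tag, frag):
        if frag["mask"] == 0:
            return                                    # covers no sample point
        old = self.store.get(tag)                     # comparison stage
        if old is not None and self._evaluate(old, frag):
            self.store[tag] = self._merge(old, frag)  # merge, then update storage
            return
        if old is not None:
            self.emitted.append((tag, self.store.pop(tag)))
        elif len(self.store) >= self.capacity:
            self.emitted.append(self.store.popitem(last=False))
        self.store[tag] = frag


buf = MergeBuffer(capacity=2)
buf.add((3, 4), {"mask": 0b0011, "color": (0.2, 0.4, 0.6), "z": 0.5, "zgrad": (0.0, 0.0)})
buf.add((3, 4), {"mask": 0b1100, "color": (0.2, 0.4, 0.6), "z": 0.5, "zgrad": (0.0, 0.0)})
```

After the two calls, the store holds a single tuple for pixel (3, 4) whose coverage mask is the union of the two masks, and nothing has yet been forwarded to the frame buffer.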
Merging fragments in the merge buffer increases the rendering speed by reducing the number of fragments sent to the frame buffer to add or merge with a pixel's existing fragments. This in turn also reduces the amount of work required by the frame buffer to add or merge a new fragment with a pixel's existing fragments, by decreasing the average number of fragments stored with each pixel. The present invention merges fragments within a pixel from the same surface before the fragments reach the frame buffer. Each time a first and second fragment are merged, the invention avoids both writing the first fragment to the frame buffer, and subsequently reading the first fragment from the frame buffer. Therefore merging fragments in a merge buffer before the fragments reach the frame buffer significantly reduces frame buffer memory bandwidth requirements. This in turn increases the speed of the rendering process for a given amount of memory bandwidth. Alternatively, fewer or less expensive memory chips with less bandwidth may be used. Because fragments are merged, the amount of memory for storing the fragment information, including the subpixel information, may also be reduced. In addition, the present invention employs heuristics that decrease the likelihood that merging will introduce noticeable artifacts.