This invention relates generally to the field of computer graphics display, and more particularly to reducing the amount of time needed to render complex virtual scenes on a computer graphics display.
Computers have been used for many years to do image and graphics generation. In recent years these computer-generated graphics have become more and more sophisticated. As the power of computer equipment increases, the users' expectation of what the computer should do also increases. One area that has been accelerating rapidly is computer-generated imagery with increasing scene complexity. Computer users have come to expect more realism, which generally means more objects and more lighting and texture processing on those objects.
Complex images and scenes are modeled in three-dimensional space in the computer memory and manipulated accordingly. A complex three-dimensional shape is broken down into basic graphic shapes called primitives. Modeling techniques and tools describe the virtual environment with primitives. Primitives include such things as polygons, meshes, strips and surface patches. Some graphics architectures employ optimized algorithms for handling simple primitives such as dots, lines and triangles. Before a three dimensional scene can be viewed by the user it must be translated from the three dimensional view in the computer to a two dimensional view which can be displayed on a two dimensional screen or monitor.
The process of translating the three-dimensional image to a flat display device is called rendering. The rendering process takes place in the graphics hardware or software and converts primitives into a two-dimensional array of graphical points. These points are known as pixels and are stored in computer memory in a frame buffer before they are drawn on the screen. The frame buffer is a rectangular two-dimensional array and is M by N pixels, where M and N depend on the display system. The computer draws multiple frames consecutively at many frames per second to animate the virtual environment being viewed. Graphics display techniques can also incorporate sub-pixels that are logical sub-divisions of a pixel. The color values of the sub-pixels are later combined or averaged together to form an actual pixel. Generally, techniques that can be applied to pixels can also be applied to sub-pixels.
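As a minimal sketch of the sub-pixel averaging described above, the following Python function combines the color values of a pixel's sub-pixels into one pixel color. The function name, the RGB tuple layout, and the unweighted average are assumptions made for illustration; the text does not specify how the sub-pixels are weighted.

```python
def resolve_pixel(subpixel_colors):
    """Average a list of (r, g, b) sub-pixel colors into one pixel color.

    Hypothetical helper: assumes every sub-pixel carries equal weight.
    """
    count = len(subpixel_colors)
    return tuple(
        sum(color[channel] for color in subpixel_colors) / count
        for channel in range(3)
    )


# A pixel split into four sub-pixels, half red and half blue.
pixel = resolve_pixel([(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 0, 255)])
```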
The number of frame buffer pixels which must be displayed is constant for any given computer screen. However, the number of pixels that must be computed in order to fill the frame buffer is highly dependent on the complexity of the virtual scene. For each actually displayed pixel, a number of pixels may be rendered based on the number of primitives that cover the pixel in the scene. In other words, a calculation is made for each primitive graphic object that is in the line of sight of the pixel. The ratio of the number of rendered pixels relative to the number of displayed pixels is known as the average pixel depth complexity. This indicates the average number of primitives that cover each pixel on the screen. Depth complexity numbers indicate the amount of processing or work it takes to create each image, and these numbers vary greatly depending on the modeled environment and the viewer's current position in that environment.
For example, in a rendering of a region of mountainous terrain covered with trees as viewed from above, the average depth complexity lies somewhere between one and two. The peak depth complexity may be two. Pixels displaying the terrain only need one calculation, while pixels covered by a tree need two calculations, one for the tree and one for the terrain. If the viewer's position is moved down within the trees, with a line of sight toward the horizon, the depth complexity numbers will increase dramatically. If the forest is quite dense, the average depth complexity may go up into the tens while the peak may even approach the hundreds. As the model complexity increases the depth complexity numbers will also increase.
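The ratio defined above can be illustrated with a short Python sketch that computes the average and peak depth complexity from a per-pixel count of covering primitives. The function name, the grid layout, and the sample counts are hypothetical and chosen only to mirror the overhead terrain example.

```python
def depth_complexity(cover_counts):
    """Return (average, peak) depth complexity for a grid of counts.

    cover_counts is a hypothetical grid in which each entry is the number
    of primitives rendered for that displayed pixel.
    """
    flat = [count for row in cover_counts for count in row]
    rendered = sum(flat)    # pixels that had to be computed
    displayed = len(flat)   # pixels actually shown on screen
    return rendered / displayed, max(flat)


# Terrain viewed from above: terrain-only pixels cost 1, tree pixels cost 2.
overhead_view = [
    [1, 2, 1, 1],
    [2, 2, 1, 1],
]
average, peak = depth_complexity(overhead_view)
```

With these sample counts the average falls between one and two and the peak is two, consistent with the overhead view described above.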
Many pixels rendered in high depth complexity scenes never contribute to the final image. This occurs because the primitives to which they belong are located farther away and behind other primitives in the scene and are therefore not visible to the user. The additional unused calculations increase the amount of hardware or time required to render a given scene.
As the virtual environment's complexity increases, the demand on the rendering process also increases. If the rendering is done in accelerated graphics hardware, it can become quite costly because of the large number of calculations required to be implemented in hardware. For software based rendering systems, the rendering can become very slow. In either case, if the rendering is too slow, the movement of the display image becomes disjointed or choppy when the image is displayed.
Various techniques have been used to reduce the amount of hardware or computing time needed to render increasingly complex scenes. These techniques attempt to reduce the number of pixels rendered which do not contribute to the final image. Most current systems use a brute-force approach to converting modeled primitives into viewable pixels. Each primitive is taken individually and projected from the three-dimensional model coordinates into a two-dimensional frame buffer space in memory. Then the process calculates which pixels within the frame buffer the primitive touches. Computing which pixels are touched is a process known as scanning. Scanning selects each pixel and computes its color as determined by the modeled attributes of the primitive. Computing the pixel color can be very complex if sophisticated lighting algorithms and textures are being used. Typical factors contributing to the pixel's color include the modeled color, light sources shining on the primitive, texture, anti-aliasing, and visibility conditions.
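The scanning step can be sketched in Python as follows. The text does not say how touched pixels are found, so this sketch assumes one common approach: testing each pixel center against the three edge functions of an already projected triangle. All names are hypothetical, and no tie-breaking rule is applied to pixel centers that fall exactly on an edge.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area term telling which side of edge a->b the point p is on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def scan_triangle(v0, v1, v2, width, height):
    """Return the (x, y) pixels whose centers lie inside the 2-D triangle."""
    touched = []
    area = edge(*v0, *v1, *v2)
    if area == 0:
        return touched  # degenerate triangle covers nothing
    for y in range(height):
        for x in range(width):
            px, py = x + 0.5, y + 0.5  # sample at the pixel center
            w0 = edge(*v1, *v2, px, py)
            w1 = edge(*v2, *v0, px, py)
            w2 = edge(*v0, *v1, px, py)
            if area > 0:
                inside = w0 >= 0 and w1 >= 0 and w2 >= 0
            else:
                inside = w0 <= 0 and w1 <= 0 and w2 <= 0
            if inside:
                touched.append((x, y))
    return touched


pixels = scan_triangle((0, 0), (4, 0), (0, 3), width=4, height=3)
```

Only the pixels returned by the scan go on to the color computation, which is where the lighting and texture cost is incurred.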
A mechanism must also be provided to determine which primitive in the scene should be visible for any given pixel (or sub-pixel if anti-aliasing techniques are employed). This process is often referred to as hidden-surface-removal. That is, all the primitives or surfaces which are hidden by other surfaces within the scene are removed. Common hidden-surface-removal techniques include the painter's algorithm, list-priority algorithms, scan-line algorithms, and Z-buffering (or depth buffering).
Each of these hidden-surface-removal techniques has its own advantages and disadvantages.
For a number of reasons, the Z-buffer method has now become a very popular choice. Most of the other approaches require special modeling techniques and support data structures to render the image properly. The Z-buffer approach eliminates most of these constraints and simplifies the modeling process. In the Z-buffer approach, the visible primitive or surface at each pixel is the primitive with the closest Z value. The Z value is basically the depth of the primitive in the viewed scene. As each primitive is rendered, this Z parameter can be computed for each pixel touched. The frame buffer is also expanded to store the Z depth, along with storing the pixel color. As each new primitive is processed, the new Z depth can be compared with one already stored in the frame buffer. The frame buffer only keeps the pixels rendered for the primitive closest to the observer.
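A minimal Python sketch of the Z-buffer comparison described above is given below. It assumes that a smaller Z value is closer to the observer and that each fragment arrives with its color already computed; the function name and the fragment tuple layout are hypothetical.

```python
import math


def zbuffer_render(width, height, fragments):
    """Keep, per pixel, the color of the fragment with the closest Z.

    fragments is an iterable of (x, y, z, color) tuples in any order.
    Assumption: smaller z means closer to the observer.
    """
    depth = [[math.inf] * width for _ in range(height)]
    color = [[None] * width for _ in range(height)]
    for x, y, z, fragment_color in fragments:
        if z < depth[y][x]:          # new fragment is closer: keep it
            depth[y][x] = z
            color[y][x] = fragment_color
    return color, depth


fragments = [
    (0, 0, 5.0, "far"),
    (0, 0, 2.0, "near"),
    (1, 0, 3.0, "mid"),
    (0, 0, 4.0, "hidden"),
]
image, depths = zbuffer_render(2, 1, fragments)
```

Because the comparison is made per pixel, the primitives may arrive in any order and may interpenetrate, but every fragment has been shaded by the time it is compared.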
A major disadvantage of the Z-buffer is that all of the color shading calculations are performed before the depth test is done. Pixels are only discarded by the frame buffer circuit after the color shading calculation is done. This requires a lot of expensive or time consuming calculations to be performed with no final contribution to the image on the screen.
Other hidden-surface-removal strategies have developed more cost effective architectures. An example is the list-priority approach, where the primitives are rendered in a front-to-back order. By recording which pixels (or pixel arrays) are filled up by primitives as they are rendered, later primitives can be tested against this record. This test avoids wasted time processing the primitive against pixels that are already full. Simple structures can be built to maintain and test against this full record, throwing out pixels before the expensive color shading calculations are performed. Thus, even though the depth complexity of the scene may be quite high, many of the pixels that would be thrown away are simply skipped because of this test.
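The front-to-back test against a full record can be sketched in Python as follows. This is an illustration under stated assumptions: a single scanline stands in for the frame buffer, the primitives are opaque and already sorted front to back, and the dictionary layout and function names are hypothetical.

```python
def render_front_to_back(width, primitives, shade):
    """Shade each pixel at most once using a per-pixel full record.

    primitives must already be in front-to-back priority order.
    """
    full = [False] * width
    color = [None] * width
    for primitive in primitives:
        for x in primitive["pixels"]:
            if full[x]:
                continue  # already filled by a nearer primitive: skip shading
            color[x] = shade(primitive, x)
            full[x] = True
    return color


shade_calls = []


def shade(primitive, x):
    """Stand-in for an expensive shading calculation; records each call."""
    shade_calls.append((primitive["name"], x))
    return primitive["name"]


scene = [
    {"name": "tree", "pixels": [1, 2]},
    {"name": "terrain", "pixels": [0, 1, 2, 3]},
]
image = render_front_to_back(4, scene, shade)
```

In this sample scene the two primitives touch six pixels in total, but only four shading calls are made because the terrain pixels behind the tree are skipped.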
One major disadvantage of the list-priority approach is that primitives must be modeled in such a way as to guarantee that they can be sorted into priority order. In some cases, this is extremely difficult. The list-priority approach also does not support the notion of interpenetrating primitives.
In general, the various hidden-surface-removal techniques provide either an efficient rendering architecture at the expense of complex modeling (e.g., the list-priority approach), or they simplify the modeling process at the expense of rendering efficiency (e.g., the Z-buffer).
Some recent systems have combined the "sort and record" schemes used previously by list-priority machines with the distinct modeling advantages of Z-buffered systems. This approach works well, but it is extremely expensive in terms of hardware and computation time. First, large database sorting methods are used to get the primitives in approximately a front-to-back order. Z-buffer techniques are used to do the final resolution of which primitive covers each pixel. The simple full record used by the list-priority architecture is replaced with a more complex depth-based full record.
State of the art graphics systems which have utilized a full buffer have performed a full buffer update process by examining every pixel (the term should be understood to include sub-pixels) in a pixel array within a selected portion or region of the frame buffer. A comparison is made of the depth value for every pixel to determine the maximum depth within the array. This approach is very costly (in time or hardware) since many pixels must be accessed and compared in order to determine whether the pixel array is completely covered by primitives and at what maximum depth. Once the region is completely covered, it can be marked full, regardless of how many primitives it took to cover it. If the new primitive's depth is farther than that recorded in the full record, that particular array of pixels need not be rendered for the new primitive.
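The exhaustive update described above can be sketched in Python as follows. The sketch assumes that depth increases with distance and that an uncovered pixel holds an infinite depth; both function names are hypothetical.

```python
import math


def region_full_depth(depth_region):
    """Scan every pixel of a region; return its maximum depth if full.

    Returns None when any pixel is still uncovered (depth of infinity).
    """
    farthest = -math.inf
    for row in depth_region:
        for z in row:
            if z == math.inf:
                return None       # at least one pixel is not covered yet
            farthest = max(farthest, z)
    return farthest


def can_skip(primitive_nearest_depth, full_depth):
    """A primitive can be skipped only if it lies behind a full region."""
    return full_depth is not None and primitive_nearest_depth > full_depth


partly_covered = region_full_depth([[2.0, 3.0], [2.5, math.inf]])
fully_covered = region_full_depth([[2.0, 3.0], [2.5, 2.2]])
```

The cost noted in the text is visible in the nested loops: every pixel of the region is read and compared each time the record is updated.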
The memory and controllers used for database sorting, the minimum and maximum depth calculations, and the depth based full buffer all add substantially to the cost of such a hybrid system. The advantages gained by such an approach are particularly of value for applications requiring true real-time performance since the rendering load will be much more level than on a system without such capabilities. Without a means to skip filled regions, the rendering load will be directly proportional to the depth complexity of the scene. By employing these "full record" schemes, the rendering load is more directly tied to the screen's resolution and not so much to the orientation of the database. Unfortunately, the approach of combining a Z-buffer and a depth-based full record is far too costly for mainstream graphics systems.
An object of this invention is to provide cost-effective, enhanced methods to reduce the pixel rendering load when generating a synthetic scene on a computer graphics display.
Another object of this invention is to provide a simplified full buffer architecture that eliminates the rendering of covered pixel arrays for primitives prior to expensive pixel shading operations.
Another object of this invention is to enhance the full buffer architecture so that it significantly improves the efficiency of the rendering process, by reducing the hardware or computing time required to render a graphics scene.
It is another object of this invention to improve the full buffer architecture by expanding on the types of scene modeling techniques that can benefit from selective pixel rendering.
Another object of this enhanced full buffer architecture is to provide a mechanism to help balance the geometric transformation and pixel rendering loads through using the full buffer.
The present invention provides a simplified full buffer architecture to reduce the pixel rendering load across a wider range of complex scenes by eliminating the rendering of pixel arrays for covered primitives prior to pixel shading. Another aspect of the present invention provides a method for detecting multiple primitives that together fill a pixel region. Thus, higher order model primitives such as strips, fans, quadrilaterals and meshes can be used in combination to increase the effectiveness of the full buffer by being enabled together to fill a scanned pixel region on the screen. A new system or method is also provided which enables skipping already full regions to be used on models that consist of numerous layers of coincident polygons. Coincident primitives may be coplanar (e.g., stripes on a flat runway) but do not have to be (e.g., a decal on a sphere). This is a significant benefit since layers of coincident polygons have a tendency to dramatically increase depth complexity.
Another aspect of the first embodiment of the present invention is related to finding the closest or nearest point within a pixel array stored in the full buffer. Finding the closest or nearest point is useful to test when certain pixel arrays do not require processing and are thus bypassed. It is also useful to determine the furthest point within a pixel array to store back in the full buffer. The furthest point is the opposite corner of the primitive relative to the closest point. The furthest point is stored in the full buffer when non-bypassed primitives fill an array region.
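One way to read the closest and furthest points as opposite corners is sketched below in Python. It rests on an assumption not stated in the text: that the depth of the primitive varies linearly across the screen, so that depth equals a*x + b*y + c over a rectangular pixel array. Under that assumption the extremes fall at diagonally opposite corners of the array. The function name and parameters are hypothetical.

```python
def depth_range_over_array(a, b, c, x0, y0, x1, y1):
    """Return (nearest, farthest) depth of a plane over a pixel array.

    Assumes depth = a*x + b*y + c, depth grows with distance, and the
    array spans x0..x1 by y0..y1 with x0 <= x1 and y0 <= y1.
    """
    # The sign of each slope decides which corner is near and which is far.
    near_x, far_x = (x0, x1) if a >= 0 else (x1, x0)
    near_y, far_y = (y0, y1) if b >= 0 else (y1, y0)
    nearest = a * near_x + b * near_y + c
    farthest = a * far_x + b * far_y + c
    return nearest, farthest


nearest, farthest = depth_range_over_array(0.5, -0.25, 10.0, 0, 0, 8, 8)
```

Here the nearest depth is found at corner (0, 8) and the farthest at the opposite corner (8, 0), so only two evaluations are needed instead of a scan of the whole array.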
The simplified full buffer method encounters each new array of pixels during scanning, and a comparison is done between the closest depth of the primitive currently being rendered and the full depth for the pixel array retained in the full buffer. If the closest primitive depth is farther than the full buffer's stored depth, then pixels associated with the primitive within the array will not contribute to the final image and can be skipped. The scan conversion process then seeks the next pixel array in the frame buffer for the primitive, and the process repeats. For example, in a situation where the depth value increases with increased distance from the viewer, if the pixel array encountered during scanning was already filled or covered by a primitive at depth 2 and the primitive currently being rendered was of depth 4, then the pixel array processing would be skipped.
If the closest primitive depth is not farther than the full depth, the complete pixel array must be scanned for this primitive. For each of the scanned pixels, a shade is computed which may include transparency. As the array is scanned, a cumulative record is kept of the primitive's coverage of this array. If the primitive completely covers each pixel (or sub-pixel) of the array, its farthest depth value within the array is stored in the full buffer. This marks the pixel array full at that given depth. The full depth is only stored in the full buffer array if it is closer than the depth currently found in the full buffer. By checking for full pixel arrays early in the process, the expensive pixel shading operations are not performed for areas that would simply be discarded by the Z-buffer.
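The skip test and the recording rule of the simplified full buffer can be sketched in Python as follows. The class name, the method names, and the four-pixel array size are hypothetical, and the sketch assumes that depth increases with distance from the viewer and that an array not yet full holds an infinite depth.

```python
import math

ARRAY_PIXELS = 4  # hypothetical 2-by-2 pixel array


class SimpleFullBuffer:
    """One full depth per pixel array; infinity means not yet full."""

    def __init__(self, num_arrays):
        self.full_depth = [math.inf] * num_arrays

    def can_skip(self, array_index, nearest_depth):
        """True when the primitive lies wholly behind a full array."""
        return nearest_depth > self.full_depth[array_index]

    def record(self, array_index, opaque_pixels_covered, farthest_depth):
        """Mark the array full only for complete, closer coverage."""
        if (opaque_pixels_covered == ARRAY_PIXELS
                and farthest_depth < self.full_depth[array_index]):
            self.full_depth[array_index] = farthest_depth


buffer = SimpleFullBuffer(num_arrays=1)
skipped_before = buffer.can_skip(0, nearest_depth=4.0)
buffer.record(0, opaque_pixels_covered=4, farthest_depth=2.0)
skipped_after = buffer.can_skip(0, nearest_depth=4.0)
```

The usage mirrors the example in the text: once the array is full at depth 2, a primitive whose closest depth is 4 is skipped before any shading is done.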
The enhanced full buffer feature of the present invention, which allows multiple primitives to combine to cover pixel arrays, is especially advantageous over the simplified full buffer system, where coverage can only be accomplished by a single primitive at a time. In the simplified full buffer system, full coverage is determined by counting how many pixels are visited within the array by a single primitive and ensuring that all the pixels (or sub-pixels) are filled and opaque. The pixel count does not contain any information as to which pixels had been covered, just how many. This problem becomes even worse when sub-pixels are considered. If the pixel array is not completely covered, the partially covered results are simply discarded. When multiple independent primitives are considered, the problem becomes apparent. There is no way to ensure that the primitives do not overlap and hence cover some pixels more than once and some pixels not at all.
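The limitation of a bare pixel count can be made concrete with a short Python sketch. Per-pixel bitmasks are used here only to expose the problem; they are an assumption of this illustration and are not presented as the mechanism of the invention. All names and the four-pixel array size are hypothetical.

```python
ARRAY_PIXELS = 4
FULL_MASK = (1 << ARRAY_PIXELS) - 1  # one bit per pixel, 0b1111


def covered_by_count(masks):
    """Naive test: add up how many pixels each primitive visited."""
    return sum(bin(mask).count("1") for mask in masks) >= ARRAY_PIXELS


def covered_by_mask(masks):
    """Exact test: check which pixels were actually visited."""
    combined = 0
    for mask in masks:
        combined |= mask
    return combined == FULL_MASK


overlapping = [0b0011, 0b0110]    # pixel 1 covered twice, pixel 3 never
complementary = [0b0011, 0b1100]  # every pixel covered exactly once
```

For the overlapping pair the summed count reaches four even though one pixel is uncovered, which is why counts from independent primitives cannot safely be combined.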
The enhanced full buffer architecture of the present invention takes advantage of the nature of connected primitives. Modeling tools now make significant use of these connected primitives such as triangle fans, strips, or meshes, to model surfaces, buildings and terrain. Rendering hardware often accepts these higher order objects and breaks them down into their constituent triangles. Each triangle (or primitive) of a connected primitive shares a pair of vertices and a corresponding edge with the triangle that precedes it. The triangle may also share a pair of vertices and a corresponding edge with the triangle that follows it in the connected object. An example is shown in FIG. 1.
It is important that the shared edge has exactly complementary coverage. This means that the pixels (or sub-pixels) along the shared edge are completely covered but each pixel is claimed by only one of the triangles.
The preferred embodiment of the invention shown in FIG. 3 suggests that several triangles of a connected primitive could combine to cover a pixel array. However, this is not always the case. In order for the coverage of multiple connected primitives to be truly complementary, there can be no overlap of any of the primitives involved, as shown in the connected primitive in FIG. 2.
The pixel arrays in the overlap region cannot correctly resolve the coverage and hence erroneous results may occur. It is noted that any pair of two triangles of a connected primitive cannot overlap unless one of them is backfacing (i.e. facing away from the viewer's viewpoint). Normally, the geometric transformations identify and discard backfacing primitives. When backfacing primitives are discarded, the connected primitive is interrupted and a "new" one started with the next frontfacing primitive in the list. As a result, the complementary nature of coverage within connected primitives allows pairs of triangles to combine to fill pixel arrays. This also covers the significant modeling construct known as convex quadrilaterals that are often used to model trees and buildings. It is also noted that the pair of primitives need not lie in the same plane. This allows many other modeled features such as terrains and hillsides that contain connected primitives in different planes to jointly fill pixel arrays.
In summary, one embodiment of the present invention is the simplified full buffer architecture shown in FIG. 3. A method and system are provided for increasing a rate of rendering a synthetic image on a computer graphics display, wherein the synthetic image is generated from a database of a plurality of primitives. The system determines which pixels require processing for each of the plurality of primitives. Then a comparison is made of the depth values of the primitives requiring processing against depth values in a full buffer to thereby determine whether the pixels in a region require further processing. Next, the system skips shading value calculations for pixels not requiring further processing.
A second embodiment of the present invention is the enhanced full buffer system shown in FIGS. 4 and 5. The second embodiment of the invention makes significant improvements to the skip and recording sections of the simplified full buffer architecture shown in the first embodiment. Those improvements are described in the following order: (1) extensions to the full buffer, (2) the addition of a "partial buffer," (3) skipping already filled pixel arrays, and (4) recording pixel arrays as they fill.
The enhanced full buffer embodiment comprises a computer graphics rendering system for efficiently rendering three-dimensional scenes of high depth complexity with a graphics display unit. The system bypasses rendering operations for frame buffer regions in which the pixels are completely covered by primitives. The system accumulates coverage of frame buffer regions by both individual primitives and groups of primitives. A full buffer having memory locations is arranged for recording frame buffer regions which are completely covered. Finally, a partial buffer is coupled to the full buffer to enable multiple primitives to combine to fill frame buffer regions.
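How a partial buffer might let a pair of connected triangles combine to fill a frame buffer region is sketched below in Python. The layout of the partial entry, the strip and triangle identifiers, and the four-pixel array size are all assumptions of this illustration; the summary above does not describe the actual data structure. The sketch relies on the complementary coverage of adjacent triangles in a connected primitive, so that their pixel counts can be added without counting any pixel twice.

```python
import math

ARRAY_PIXELS = 4  # hypothetical 2-by-2 pixel array


class EnhancedFullBuffer:
    """Full depth per array plus a hypothetical one-entry partial record."""

    def __init__(self, num_arrays):
        self.full_depth = [math.inf] * num_arrays
        # Partial entry: (strip_id, triangle_index, pixels, farthest) or None.
        self.partial = [None] * num_arrays

    def record(self, array_index, strip_id, triangle_index,
               pixels_covered, farthest_depth):
        total, farthest = pixels_covered, farthest_depth
        previous = self.partial[array_index]
        # Combine only with the immediately preceding triangle of the same
        # connected primitive, whose coverage is assumed complementary.
        if (previous is not None and previous[0] == strip_id
                and previous[1] == triangle_index - 1):
            total += previous[2]
            farthest = max(farthest, previous[3])
        if total == ARRAY_PIXELS:
            self.full_depth[array_index] = min(
                self.full_depth[array_index], farthest)
            self.partial[array_index] = None
        else:
            self.partial[array_index] = (
                strip_id, triangle_index, pixels_covered, farthest_depth)


buffer = EnhancedFullBuffer(num_arrays=2)
buffer.record(0, strip_id=7, triangle_index=0,
              pixels_covered=1, farthest_depth=2.0)
depth_after_first = buffer.full_depth[0]
buffer.record(0, strip_id=7, triangle_index=1,
              pixels_covered=3, farthest_depth=2.5)
depth_after_pair = buffer.full_depth[0]
```

In this usage neither triangle fills the array alone, but the pair together marks it full at the farther of the two depths. Triangles from different connected primitives are never combined, since nothing guarantees that their coverage is complementary.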