1. The Field Of The Invention
This invention relates generally to generating two and three dimensional synthetic or virtual reality environments. More specifically, the present invention provides a new method for improving rendering performance by a computer's graphics or display circuitry. By balancing the tasks of geometric computations and pixel rendering, the present invention combines the advantages of two distinct surface-removal strategies to obtain substantially faster graphics rendering.
2. The State Of The Art
Creating and manipulating a computer-generated synthetic or virtual environment in real-time can require substantial computing resources. Many factors influence how successfully the synthetic environment duplicates a desired reality. For example, one factor is the level of scene complexity of the desired synthetic environment. Another factor is the speed at which the synthetic environment must be rendered.
Techniques have been developed for the purpose of enhancing the synthetic environment experience. Enhancement of the synthetic environment is desirable for such purposes as generating a more complex and thus more realistic synthetic environment, or increasing the speed of the computer system to obtain a better real-time experience. However, these techniques for synthetic environment enhancement can be applied regardless of an application's requirements. Accordingly, these techniques might generate the same synthetic environment while demanding fewer computer resources, or they might generate a more complex synthetic environment utilizing the same computer resources.
Of particular interest is the generation of a synthetic environment in a general purpose computer. As computer technology becomes more powerful, many performance problems that were once solved with dedicated hardware are now being solved utilizing general purpose computers. However, one aspect of displaying a synthetic environment that still often requires dedicated hardware acceleration is that of rendering two and three dimensional computer graphics. Many computer programs have graphics demands which require extremely high image rendering rates. Some applications, such as simulation in a synthetic environment, even demand real-time rendering at 30 or even 60 frames per second. Nevertheless, even applications which can perform adequately utilizing a software solution will always perform better with faster image rendering.
As the processing power of computers increases, so do the expectations of the user. With respect to computer generated synthetic environments, the user's expectations are that scene complexity, and hence realism, will continue to increase. It is also expected that these more complex scenes will still be rendered at 30 to 60 frames per second (fps). The advantages of a more realistic environment combined with consistent real-time performance thus extend far beyond the entertainment of a mere game. For example, military applications reap substantial benefits from being able to train and test soldiers without the expense of using actual equipment and live ammunition.
Before a synthetic image can be rendered on a computer display, the scene must first be modeled in a format that the computer can utilize. Modeling techniques and tools describe the synthetic environment with primitives which can be rendered by the graphics hardware. Renderable primitives often include such things as dots, lines and triangles. Sometimes, higher order surfaces such as polygons, meshes, strips, and surface patches are also supported. These are the lowest order of building blocks or primitives which can be put together to create a synthetic object or scene. Regardless of the method used to model the synthetic environment, the graphics hardware (or software) must convert those primitives into an array of pixels which can be drawn on a display device. The process of converting modeled primitives into visible pixels on the computer display is the rendering process.
As the complexity of the synthetic environment increases, the demand on the rendering process likewise increases. As mentioned, dedicated hardware known as a graphics accelerator is often used to improve the rendering performance for the more complex synthetic environments. However, this rendering hardware can become quite expensive because of the vast number of mathematical calculations that must be performed for each pixel. For software based solutions, the process can still be "expensive" in the sense that rendering time can become intolerably slow. This "expense" will be referred to often, and should be assumed to include the actual cost in money of accelerating hardware, or the cost in time when the solution is accomplished purely in software.
Another useful term which will also be referred to is a frame buffer. To render flicker-free synthetic images on a computer display, a double buffered frame buffer is typically used. Once the image has been rendered into one of the buffers, the buffers can be swapped, allowing the image to be displayed on the screen while the next frame image is being rendered into the other buffer.
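The buffer swap described above can be illustrated with a minimal sketch. The class name DoubleBuffer and its methods are hypothetical and chosen only for illustration; an actual frame buffer would reside in display memory rather than in nested lists.

```python
class DoubleBuffer:
    """Two same-sized pixel buffers: one displayed, one being rendered into."""

    def __init__(self, width, height, clear_color=0):
        self.buffers = [
            [[clear_color] * width for _ in range(height)],
            [[clear_color] * width for _ in range(height)],
        ]
        self.front = 0  # index of the buffer currently shown on the display

    @property
    def back(self):
        # Index of the buffer that is currently being rendered into.
        return 1 - self.front

    def draw_pixel(self, x, y, color):
        # Rendering only ever touches the back buffer.
        self.buffers[self.back][y][x] = color

    def displayed(self):
        return self.buffers[self.front]

    def swap(self):
        # The finished back buffer becomes the displayed buffer.
        self.front = self.back


fb = DoubleBuffer(4, 2)
fb.draw_pixel(1, 0, 7)
before_swap = fb.displayed()[0][1]  # the display still shows the old image
fb.swap()
after_swap = fb.displayed()[0][1]   # the newly rendered pixel is now visible
```

Because drawing is confined to the buffer that is not being displayed, a partially rendered frame never reaches the screen.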
For a given computer display size, the number of frame buffer pixels which must be displayed remains constant. However, the number of pixels which must be processed in order to fill the frame buffer is highly dependent upon the complexity of the synthetic scene. The ratio of the number of rendered pixels relative to the number of displayed pixels is known as the average pixel depth complexity. This ratio is an indication, on average, of how many primitives cover each pixel on the screen. The peak depth complexity would indicate the number of primitives which cover the pixel which is "touched" the most. These depth complexity values indicate how much effort goes into creating each synthetic image, and the values vary greatly depending on the modeled environment and the current viewer's position within that environment.
For example, when rendering a region of mountainous terrain covered with trees as viewed from above, the average depth complexity will typically lie somewhere between one and two. The peak depth complexity may be two. Pixels only displaying the terrain require touching just once, whereas pixels covered by a tree require two touches, one for the tree and one for the terrain. If the viewer's position is now moved down within the trees with a line of sight towards the horizon, the depth complexity numbers will increase dramatically. This is because in the line of sight, any single pixel might have a hundred trees all lined up with the user's point of view. Thus, if the forest is quite dense, the average depth complexity may go up into the tens, or even higher. Also, the complexity of the synthetic model affects depth complexity. The more complex the environment, the greater the average depth complexity.
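As an illustrative sketch of these two measures, the following computes the average and peak depth complexity from a grid that records how many times each pixel was touched. The function name and the small 2 by 4 grid of touch counts are hypothetical values assumed for the example, not data from any actual scene.

```python
def depth_complexity(touch_counts):
    """Return (average, peak) depth complexity for per-pixel touch counts."""
    flat = [count for row in touch_counts for count in row]
    displayed = len(flat)  # pixels shown on the display
    rendered = sum(flat)   # pixels processed to fill the frame buffer
    return rendered / displayed, max(flat)


# Hypothetical view from above: 1 = terrain only, 2 = tree over terrain.
touches = [
    [1, 2, 1, 1],
    [2, 1, 1, 2],
]
average, peak = depth_complexity(touches)  # average 1.375, peak 2
```

Here eleven pixels are rendered to fill eight displayed pixels, so the average lies between one and two, consistent with the overhead view of the forest described above.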
In order to reduce the amount of hardware (or time) required to render a synthetic scene, or to enable more complex rendering with the same hardware (or within the same amount of time), a technique is needed which optimizes this synthetic scene rendering process. It would be an advantage if this technique were independent of the modeling process and primitive objects, and instead focused directly on the pixel rendering process.
Many computer graphics rendering systems in use today utilize a brute-force approach to convert modeled primitives into pixels. In other words, each primitive is taken, one at a time, and projected from three dimensional model coordinates (when the synthetic environment is three dimensional) into a two dimensional frame buffer memory. Part of this projection process includes calculating which pixels within the frame buffer are potentially touched by the primitive. Computing which pixels are touched is a process known as scanning. As each pixel is selected by this scanning process, the color of the pixel must be computed. Computing the color can be very complex if sophisticated lighting algorithms and textures are being used. Typical factors contributing to the pixel's color include the following: the modeled color, light sources shining on the primitive, texture, anti-aliasing, and visibility conditions.
In addition to computing the color of the pixel, a means must be provided for determining which primitive in the synthetic scene should be visible for any given pixel (or sub-pixel if anti-aliasing techniques are employed). This process is often referred to as hidden surface removal (i.e. removing all of the surfaces or primitives which are hidden by other surfaces which are closer to the observer). There are some common hidden surface removal techniques. These techniques include the painter's algorithm, list-priority algorithms, scan-line algorithms, and the z-Buffering (or depth buffering) algorithm. Each of these techniques has distinct advantages and disadvantages. Some of these techniques require the database to be rendered in a particular order, while others are order independent.
Unfortunately, rendering synthetic objects in a correct order on a computer display is not a trivial task. Determining which synthetic objects are in the foreground and which are in the background, and then displaying them in the proper order requires numerous calculations.
The painter's algorithm derives its name from a common method of painting a picture. First, the background is painted, then objects which are less distant are painted, covering the more distant background, until finally the last objects painted are those which are closest in the foreground. Likewise, the painter's algorithm for hidden surface removal uses a similar approach. First, the distance of each object from the viewer is determined, then each object is drawn, beginning with the furthest and working toward the object which is closest in the foreground. While this method solves the problem of hidden surface removal, it raises several problems. First, sorting the primitives into the proper order in which the polygons must be drawn is not a trivial matter. Second, a lot of time is wasted drawing objects which may be largely obscured when the foreground objects are rendered. Third, objects which are inter-penetrating cannot be properly rendered.
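A minimal sketch of the painter's algorithm follows, under simplifying assumptions: the display is a single row of pixels, and each object is a hypothetical record covering pixels x0 up to (but not including) x1 at one constant depth. A count of pixel writes is kept to show the wasted drawing effort noted above.

```python
def painters_algorithm(objects, width):
    """Draw objects far-to-near; return (framebuffer, pixel_writes)."""
    framebuffer = [None] * width
    writes = 0
    # Sort by distance from the viewer, farthest object first.
    for obj in sorted(objects, key=lambda o: o["depth"], reverse=True):
        for x in range(obj["x0"], obj["x1"]):
            framebuffer[x] = obj["color"]  # nearer objects overwrite farther ones
            writes += 1
    return framebuffer, writes


scene = [
    {"color": "tree", "depth": 2.0, "x0": 2, "x1": 5},
    {"color": "sky", "depth": 100.0, "x0": 0, "x1": 8},
    {"color": "hill", "depth": 10.0, "x0": 0, "x1": 6},
]
image, writes = painters_algorithm(scene, 8)
```

In this example seventeen pixel writes are performed to produce eight visible pixels. Because each object carries only one depth value, the sketch also shows why inter-penetrating objects cannot be ordered correctly by this method.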
An alternative state of the art method for rendering objects on a display is the z-Buffer algorithm. This algorithm determines which points on which polygon primitives are closest to the viewer for every pixel (or sub-pixel) on the computer display. It requires that the programmer set aside extra frame buffer memory to store the z (or depth) value for each pixel (or sub-pixel). Every time a point on the surface of the polygon is drawn into the frame buffer, the z coordinate of that point is placed into this array. If the z coordinate in the buffer is closer than that of the new point, the new pixel is not drawn because that point would be farther away than the old point and is therefore part of a hidden surface. If the z coordinate in the buffer is further than that of the new point, the new pixel is drawn over the old one and the z coordinate of the new point is put in the buffer, replacing the old one.
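The depth comparison just described may be sketched as follows. The example assumes a single row of pixels and hypothetical primitives whose depth varies linearly from z0 at pixel x0 to z1 at pixel x1, so that two inter-penetrating primitives can be shown; smaller z values are taken to be closer to the viewer.

```python
import math


def zbuffer_render(spans, width):
    """Render spans in any order, keeping the closest surface per pixel."""
    color = [None] * width
    depth = [math.inf] * width  # extra memory holding one z value per pixel
    for span in spans:
        x0, x1 = span["x0"], span["x1"]
        for x in range(x0, x1):
            # Interpolate the depth of this point on the primitive.
            t = (x - x0) / (x1 - x0)
            z = span["z0"] + t * (span["z1"] - span["z0"])
            if z < depth[x]:  # new point is closer than the stored point
                depth[x] = z
                color[x] = span["color"]
            # Otherwise the new point is hidden and is not drawn.
    return color, depth


# Two primitives that cross each other in depth.
a = {"color": "A", "x0": 0, "x1": 4, "z0": 1.0, "z1": 5.0}
b = {"color": "B", "x0": 0, "x1": 4, "z0": 4.5, "z1": 0.5}
image_ab, _ = zbuffer_render([a, b], 4)
image_ba, _ = zbuffer_render([b, a], 4)
```

Both drawing orders yield the same image, with primitive A visible on the left and primitive B visible on the right, which illustrates that the result does not depend on the order in which primitives are rendered.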
The z-Buffer algorithm has at least two drawbacks: time and memory. To implement the z-Buffer algorithm, it is necessary to keep track of a z coordinate for each pixel, and then do a comparison and a branch operation for every pixel. A looping procedure that uses so many branching operations is difficult for a microprocessor to pipeline, leaving it much slower than a simple and tight rendering loop.
Nevertheless, the z-Buffer algorithm has a rather large advantage over just about all other known methods of hidden surface removal. As new polygon primitives are added to the synthetic scene, the amount of time consumed by the algorithm increases linearly, not exponentially. Therefore, doubling the number of polygon primitives in the polygon list results in the time required to perform the z-Buffer algorithm also doubling (on the average). With other algorithms, the time may quadruple.
Other algorithms also typically require special modeling strategies and unique support structures in order to render the image properly. The z-Buffer approach eliminates most of these constraints, thereby simplifying the modeling process. Furthermore, complex situations such as inter-penetrating objects are correctly handled. Using the z-Buffer algorithm, the visible surface at each pixel is simply the primitive with the closest z value. As each primitive is rendered, this z parameter can be computed for each pixel touched. Along with storing the color of the pixel, the frame buffer is expanded to also store the z depth. As each new primitive is processed, the new z depth can be compared with the one already in the frame buffer. The frame buffer then simply keeps whichever primitive is closest to the observer. FIG. 1 shows a basic flowchart of a z-Buffered system.
One of the major disadvantages of the z-Buffer is that all of the color shading calculations are performed before the depth test is done, and the pixel may then be discarded by the frame buffer circuit. This requires a lot of expensive (or time consuming) calculations to be performed with no final contribution to the image on the screen.
FIG. 1 shows that the relevant steps are as follows. In step 10, a database structure is evaluated. In step 12 geometric transformations are carried out. The next steps 14 and 16 are to accomplish pixel scanning and then pixel shading. Steps 18 and 20 are repeated in a loop to accomplish hidden surface removal while each pixel of the frame buffer is analyzed. When the scene is completely rendered to the frame buffer, it is then moved to an output side of the frame buffer memory where the completed synthetic image appears on the computer display in step 22.
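The cost of shading before the depth test can be made concrete with a short sketch of the per-pixel portion of such a pipeline. The function shade is a hypothetical stand-in for an expensive lighting and texture computation, the primitives are assumed to have constant depth, and the mapping of code lines to the numbered steps of FIG. 1 is an assumption made for illustration.

```python
import math


def shade(prim, x):
    """Stand-in for an expensive lighting and texture computation."""
    return prim["color"]


def render_zbuffered(primitives, width):
    """Shade every scanned pixel, then depth test; return (image, shaded)."""
    color = [None] * width
    depth = [math.inf] * width
    shaded = 0
    for prim in primitives:
        for x in range(prim["x0"], prim["x1"]):  # pixel scanning
            pixel_color = shade(prim, x)          # pixel shading comes first
            shaded += 1
            if prim["z"] < depth[x]:              # depth test comes afterwards
                depth[x] = prim["z"]
                color[x] = pixel_color
            # A failed test discards work that was already paid for.
    return color, shaded


layers = [
    {"color": "near", "z": 1.0, "x0": 0, "x1": 4},
    {"color": "mid", "z": 2.0, "x0": 0, "x1": 4},
    {"color": "far", "z": 3.0, "x0": 0, "x1": 4},
]
image, shaded = render_zbuffered(layers, 4)
wasted = shaded - len(image)
```

With three layers covering four pixels, twelve shading calculations are performed although only four contribute to the final image.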
Some of the other state of the art hidden surface removal techniques have enabled more cost effective architectures to be developed. For example, with the list-priority approach, the primitives are rendered in a front-to-back order. By recording which pixels (or arrays of pixels) are filled up by the primitives as they are rendered, later primitives can be tested against this record in order to not waste time processing the primitive against pixels which are already full. Consequently, fairly simple structures can be built to maintain and test against this full record, thus throwing out pixels before the expensive color shading calculations are performed. Thus, even though the depth complexity of the synthetic scene may be quite high, many of the pixels which would otherwise be thrown away after processing are simply skipped. This list-priority approach is shown in FIG. 2.
In FIG. 2, the first two steps 24 and 26 are the same as steps 10 and 12 of the z-Buffer algorithm. However, step 28 is the step of pixel scanning. Then a loop begins with steps 30, 32 and 34 where the technique carries out hidden surface removal, pixel shading and drawing to the frame buffer, respectively. As regions of the frame buffer are filled, data is fed back to the hidden surface removal section to prevent more distant primitives from being processed further. Finally, step 36 transfers the frame buffer to display memory.
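A minimal sketch of the list-priority loop follows, assuming a single row of pixels, fully opaque primitives, and a per-pixel full record; the names used are hypothetical. The primitives must already be supplied in front-to-back priority order, which is the modeling burden this approach imposes.

```python
def render_list_priority(primitives_front_to_back, width):
    """Skip pixels already marked full; return (image, shaded)."""
    color = [None] * width
    full = [False] * width  # the full record, one flag per pixel
    shaded = 0
    for prim in primitives_front_to_back:
        for x in range(prim["x0"], prim["x1"]):  # step 28: pixel scanning
            if full[x]:                           # step 30: hidden surface removal
                continue                          # skipped before any shading
            pixel_color = prim["color"]           # step 32: pixel shading
            shaded += 1
            color[x] = pixel_color                # step 34: draw to frame buffer
            full[x] = True                        # feedback to the full record
    return color, shaded


layers = [
    {"color": "near", "x0": 0, "x1": 4},
    {"color": "mid", "x0": 0, "x1": 4},
    {"color": "far", "x0": 0, "x1": 4},
]
image, shaded = render_list_priority(layers, 4)
```

For the three overlapping layers in this example, only four pixels are shaded, one per displayed pixel, because the hidden pixels of the two more distant layers are rejected before shading.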
One major disadvantage of the list-priority approach shown in FIG. 2 is that primitives must be modeled in such a way as to guarantee that they can be sorted into priority order. In some cases, this can be extremely difficult to do. This technique, like the painter's algorithm, does not support the notion of inter-penetrating primitives. Accordingly, the synthetic environments for which this technique can be used might be limited.
In general, the various hidden surface removal techniques provide either an efficient rendering architecture at the expense of complex modeling (e.g. the list-priority approach), or they simplify the modeling process at the expense of rendering efficiency (e.g. the z-Buffer algorithm).
Some recent systems have combined the "sort and record" schemes used previously by list-priority machines with the distinct modeling advantages of z-Buffered systems. This approach works well, but it is extremely expensive. First, large database sorting mechanisms are utilized to get the primitives in approximately a front-to-back order. The z-Buffer then performs the final sorting of primitives that may have been too close for the earlier sorting process. The simple full buffer used by the list-priority architectures is replaced with a more complex depth based record. As each array of pixels becomes full (i.e. fully covered by one or more fully opaque primitives), the furthest depth within the array is stored in the full record. Then, as new primitives are about to be rendered, their closest depth is compared with the record for all pixel regions that might need processing. If the new primitive's depth is further than that recorded in the full buffer, then that particular array of pixels need not be rendered.
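The depth based full record may be sketched as follows, under several stated assumptions: a single row of pixels divided into arrays of four pixels (the constant REGION is hypothetical), a display width that is a multiple of the array size, fully opaque primitives of constant depth so that the closest depth of a primitive is simply its z value, and primitives supplied in approximately front-to-back order.

```python
import math

REGION = 4  # pixels per array (hypothetical size)


def render_with_depth_record(primitives, width):
    """Return (image, shaded, skipped_regions); width is a multiple of REGION."""
    color = [None] * width
    depth = [math.inf] * width
    # Furthest depth of each full array; infinity means "not yet full".
    full_record = [math.inf] * (width // REGION)
    shaded = 0
    skipped_regions = 0
    for prim in primitives:
        first = prim["x0"] // REGION
        last = (prim["x1"] - 1) // REGION
        for r in range(first, last + 1):
            if prim["z"] > full_record[r]:
                skipped_regions += 1  # entirely hidden: no scanning or shading
                continue
            lo = max(prim["x0"], r * REGION)
            hi = min(prim["x1"], (r + 1) * REGION)
            for x in range(lo, hi):
                shaded += 1
                if prim["z"] < depth[x]:  # the z-Buffer does the final sorting
                    depth[x] = prim["z"]
                    color[x] = prim["color"]
            region = depth[r * REGION:(r + 1) * REGION]
            if all(z != math.inf for z in region):
                full_record[r] = max(region)  # array is full: store furthest depth
    return color, shaded, skipped_regions


scene = [
    {"color": "wall", "z": 1.0, "x0": 0, "x1": 4},
    {"color": "tree", "z": 2.0, "x0": 2, "x1": 6},
    {"color": "hill", "z": 5.0, "x0": 0, "x1": 8},
    {"color": "sky", "z": 9.0, "x0": 0, "x1": 8},
]
image, shaded, skipped_regions = render_with_depth_record(scene, 8)
```

In this example ten pixels are shaded, whereas shading every covered pixel of every primitive would require twenty-four. The sketch omits the database sorting mechanisms and the per-primitive minimum and maximum depth calculations that a real system of this kind would require.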
The database sorting memories and controllers, the minimum and maximum depth calculations, and the depth based full buffer all add substantially to the cost of the system. The advantages gained by such an approach are particularly of value for applications requiring true real-time performance since the rendering load will be much more level than on a system without such capabilities. Without a means to skip filled regions, the rendering load will be directly proportional to the depth complexity of the synthetic scene. By employing these "full schemes," the rendering load is more directly tied to the display's resolution and not so much to the orientation of the database. Unfortunately, this approach to combining a z-Buffer and a full buffer is far too costly for mainstream computer graphics systems.
Consequently, it would be an advantage over the prior art to provide a technique which eliminates the need for large and costly memories and control structures. It would be a further advantage to be able to build the full buffer and associated control logic circuits directly inside custom integrated circuits. Accordingly, it would be an advantage to greatly reduce the cost and complexity of such circuits and thereby make them more affordable for low cost, more mainstream, computer graphics hardware. Therefore, it would be an advantage to provide real-time synthetic scene rendering in computer systems costing much less than past computer systems.