Three-dimensional Computer Graphics
Computer graphics is the art and science of generating pictures with a computer. Generation of pictures is commonly called rendering. Generally, in three-dimensional computer graphics, geometry that represents surfaces (or volumes) of objects in a scene is translated into pixels stored in a frame buffer, and then displayed on a display device, such as a CRT.
Pixels may have a direct one-to-one correspondence with physical display device hardware, but this is not always the case. Some three-dimensional graphics systems reduce aliasing with a frame buffer that has multiple pixels per physical display picture element. Other 3D graphics systems, in order to reduce the rendering task, have multiple physical display picture elements per pixel. In this document, "pixel" refers to the smallest individually controllable element in the frame buffer, independent of the physical display device. The display screen 100 is defined as the two-dimensional array of pixels which makes a picture. Display screens 100 can be almost any size. This document uses, as a numerical example for various pixel organizations, a very small display screen 100 of 120×80 pixels.
When a piece of 3D geometry is projected onto a display screen 100, it affects a set of pixels in the Frame Buffer 1012. In the context of a particular piece of geometry, the term "pixel" is used to describe one small portion of the projected piece of geometry which has a one-to-one correspondence with a pixel in the display screen 100.
A summary of the rendering process can be found in: "Fundamentals of Three-dimensional Computer Graphics", by Watt, Chapter 5: The Rendering Process, pages 97 to 113, published by Addison-Wesley Publishing Company, Reading, Mass., 1989, reprinted 1991, ISBN 0-201-15442-0 (hereinafter referred to as the Watt Reference).
An example of a hardware renderer is incorporated herein by reference: "Leo: A System for Cost Effective 3D Shaded Graphics", by Deering and Nelson, pages 101 to 108 of SIGGRAPH 93 Proceedings, 1-6 Aug. 1993, Computer Graphics Proceedings, Annual Conference Series, published by ACM SIGGRAPH, New York, 1993, Softcover ISBN 0-201-58889-7 and CD-ROM ISBN 0-201-56997-3 (hereinafter referred to as the Deering Reference). The Deering Reference describes a generic 3D graphics pipeline (i.e., a renderer, or a rendering system) as "truly generic, as at the top level nearly every commercial 3D graphics accelerator fits this abstraction", and this pipeline diagram is reproduced here as FIG. 1. Such pipeline diagrams convey the process of rendering, but do not describe any particular hardware. The Generic 3D Graphics Pipeline 1000 has two sections highlighted, the floating-point intensive functions 1020 and the drawing intensive functions performed by a Pixel Drawing Pipeline 4000. In this document, the term "pixel drawing pipeline" refers to a subset of a 3D graphics pipeline, and it includes everything after the screen space conversion 1003 step up to and including the Z-buffered blend 1010 step. The Pixel Drawing Pipeline 4000 method is shown in a flow diagram in FIG. 4.
The Pixel Drawing Pipeline 4000 in FIG. 1 is implemented in hardware by a Pixel Drawing Subsystem 2002, and a simple block diagram is shown in FIG. 2. The Pixel Drawing Subsystem 2002 includes a conventional, prior art, Z-buffer 2008 and a Prior Art Rasterize Processor 2010. As defined here, the Prior Art Rasterize Processor 2010 performs: 1) set up for incremental render 1004; 2) edge walking 1006; 3) span interpolation 1008; and 4) Z-buffered blend 1010. Some manufacturers do not include the set up for incremental render 1004 as part of the Prior Art Rasterize Processor 2010. However, the set up for incremental render 1004 step is included here because the present invention adds a new step before it, and both steps are included in the novel pixel drawing pipelines presented here. The Prior Art Rasterize Processor 2010 performs the Z-buffered blend 1010 by accessing the Z-buffer 2008 over a bus, labelled ZValues 2006 in FIG. 2. As the Prior Art Rasterize Processor 2010 generates new pixel color values, they are written into the frame buffer 1012 by utilizing the busses PixelColor 2012 and PixelAddr 2016.
In computer graphics, each renderable object generally has its own local object coordinate system, and therefore needs to be translated (that is, transformed) from object coordinates to pixel display coordinates. Conceptually, this is a 4-step process: 1) translation (including scaling for size enlargement or shrink) from object coordinates to world coordinates, which is the coordinate system for the entire scene; 2) translation from world coordinates to eye coordinates, based on the viewing point of the scene; 3) translation from eye coordinates to perspective translated eye coordinates, where perspective scaling (farther objects appear smaller) has been performed; and 4) translation from perspective translated eye coordinates to pixel coordinates, also called screen coordinates. These translation steps can be compressed into one or two steps by precomputing the appropriate composite transformation matrices before any translation occurs.
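The precomputation described above can be illustrated with a minimal sketch. The helper names (mat_mul, apply, scale, translate) and the example matrices are assumptions made for illustration only; they do not come from any reference cited in this document.

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(m, p):
    """Apply a 4x4 matrix to a homogeneous point (x, y, z, w)."""
    return tuple(sum(m[i][k] * p[k] for k in range(4)) for i in range(4))

def scale(s):
    return [[s, 0, 0, 0], [0, s, 0, 0], [0, 0, s, 0], [0, 0, 0, 1]]

def translate(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

# Hypothetical example matrices (not taken from the source):
# object -> world enlarges the object and moves it; world -> eye moves it again.
object_to_world = mat_mul(translate(10, 0, 0), scale(2))
world_to_eye = translate(0, 0, 5)

# Precompute one composite matrix before any vertex is processed.
object_to_eye = mat_mul(world_to_eye, object_to_world)

vertex = (1, 1, 1, 1)
step_by_step = apply(world_to_eye, apply(object_to_world, vertex))
composite = apply(object_to_eye, vertex)
```

Both paths give the same point, so each vertex needs only one matrix multiplication once the composite matrix has been computed.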
FIG. 3 shows a three-dimensional object, a tetrahedron, with its own coordinate axes (x_obj, y_obj, z_obj). The three-dimensional object 3010 is translated, scaled, and placed in the viewing point's 3030 coordinate system based on (x_eye, y_eye, z_eye). The object 3020 is projected onto the viewing plane, thereby correcting for perspective. At this point, the object appears to have become two-dimensional; however, the object's z-coordinates are preserved so they can be used later for hidden surface removal techniques. The object is finally translated to screen coordinates, based on (x_screen, y_screen, z_screen), where z_screen is going perpendicularly into the page. Points on the object now have their x and y coordinates described by pixel location within the display screen and their z coordinates in a scaled version of distance from the viewing point.
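As a rough sketch of the last two steps, the following function projects a point from eye coordinates onto the viewing plane and maps it to the 120×80 example display screen while carrying the z-coordinate along. The function name and the view parameters d, half_width, and half_height are assumptions chosen for illustration, not values from FIG. 3.

```python
WIDTH, HEIGHT = 120, 80  # the small example display screen

def eye_to_screen(x_eye, y_eye, z_eye, d=1.0, half_width=1.5, half_height=1.0):
    """Project an eye-space point to (x_screen, y_screen, z).

    d is the assumed distance to the viewing plane; half_width and
    half_height are the assumed half extents of the viewing window.
    """
    # Perspective scaling: points with larger z_eye move toward the center.
    x_p = d * x_eye / z_eye
    y_p = d * y_eye / z_eye
    # Map the viewing window to pixel coordinates (y grows downward).
    x_screen = int((x_p + half_width) / (2 * half_width) * WIDTH)
    y_screen = int((half_height - y_p) / (2 * half_height) * HEIGHT)
    # z is kept unchanged so it can be used later for hidden surface removal.
    return x_screen, y_screen, z_eye

near_point = eye_to_screen(1.5, 1.0, 2.0)
far_point = eye_to_screen(1.5, 1.0, 4.0)
```

The same eye-space offset lands closer to the screen center (60, 40) when the point is farther away, which is the perspective effect described above.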
Once the geometry is in screen coordinates, it is rasterized, which is the process of generating actual pixel color values. Many techniques are used for generating pixel color values, including Gouraud shading, Phong shading, and texture mapping. In some systems, the Frame Buffer 1012 is augmented with an A-buffer which is used to reduce aliasing. The A-buffer reduces aliasing by keeping track of the percentage coverage of a pixel by a rendered object, which is relevant for edges of projected objects. Hereinafter, the A-buffer will not be explicitly discussed, but is assumed to be optionally included in any rendering system described herein.
Because many different portions of geometry can affect the same pixel, the geometry representing the surfaces closest to the scene viewing point must be determined. Thus, for each pixel, the closest surface to the viewing point determines the pixel color value, and the other more distant surfaces which could affect the pixel are hidden and are prevented from affecting the pixel. An exception to this rule occurs when non-opaque surfaces are rendered, in which case all non-opaque surfaces closer to the viewing point than the closest opaque surface affect the pixel color value, while all other non-opaque surfaces are discarded. In this document, the term "occulted" is used to describe geometry which is 100% hidden by other opaque geometry.
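The rule for non-opaque surfaces can be sketched for a single pixel as follows. The fragment representation (z, color, alpha) and the back-to-front blend formula are assumptions made for illustration; this document does not prescribe a particular blending equation.

```python
def resolve_pixel(fragments, background=0.0):
    """Return the gray-level color of one pixel.

    fragments is a list of (z, color, alpha) tuples; alpha == 1.0 is
    treated as opaque. The blend used here is an assumed, commonly used
    formula, not one taken from this document.
    """
    opaque_z = [z for z, _, alpha in fragments if alpha == 1.0]
    nearest_opaque = min(opaque_z) if opaque_z else float("inf")
    # Keep the closest opaque surface and every non-opaque surface in front of it.
    kept = [f for f in fragments
            if f[0] < nearest_opaque or (f[0] == nearest_opaque and f[2] == 1.0)]
    color = background
    for z, c, alpha in sorted(kept, reverse=True):  # back to front
        color = alpha * c + (1.0 - alpha) * color
    return color

fragments = [
    (5.0, 1.0, 1.0),   # closest opaque surface
    (2.0, 0.0, 0.5),   # non-opaque, in front: affects the pixel
    (9.0, 0.5, 0.5),   # non-opaque, behind the opaque surface: discarded
    (7.0, 0.25, 1.0),  # opaque but farther away: hidden
]
pixel_color = resolve_pixel(fragments)
```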
As a rendering process proceeds, the renderer must often recompute the color value of a given screen pixel multiple times, because there may be many surfaces that intersect the volume subtended by the pixel. The average number of times a pixel needs to be rendered, for a particular scene, is called the depth complexity of the scene. Simple scenes have a depth complexity near unity, while complex scenes can have a depth complexity of ten or twenty. As scene models become more and more complicated, renderers will be required to process scenes of ever increasing depth complexity.
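As a small worked example, depth complexity can be estimated by counting how many times each pixel of the display screen is covered. The function below is a sketch that assumes, purely for simplicity, that every piece of geometry is an axis-aligned rectangle in screen coordinates.

```python
def depth_complexity(width, height, rectangles):
    """Average number of times a pixel is rendered.

    rectangles is a list of (x0, y0, x1, y1) screen-space rectangles with
    exclusive upper bounds; rectangles are an assumed stand-in for
    projected geometry.
    """
    counts = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in rectangles:
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                counts[y][x] += 1
    return sum(map(sum, counts)) / (width * height)

# Two full-screen surfaces plus one covering the left half of a 120x80 screen.
example = depth_complexity(120, 80, [(0, 0, 120, 80), (0, 0, 120, 80), (0, 0, 60, 80)])
```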
Many techniques have been developed to perform visible surface determination, and a survey of these techniques is incorporated herein by reference to: "Computer Graphics: Principles and Practice", by Foley, van Dam, Feiner, and Hughes, Chapter 15: Visible-Surface Determination, pages 649 to 720, 2nd edition published by Addison-Wesley Publishing Company, Reading, Mass., 1990, reprinted with corrections 1991, ISBN 0-201-12110-7 (hereinafter referred to as the Foley Reference).
When a point on a surface (frequently a polygon vertex) is translated to screen coordinates, the point has three coordinates: 1) the x-coordinate of the affected pixel; 2) the y-coordinate of the affected pixel; and 3) the z-coordinate of the point in either eye coordinates, distance from the virtual screen, or some other coordinate system which preserves the relative distance of surfaces from the viewing point. In this document, positive z-coordinate values are used for the "look direction" from the viewing point, and smaller values indicate a position closer to the viewing point.
For example, if a surface is approximated by a set of planar polygons, the vertices of each polygon are translated to screen coordinates. For points in or on the polygon (other than the vertices), the screen coordinates are interpolated from the coordinates of vertices, typically by the processes of edge walking and span interpolation, as discussed in the Deering Reference. Thus, a z-coordinate value is included in each pixel value (along with the color value) as geometry is rendered.
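The interpolation of z-coordinates along one span can be sketched as below. This is a simplified illustration of linear span interpolation only; the function name and argument layout are assumptions, and the code is not taken from the Deering Reference.

```python
def interpolate_span(x_left, z_left, x_right, z_right):
    """Return (x, z) for each pixel of one scan-line span, end points included."""
    if x_right == x_left:
        return [(x_left, z_left)]
    # Constant z increment per pixel step along the span.
    dz = (z_right - z_left) / (x_right - x_left)
    return [(x, z_left + (x - x_left) * dz) for x in range(x_left, x_right + 1)]

span = interpolate_span(10, 2.0, 14, 4.0)
```

Edge walking would supply the left and right end points for each scan line in the same incremental manner.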
The most common method for visible surface determination, or conversely, for hidden surface removal, is the Z-buffer. Another common hidden surface removal technique is called backface culling (see Foley Reference, page 663), which eliminates polygons from rendering before they are converted into pixels. Backface culling is generally included in the face determination 1003 step of the graphics pipeline 1000; it therefore occurs before, and is complementary to, subsequent hidden surface removal steps.
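One common way to perform backface culling on a triangle is a signed-area test. The sketch below assumes that front-facing triangles are wound counter-clockwise in a coordinate system whose y axis points up; systems with y pointing down, or with the opposite winding convention, would flip the sign.

```python
def is_backfacing(v0, v1, v2):
    """Return True if a 2D-projected triangle faces away from the viewer.

    Assumes counter-clockwise front faces with y pointing up (an
    illustrative convention, not one fixed by this document).
    """
    # Twice the signed area of the triangle (2D cross product of two edges).
    signed_area2 = ((v1[0] - v0[0]) * (v2[1] - v0[1])
                    - (v2[0] - v0[0]) * (v1[1] - v0[1]))
    # Zero-area (edge-on) triangles are culled as well.
    return signed_area2 <= 0

front = is_backfacing((0, 0), (1, 0), (0, 1))
back = is_backfacing((0, 0), (0, 1), (1, 0))
```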
Z-buffers
Stated simply, the Z-buffer stores, for every pixel, the z-coordinate of the pixel within the closest geometry (to the viewing point) that affects the pixel. Hence, as new pixel values are generated, each new pixel's z-coordinate is compared to the corresponding location in the Z-buffer. If the new pixel's z-coordinate is smaller (i.e., closer to the viewing point), this value is stored into the Z-buffer and the new pixel's color value is written into the frame buffer. If the new pixel's z-coordinate is larger (i.e., farther from the viewing point), the frame buffer and Z-buffer values are unchanged and the new pixel is discarded. Method pseudocode for the Z-buffer method is shown in Appendix 1, which is a slightly modified version of FIG. 15.21 in the Foley Reference. The pixel loop A1006-A1013 is performed for every pixel in each polygon.
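The keep/discard decision described above can be sketched for the 120×80 example display screen as follows. This is an illustrative sketch, not the pseudocode of Appendix 1; the names draw_pixel, z_buffer, and frame_buffer are assumptions.

```python
WIDTH, HEIGHT = 120, 80
FAR = float("inf")  # initial value: farther than any geometry

z_buffer = [[FAR] * WIDTH for _ in range(HEIGHT)]
frame_buffer = [[0] * WIDTH for _ in range(HEIGHT)]

def draw_pixel(x, y, z, color):
    """Apply the Z-buffer test to one new pixel; return True if it was kept."""
    if z < z_buffer[y][x]:          # smaller z means closer to the viewing point
        z_buffer[y][x] = z
        frame_buffer[y][x] = color
        return True
    return False                    # farther away: both buffers are unchanged

first = draw_pixel(3, 4, 5.0, 1)    # kept: the buffer was empty
second = draw_pixel(3, 4, 9.0, 2)   # discarded: farther than the stored z
third = draw_pixel(3, 4, 2.0, 3)    # kept: closer than the stored z
```

Every pixel of every polygon passes through this test, which is why the decision cannot be made until the geometry has been converted to pixels.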
A flow diagram of the prior art Z-buffer method is shown in FIG. 4. This figure highlights the portion of the method, called the Pixel Drawing Pipeline method 4000, which rasterizes the polygon. In this document, rasterization refers to the process of converting a piece of renderable geometry into individual pixels.
One drawback to the Z-buffer hidden surface removal method is the requirement for geometry to be converted to pixel values before hidden surface removal can be done. This is because the keep/discard decision is made on a pixel-by-pixel basis, rather than at a higher level, such as at the level of the geometry in screen coordinates, which is accomplished by the present invention.
Prior art Z-buffers are based on conventional Random Access Memory (RAM) or Video RAM (VRAM). High performance prior art Z-buffers employ many different techniques, such as page-mode addressing and bank interleaving, to interrogate as many Z-buffer memory locations per second as possible. The interrogation process is needed to perform the keep/discard decision on a pixel-by-pixel basis as geometry is rasterized. One major drawback to the prior art Z-buffer is its inherently pixel-sequential nature. For scenes with high depth complexity, access to the Z-buffer is a bottleneck which limits performance in renderers.
Temporal Correlation
Many applications of 3D computer graphics generate a sequence of scenes in a frame-by-frame manner. If the frame rate of the sequence is sufficiently high (this is generally the case), then the present scene looks very much like the previous scene, and the only differences are due to movement of objects or light sources within the scene or movement of the viewing point. Thus, consecutive scenes are similar to each other due to their temporal correlation.
Identifying the non-occulted geometry from the previous scene can help with the rendering of the present scene because such non-occulted geometry can be rendered first. Then, when geometry which was occulted in the previous scene undergoes hidden surface removal, most of it can be discarded before pixel color computations need to be done.
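The effect of rendering previously non-occulted geometry first can be illustrated for a single pixel. The sketch below counts how many surfaces pass the Z-buffer test in a given submission order; the scene values and the assumption that the surface visible in the previous scene is still visible are hypothetical.

```python
def count_color_computations(z_values):
    """Count the surfaces that pass the Z-buffer test at one pixel, in order."""
    nearest = float("inf")
    kept = 0
    for z in z_values:
        if z < nearest:
            nearest = z
            kept += 1   # a pixel color value must be computed for this surface
    return kept

# Hypothetical pixel covered by four surfaces, submitted back to front.
scene = [9.0, 7.0, 5.0, 3.0]
# Assumption: the surface at z = 3.0 was non-occulted in the previous scene.
visible_last_frame = 3.0
reordered = [visible_last_frame] + [z for z in scene if z != visible_last_frame]

worst_case = count_color_computations(scene)
with_temporal_hint = count_color_computations(reordered)
```

In this example only one surface passes the test when the previously visible surface is rendered first, compared with four in the back-to-front order.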
Prior art rendering systems do not gain much from taking advantage of temporal correlation because they will only save computations at the very end of the graphics pipeline 1000. Namely, they will save the pixel color computation within the span interpolation step 1008 of the pipeline 1000. This savings is minor because the pixel-by-pixel nature of the Z-buffer hidden surface removal technique requires geometry to be converted to separate pixels before the keep/discard decision can be made. Also, the minor savings is mostly eliminated if the pixel color computation is performed in parallel (by different hardware) with Z-buffer hidden surface removal computation.
On top of this, taking advantage of temporal correlation is difficult in prior art rendering systems because the "backward link" from the final values in the Z-buffer and frame buffer back to the geometry database is difficult to construct. In other words, prior art rendering systems smash geometry into separate and independent pixels, and taking advantage of temporal correlation requires knowing which pieces of geometry generated the pixels which survived the keep/discard decisions when an entire scene has completed the rendering process.
Geometry Databases
The geometry needed to generate a renderable scene is stored in a database. This geometry database can be a simple display list of graphics primitives or a hierarchically organized data structure. In the hierarchically organized geometry database, the root of the hierarchy is the entire database, and the first layer of subnodes in the data structure is generally all the objects in the "world" which can be seen from the viewpoint. Each object, in turn, contains subobjects, which, in turn, contain subsubobjects, thus resulting in a hierarchical "tree" of objects. Hereinafter, the term "object" shall refer to any node in the hierarchical tree of objects. Thus, each subobject is an object. The term "root object" shall refer to a node in the first layer of subnodes in the data structure. Hence, the hierarchical database for a scene starts with the scene root node, and the first layer of objects are root objects.
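A hierarchical geometry database of this kind can be sketched as a simple tree. The Node class and the example scene contents are hypothetical and serve only to illustrate the terms "scene root node", "root object", and "object".

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One node of the hierarchical tree; the class name is an assumption."""
    name: str
    children: list = field(default_factory=list)

def count_objects(node):
    """Count every object below a node (the node itself is not counted)."""
    return sum(1 + count_objects(child) for child in node.children)

# Hypothetical scene: the scene root node holds two root objects.
scene = Node("scene root", [
    Node("car", [Node("wheel", [Node("hubcap")]), Node("door")]),
    Node("tree"),
])

root_objects = [child.name for child in scene.children]
total_objects = count_objects(scene)
```

Here "car" and "tree" are root objects, while "wheel", "hubcap", and "door" are subobjects and therefore also objects.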
Hierarchical databases of this type are used by the Programmer's Hierarchical Interactive Graphics System (PHIGS) and PHIGS PLUS standards. An explanation of these standards can be found in the book, "A Practical Introduction to PHIGS and PHIGS PLUS", by T. L. J. Howard, et al., published by Addison-Wesley Publishing Company, 1991, ISBN 0-201-41641-7 (incorporated herein by reference and hereinafter called the Howard Reference). The Howard Reference describes the hierarchical nature of 3D models and their data structure on pages 5 through 8.
Content Addressable Memories
Most Content Addressable Memories (CAM) perform a bit-for-bit equality test between an input vector and each of the data words stored in the CAM. This type of CAM frequently provides masking of bit positions in order to eliminate the corresponding bit in all words from affecting the equality test. It is inefficient to perform magnitude comparisons in an equality-testing CAM because a large number of clock cycles is required to do the task.
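The masked equality test can be modeled in software as shown below. The function name and the convention that a set mask bit means "compare this bit position" are assumptions; a hardware CAM would test all stored words simultaneously, whereas this sketch simply loops over them.

```python
def cam_match(words, query, care_mask):
    """Return the indices of stored words equal to query on the bits set in care_mask."""
    # XOR exposes differing bits; masked-out positions cannot cause a mismatch.
    return [i for i, w in enumerate(words) if ((w ^ query) & care_mask) == 0]

stored = [0b1010, 0b1011, 0b0010]
exact = cam_match(stored, 0b1010, 0b1111)          # all four bits compared
ignore_low_bit = cam_match(stored, 0b1010, 0b1110)  # lowest bit masked out
low_bits_only = cam_match(stored, 0b1010, 0b0011)   # only the two lowest bits compared
```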
CAMs are presently used in translation look-aside buffers within virtual memory systems in some computers. CAMs are also used to match addresses in high speed computer networks. CAMs are not used in any practical prior art renderers.
Magnitude Comparison CAM (MCCAM) is defined here as any CAM where the stored data are treated as numbers, and arithmetic magnitude comparisons (i.e., less-than, greater-than, less-than-or-equal-to, etc.) are performed in parallel. This is in contrast to ordinary CAM which treats stored data strictly as bit vectors, not as numbers. An MCCAM patent, included herein by reference, is U.S. Pat. No. 4,996,666, by Jerome F. Duluk Jr., entitled "Content-Addressable Memory System Capable of Fully Parallel Magnitude Comparisons", granted Feb. 26, 1991 (hereinafter referred to as the Duluk Patent). Structures within the Duluk Patent specifically referenced shall include the prefix "Duluk Patent", e.g., "Duluk Patent MCCAM Bit Circuit". MCCAMs are not used in any prior art renderer.
The basic internal structure of an MCCAM is a set of memory bits organized into words, where each word can perform one or more arithmetic magnitude comparisons between the stored data and input data.
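The behavior of an MCCAM, as defined here, can be modeled in software as follows. This is a behavioral sketch only: the class and method names are assumptions, it does not describe the circuits of the Duluk Patent, and the loop stands in for comparisons that the hardware would perform in all words at once.

```python
import operator

class MagnitudeComparisonCAM:
    """Software model of words that each compare stored data with input data."""

    def __init__(self, words):
        self.words = list(words)

    def query(self, value, compare=operator.lt):
        """Return one flag per word: compare(stored_word, value)."""
        return [compare(w, value) for w in self.words]

mccam = MagnitudeComparisonCAM([3, 8, 5])
less_than = mccam.query(5)                    # stored < 5
less_or_equal = mccam.query(5, operator.le)   # stored <= 5
greater_than = mccam.query(5, operator.gt)    # stored > 5
```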