1. The Field of the Invention
The present invention relates generally to graphical rendering devices and systems. Specifically, the invention relates to devices and systems for conducting highly realistic three-dimensional graphical renderings.
2. The Relevant Art
Graphical rendering involves the conversion of one or more object descriptions to a set of pixels that are displayed on an output device such as a video display or image printer. Object descriptions are generally mathematical representations that model or represent the shape and surface characteristics of the displayed objects. Graphical object descriptions may be created by sampling real world objects and/or by creating computer-generated objects using various editors.
In geometric terms, rendering requires representing or capturing the details of graphical objects from the viewer's perspective to create a two-dimensional scene or projection representing the viewer's perspective in three-dimensional space. The two-dimensional rendering facilitates viewing the scene on a display device or means such as a video monitor or printed page.
A primary objective of object modeling and graphical rendering is realism, i.e., a visually realistic representation that is life-like. Many factors impact realism, including surface detail, lighting effects, display resolution, display rate, and the like. Due to the complexity of real-world scenes, graphical rendering systems are known to have an insatiable thirst for processing power and data throughput. Currently available rendering systems lack the performance necessary to make photo-realistic renderings in real time.
To increase rendering quality and reduce storage requirements, surface details are often separated from the object shape and are mapped onto the surfaces of the object during rendering. The object descriptions including surface details are typically stored digitally within a computer memory or storage medium and referenced when needed.
One common method of representing three-dimensional objects involves combining simple graphical objects into a more realistic composite model or object. The simple graphical objects, from which composite objects are built, are often referred to as primitives. Examples of primitives include triangles, surface patches such as Bézier patches, and voxels.
Voxels are volume elements, typically cubic in shape, that represent a finite, three-dimensional space similar to bitmaps in two-dimensional space. Three-dimensional objects may be represented using a primitive comprising a three-dimensional array of voxels. A voxel object is created by assigning a color and a surface normal to certain voxel locations within the voxel array while marking other locations as transparent.
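By way of illustration only, a voxel object of the kind described above might be held in memory as sketched below. The names `Voxel` and `VoxelObject`, the use of `None` to mark a transparent location, and the RGB tuple layout are assumptions made for this example and are not drawn from any particular system.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Color = Tuple[int, int, int]          # assumed RGB layout
Normal = Tuple[float, float, float]   # assumed unit-length surface normal


@dataclass
class Voxel:
    color: Color
    normal: Normal


class VoxelObject:
    """A cubic voxel array; a slot holding None is treated as transparent."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.cells: List[List[List[Optional[Voxel]]]] = [
            [[None for _ in range(size)] for _ in range(size)]
            for _ in range(size)
        ]

    def set(self, x: int, y: int, z: int, color: Color, normal: Normal) -> None:
        # Assign a color and a surface normal to one voxel location.
        self.cells[x][y][z] = Voxel(color, normal)

    def get(self, x: int, y: int, z: int) -> Optional[Voxel]:
        return self.cells[x][y][z]

    def opaque_count(self) -> int:
        # Count the locations that are not marked transparent.
        return sum(v is not None for plane in self.cells for row in plane for v in row)


obj = VoxelObject(4)
obj.set(1, 2, 3, (255, 0, 0), (0.0, 0.0, 1.0))
```

In this sketch every location starts out transparent, and only the locations that are explicitly set carry a color and a surface normal.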
Voxel objects reduce the geometry bandwidth and processing requirements associated with rendering. For example, objects represented with voxels typically have smaller geometry transform requirements than similar objects constructed from triangles. Despite this advantage, existing voxel rendering algorithms are typically complex and extremely hardware intensive. A fast algorithm for rendering voxel objects with low hardware requirements would reduce the geometry processing and geometry bandwidth requirements of rendering by allowing certain objects to be represented by voxel objects instead of many small triangles.
As mentioned, rendering involves creating a two-dimensional projection representing the viewer's perspective in a three-dimensional space. One common method of creating a two-dimensional projection involves performing a geometric transform on the primitives that comprise the various graphical objects within a scene. Performing a geometric transform changes any coordinates representing objects from an abstract space known as a world space into actual device coordinates such as screen coordinates.
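The geometric transform described above may be illustrated with a simple perspective projection. The following is a minimal sketch that assumes a pinhole viewer located at the world-space origin looking along the positive z axis, a focal length expressed in pixels, and a screen origin at the top-left corner. The function name `world_to_screen` and its parameters are hypothetical.

```python
def world_to_screen(point, focal_length, width, height):
    """Project a world-space point (x, y, z) to screen coordinates.

    Assumes the viewer sits at the origin and looks along +z.
    Returns (screen_x, screen_y, z); z is kept for later depth tests.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point is at or behind the viewer")
    screen_x = width / 2 + focal_length * x / z
    screen_y = height / 2 - focal_length * y / z   # screen y grows downward
    return screen_x, screen_y, z


center = world_to_screen((0.0, 0.0, 5.0), 100.0, 640, 480)
offset = world_to_screen((1.0, 1.0, 2.0), 100.0, 640, 480)
```

A point on the viewing axis lands at the center of the screen, and the division by z causes distant points to crowd toward that center.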
After a primitive such as a triangle has been transformed to a device coordinate system, pixels are generated for each pixel location which is covered by that primitive. The process of converting graphical objects to pixels is sometimes referred to as rasterization or pixelization. Texture information may be accessed in conjunction with pixelization to determine the color of each of the pixels. Because more than one primitive may be covering any given location, a z-depth for each pixel generated is also calculated, and is used to determine which pixels are visible to the viewer.
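One possible way to generate covered pixels and their z-depths for a transformed triangle is sketched below. The sketch tests each pixel center against the triangle using edge functions and interpolates depth with barycentric weights. This is one common approach and is not asserted to be the approach of any particular system. The names `edge` and `rasterize_triangle`, and the choice to sample at pixel centers, are assumptions.

```python
def edge(a, b, p):
    # Twice the signed area of triangle (a, b, p), using x and y only.
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize_triangle(v0, v1, v2, width, height):
    """Return (px, py, z) for every pixel whose center the triangle covers.

    Each vertex is (x, y, z) in screen coordinates.
    """
    area = edge(v0, v1, v2)
    if area == 0:
        return []  # degenerate triangle covers nothing
    pixels = []
    for py in range(height):
        for px in range(width):
            p = (px + 0.5, py + 0.5)
            # Barycentric weights; dividing by area makes winding irrelevant.
            w0 = edge(v1, v2, p) / area
            w1 = edge(v2, v0, p) / area
            w2 = edge(v0, v1, p) / area
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                z = w0 * v0[2] + w1 * v1[2] + w2 * v2[2]
                pixels.append((px, py, z))
    return pixels


covered = rasterize_triangle((0, 0, 1.0), (4, 0, 1.0), (0, 4, 1.0), 4, 4)
```

A practical rasterizer would restrict the loops to the bounding box of the triangle; the full-screen scan is retained here only for brevity.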
FIGS. 1a and 1b depict a simplified example of graphical rendering. Referring to FIG. 1a, a graphical object 100 may be rendered by sampling attributes such as object color, texture, and reflectivity at discrete points on the object. The sampled points correspond to device-oriented regions, typically round or rectangular in shape, known as pixels 102. The distance between the sampled points is referred to herein as a sampling interval 104. The sampled attributes, along with surface orientation (i.e., a surface normal), are used to compute a rendered color 108 for each pixel 102. The rendered colors 108 of the pixels 102 preferably represent what a perspective viewer 106 would see from a particular distance and orientation relative to the graphical object 100.
As mentioned, the attributes collected by sampling the graphical object 100 are used to compute the rendered color 108 for each pixel 102. The rendered color 108 differs from the object color due to shading, lighting, and other effects that change what is seen from the perspective of the viewer 106. The rendered color 108 may also be constrained by the selected rendering device. The rendered color may be represented by a set of numbers 110 designating the intensity of each of the component colors of the selected rendering device, such as red, green, and blue on a video display or cyan, magenta, yellow, and black on an inkjet printer.
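As a simplified illustration of how a rendered color may differ from an object color, the sketch below scales an RGB object color by a diffuse (Lambertian) lighting term. The Lambertian model, the function name `shade`, and the assumption that both vectors are unit length are choices made for this example only. The text above does not prescribe any particular shading model.

```python
def shade(object_color, normal, light_dir):
    """Scale an RGB object color by a diffuse lighting term.

    normal and light_dir are assumed to be unit-length 3-vectors, with
    light_dir pointing from the surface toward the light.
    """
    cosine = sum(n * l for n, l in zip(normal, light_dir))
    intensity = max(0.0, cosine)   # surfaces facing away receive no light
    return tuple(round(c * intensity) for c in object_color)


facing = shade((200, 100, 50), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0))
angled = shade((200, 100, 50), (0.0, 0.0, 1.0), (0.0, 0.6, 0.8))
away = shade((200, 100, 50), (0.0, 0.0, 1.0), (0.0, 0.0, -1.0))
```

The returned tuple corresponds to the set of component intensities, here red, green, and blue, that would be sent to a video display.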
As the graphical object 100 is rendered with each frame, the positioning and spacing of the discrete sampling points (i.e., the pixels 102) projected onto the graphical object 100 determine what is seen by the perspective viewer 106. One method of rendering, referred to as ray tracing, involves determining the position of the discrete sampling points by extending a grid 111 of rays 112 from a focal point 114 to find the closest primitive each ray intersects. Since the rays 112 are diverging, the spacing between the rays 112, and therefore the size of the grid 111, increases with increasing distance. Ray tracing, while precise and accurate, is generally not used in real-time rendering systems due to the computational complexity of currently available ray tracing algorithms.
The grid 111, depicted in FIG. 1a, is a set of regularly spaced points corresponding to the pixels 102. The points of the grid 111 lie in an image plane perpendicular to a ray axis 115. The distance of each pixel 102 from a reference plane perpendicular to the ray axis 115, such as the grid 111, is known as the pixel depth or z-depth. The distance or depth of the graphical object 100 changes the level of detail seen by the perspective viewer 106. Relatively distant objects cover a smaller rendering area on the display device, resulting in a reduced number of rays 112 that reach the graphical object 100, and an increased sampling interval 104.
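The growth of the sampling interval with depth follows from similar triangles. The sketch below assumes that adjacent rays are separated by a fixed pixel pitch on an image plane located a known distance from the focal point, and that the interval is measured in a plane perpendicular to the ray axis. The function name `sampling_interval` and its parameter names are hypothetical.

```python
def sampling_interval(depth, pixel_pitch, focal_distance):
    """Spacing between adjacent rays at a given depth along the ray axis.

    pixel_pitch is the ray spacing on the image plane, and focal_distance
    is the distance from the focal point to that image plane.
    """
    return depth * pixel_pitch / focal_distance


near_interval = sampling_interval(8.0, 0.5, 2.0)
far_interval = sampling_interval(16.0, 0.5, 2.0)
```

Doubling the depth of an object doubles the sampling interval, which is why a distant object is sampled more coarsely than a near one.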
Visual artifacts occur when the spacing between the rays 112 results in the sampling interval 104 being too large to faithfully capture the details of the graphical object 100. A number of methods have been developed to eliminate visual artifacts related to large sampling intervals. One method, known as super-sampling, involves rendering the scene at a higher resolution than the resolution used by the output device, followed by a smoothing or averaging operation to combine multiple rendered pixels into a single output pixel.
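The averaging step of super-sampling may be sketched as follows. The example assumes a single-channel image stored as a list of rows whose dimensions are exact multiples of the super-sampling factor, and it uses a plain box average. The function name `downsample` is hypothetical, and other smoothing filters could equally be used.

```python
def downsample(image, factor):
    """Average each factor-by-factor block of rendered pixels into one pixel.

    image is a list of rows of numeric intensities; its width and height
    are assumed to be exact multiples of factor.
    """
    height, width = len(image), len(image[0])
    output = []
    for oy in range(0, height, factor):
        row = []
        for ox in range(0, width, factor):
            block = [
                image[y][x]
                for y in range(oy, oy + factor)
                for x in range(ox, ox + factor)
            ]
            row.append(sum(block) / len(block))
        output.append(row)
    return output


checker = downsample([[0, 255], [255, 0]], 2)
halves = downsample([[0, 0, 100, 100], [0, 0, 100, 100]], 2)
```

A checkerboard rendered at twice the output resolution is thereby reduced to a uniform mid-gray output pixel rather than an arbitrarily chosen black or white one.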
Another method, developed to represent objects at various distances and sampling intervals faithfully, involves creating multiple models of a given object. Less detailed models are used when an object is distant, while more detailed models are used when an object is close. Texture information may also be stored at multiple resolutions. During rendering, the texture map appropriate for the distance from the viewer is utilized.
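Selection among multiple stored resolutions may be illustrated as follows. The sketch assumes that each successive level halves the resolution of the previous one, so that the appropriate level is given by a base-2 logarithm of the ratio between the sampling interval and the finest texel size. This power-of-two arrangement, the function name `select_level`, and its parameters are assumptions for the example.

```python
import math


def select_level(sampling_interval, texel_size, num_levels):
    """Pick a stored resolution level; level 0 is the most detailed.

    Assumes each level doubles the texel size of the level before it.
    """
    if sampling_interval <= texel_size:
        return 0
    level = int(math.floor(math.log2(sampling_interval / texel_size)))
    return min(level, num_levels - 1)   # clamp to the coarsest stored level


close_level = select_level(1.0, 1.0, 5)
middle_level = select_level(4.0, 1.0, 5)
distant_level = select_level(1000.0, 1.0, 5)
```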
The graphical objects, and portions thereof, that are visible to a viewer are dependent upon the perspective of the viewer. Referring to FIG. 1b, a graphical scene 150 may include a variety of the graphical objects 100, some of which may be visible while others may be obstructed. Unobstructed objects are often designated as foreground objects 100a, while partially obstructed objects may be referred to as background objects 100b. Within the graphical scene 150, completely obstructed objects may be referred to as non-visible objects.
During rendering, the graphical scene 150 is converted to rendered pixels on a rendering device for observance by an actual viewer. Each rendered pixel preferably contains the rendered color 108 such that the actual viewer's visual perception of each graphical object 100 is that of the perspective viewer 106.
A small percentage of the graphical objects 100 may be visible within a particular graphical scene. For example, the room shown within the graphical scene 150 may be one of many rooms within a database containing an entire virtual house. The rendering of non-visible objects and pixels unnecessarily consumes resources such as processing cycles, memory bandwidth, memory storage, and function specific circuitry. Since the relative relationship of graphical objects changes with differing perspectives, for example as the perspective viewer 106 walks through a virtual house, the ability to dynamically determine and prune non-visible objects and pixels improves rendering performance.
Ray casting is a method to determine visible objects and pixels within a graphical scene 150 as shown in FIG. 1a. Ray casting is one method of conducting ray tracing that advances (casts) one ray for each pixel within the graphical scene 150 from the perspective viewer 106. With each cast, one or more graphical objects are tested against each ray to see if the ray has “collided” with the object, which is an extremely processing-intensive procedure.
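The per-ray collision test may be sketched for the simple case of spherical objects. The sketch assumes a unit-length ray direction and a ray origin located outside every sphere. The use of spheres and the names `ray_sphere` and `cast` are illustrative assumptions only.

```python
import math


def ray_sphere(origin, direction, center, radius):
    """Distance along the ray to the nearest hit on a sphere, or None.

    direction is assumed unit-length and origin is assumed to lie
    outside the sphere.
    """
    oc = tuple(o - c for o, c in zip(origin, center))
    b = sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    discriminant = b * b - c
    if discriminant < 0:
        return None                      # the ray misses the sphere
    t = -b - math.sqrt(discriminant)
    return t if t > 0 else None          # ignore hits behind the origin


def cast(origin, direction, spheres):
    """Test one ray against every sphere and keep the closest collision."""
    best = None
    for name, center, radius in spheres:
        t = ray_sphere(origin, direction, center, radius)
        if t is not None and (best is None or t < best[1]):
            best = (name, t)
    return best


spheres = [("far", (0.0, 0.0, 10.0), 1.0), ("near", (0.0, 0.0, 5.0), 1.0)]
hit = cast((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), spheres)
miss = cast((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), spheres)
```

Because every ray is tested against every candidate object, the cost of this sketch grows with the product of the pixel count and the object count, which illustrates the processing burden noted above.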
Z-buffering is another method that is used to determine visible pixels. Pixels are generated from each potentially visible object and stored within a z-buffer. A z-buffer typically stores a depth value and a pixel color value at a memory location corresponding to each x,y position within the graphical scene 150. A pixel color value is overwritten with a new value only if the new pixel depth is less than the depth of the currently stored pixel.
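The depth comparison described above may be sketched as follows. The class name `ZBuffer`, the method name `write`, and the use of an infinite initial depth to represent an empty location are assumptions made for this example.

```python
class ZBuffer:
    """Stores one depth value and one color value per x,y position."""

    def __init__(self, width, height):
        self.depth = [[float("inf")] * width for _ in range(height)]
        self.color = [[None] * width for _ in range(height)]

    def write(self, x, y, z, color):
        """Overwrite the stored pixel only if the new pixel is closer.

        Returns True when the stored pixel was replaced.
        """
        if z < self.depth[y][x]:
            self.depth[y][x] = z
            self.color[y][x] = color
            return True
        return False


zbuf = ZBuffer(2, 2)
first = zbuf.write(0, 0, 5.0, "blue")    # empty location, accepted
second = zbuf.write(0, 0, 2.0, "red")    # closer, replaces blue
third = zbuf.write(0, 0, 9.0, "green")   # farther, rejected
```

After the three writes, the location holds the color of the closest pixel regardless of the order in which the pixels arrived.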
Referring to FIG. 2, a method of rendering known as post z-buffer shading and texturing defers shading and texturing operations within a rendering pipeline 200 and therefore does not texture or shade non-visible pixels. In a typical rendering system, the color of the pixels is calculated prior to z-buffering. In a post z-buffer shading and texturing system, such as the rendering pipeline 200, final color calculations are not performed until after the z-buffering operation. Deferred shading and texturing eliminates the memory lookups and processing operations associated with shading and texturing non-visible pixels and thereby facilitates increased system efficiency.
The rendering pipeline 200 includes a display memory 210 and a graphics engine 220 comprised of a triangle converter 230, a z-buffer 240, and a shading and texturing engine 250. The rendering pipeline 200 also includes a frame buffer 260. In the depicted embodiment, the display memory 210 receives and provides various object descriptors 212 that describe the graphical objects 100.
The display memory 210 preferably contains descriptions of those objects that are potentially visible in the graphical scene 150. With scene changes, the object descriptors 212 may be added or removed from the display memory 210. In some embodiments, the display memory 210 contains a database of the object descriptors 212, for example, a database describing an entire virtual house.
Some amount of simple pruning may be conducted on objects within the display memory 210, for example, by software running on a host processor. Simple pruning may be conducted so that the graphical objects that are easily identified as non-visible are omitted from the rendering process. For example, those graphical objects 100 that are completely behind the perspective viewer 106 may be omitted or removed from the display memory 210.
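Simple pruning of objects that lie completely behind the viewer may be sketched as follows. The sketch assumes that each object carries a bounding sphere and that the view direction is a unit-length vector. The dictionary layout and the names `is_behind_viewer` and `prune` are hypothetical.

```python
def is_behind_viewer(center, radius, eye, view_dir):
    """True when a bounding sphere lies entirely behind the viewer.

    view_dir is assumed to be a unit-length vector.
    """
    to_center = [c - e for c, e in zip(center, eye)]
    signed_distance = sum(t * d for t, d in zip(to_center, view_dir))
    return signed_distance + radius < 0


def prune(objects, eye, view_dir):
    # Keep every object that is not completely behind the viewer.
    return [
        o for o in objects
        if not is_behind_viewer(o["center"], o["radius"], eye, view_dir)
    ]


scene = [
    {"name": "chair", "center": (0.0, 0.0, 5.0), "radius": 1.0},
    {"name": "lamp", "center": (0.0, 0.0, -5.0), "radius": 1.0},
    {"name": "rug", "center": (0.0, 0.0, -0.5), "radius": 1.0},
]
kept = [o["name"] for o in prune(scene, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))]
```

An object that straddles the viewer's plane, such as the rug in this example, is retained because part of it may still be visible.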
The graphics engine 220 retrieves the object descriptors 212 from the display memory 210 and presents them to the triangle converter 230. In the depicted embodiment, the object descriptors 212 define the vertices of a triangle or set of triangles and their associated attributes such as the object color. Typically, these attributes are interpolated across the face of the triangle to provide a set of potentially visible pixels 232.
The potentially visible pixels 232 are received by the z-buffer 240 and processed in the manner previously described to provide the visible pixels 242 to the shading and texturing engine 250. The shading and texturing engine 250 textures and/or shades the visible pixels 242 to provide rendered pixels 252 that are collected by the frame buffer 260 to provide one frame of pixels 262. The framed pixels 262 are typically sent to a display system for viewing.
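The ordering of operations in a post z-buffer shading and texturing pipeline may be illustrated with the following sketch, in which shading is invoked only for the pixels that survive the depth test. The function names `render_deferred` and `expensive_shade`, the tuple layout of each pixel, and the stand-in shading operation are assumptions made for this example.

```python
def render_deferred(fragments, width, height, shade):
    """Resolve visibility first, then shade only the visible pixels.

    fragments is an iterable of (x, y, z, attributes) tuples.
    """
    depth = [[float("inf")] * width for _ in range(height)]
    attrs = [[None] * width for _ in range(height)]
    for x, y, z, attributes in fragments:
        if z < depth[y][x]:               # z-buffering keeps unshaded attributes
            depth[y][x] = z
            attrs[y][x] = attributes
    frame = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if attrs[y][x] is not None:
                frame[y][x] = shade(attrs[y][x])   # one call per visible pixel
    return frame


shade_calls = []


def expensive_shade(attributes):
    # Stand-in for texture lookups and lighting calculations.
    shade_calls.append(attributes)
    return attributes.upper()


fragments = [(0, 0, 5.0, "wall"), (0, 0, 2.0, "chair"), (1, 0, 3.0, "floor")]
frame = render_deferred(fragments, 2, 1, expensive_shade)
```

Three potentially visible pixels are submitted, but the shading function runs only twice because the wall pixel is hidden by the chair pixel. The sketch also shows that the depth store must hold shading attributes in addition to depth.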
One difficulty in conducting post z-buffer shading and texturing is the increased complexity required of the z-buffer. The z-buffer is required to contain additional information relevant to shading and texturing in addition to the pixel depth. The z-buffer is often a performance critical element, in that each pixel is potentially updated multiple times, requiring increased bandwidth. The increased size and bandwidth requirements on the z-buffer have limited the use of post z-buffer shading and texturing within graphical systems.
One prior art method to reduce the size of the z-buffer is shown in FIG. 3. The method divides a screen 300 into tiles 310. The tiles 310 and the screen 300 consist of a plurality of scanlines 320. Each tile 310 is rendered as if it were the entire screen 300, thus requiring a tile-sized z-buffer. While a tile-sized z-buffer requires less memory, a tile-sized z-buffer increases complexity related to sorting, storing, accessing, and rendering the object descriptors 212 within the display memory 210. The increased complexity results from objects that overlap more than one tile.
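The difficulty posed by objects that overlap more than one tile may be illustrated by computing the tiles touched by an object's screen-space bounding box. The sketch assumes square tiles, an inclusive pixel bounding box, and tile indices that start at zero in the top-left corner. The function name `tiles_for_bbox` is hypothetical.

```python
def tiles_for_bbox(bbox, tile_size, screen_w, screen_h):
    """List the (tile_x, tile_y) indices overlapped by a bounding box.

    bbox is (xmin, ymin, xmax, ymax) in pixels, inclusive.
    """
    xmin, ymin, xmax, ymax = bbox
    # Clip the box to the screen before binning.
    xmin, ymin = max(xmin, 0), max(ymin, 0)
    xmax, ymax = min(xmax, screen_w - 1), min(ymax, screen_h - 1)
    if xmin > xmax or ymin > ymax:
        return []                         # entirely off screen
    return [
        (tx, ty)
        for ty in range(ymin // tile_size, ymax // tile_size + 1)
        for tx in range(xmin // tile_size, xmax // tile_size + 1)
    ]


inside_one = tiles_for_bbox((10, 10, 20, 20), 32, 128, 128)
on_corner = tiles_for_bbox((30, 30, 34, 34), 32, 128, 128)
off_screen = tiles_for_bbox((200, 200, 210, 210), 32, 128, 128)
```

A small object that happens to sit on a tile corner must be referenced by four tiles, so its descriptor is sorted, stored, and processed once for each of them.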
While many advances have been made to graphical rendering algorithms and architectures, including those depicted in the rendering pipeline 200, real-time rendering of photo-realistic, life-like scenes requires the ability to render greater geometric detail than is sustainable on currently available graphical rendering systems.
Therefore, what is generally needed are methods and apparatus to conduct efficient graphical rendering. Specifically, what is needed is a graphical system that renders voxel primitives efficiently. The ability to render voxel objects efficiently increases the detail achievable in real-time graphical rendering systems.
What is also needed is a graphical system that renders very detailed scenes with extensive depth complexity, without tying up external memory interfaces with z-buffer data traffic. A z-buffering apparatus and method that facilitates large tiles, supports a high pixel throughput, is compact enough to reside entirely on-chip, and reduces external memory bandwidth requirements would facilitate such a system.
In addition to better z-buffering, a method and apparatus are needed that reduce the bandwidth load on the z-buffer. Specifically, what is needed is a method and apparatus that reduces the generation of non-visible pixels prior to z-buffering.
In addition to more intelligent pixel generation, rendering highly realistic scenes requires accessing large amounts of texture and world description data. Specifically, what is needed is an apparatus and method to maximize the efficiency of internal and external memory accesses. Such a method and apparatus would preferably achieve increased realism by facilitating larger stores of texture data within low-cost external memories, while maintaining a high data throughput within the rendering pipeline.
Lastly, what is needed is a graphical processing architecture that facilitates combining the various elements of the present invention into an efficient rendering pipeline that is scalable in performance.