This application relates to computer graphics systems, and more specifically to 3D graphics rendering hardware and techniques.
3D Graphics
Three-dimensional graphics (3D graphics) refers to the practice of presenting a scene or image on a two-dimensional screen in such a way that it appears three dimensional. To do so, great care must be taken to accurately display surface textures, lighting, shadowing, and other characteristics. Displaying a 3D graphics image is much more difficult than displaying a traditional 2D image.
3D Graphics Requirements
3D graphics takes a great deal of computer processing power and memory. One of the performance measures for 3D games is frame rate, expressed in frames-per-second (fps), meaning the number of times each second an image can be redrawn to convey a sense of motion.
3D Graphics Concepts
3D graphics are spatial data represented in polygonal form with an associated set of characteristics, such as light, color, shade, texture, etc. The 3D graphics pipeline consists of two major stages, or subsystems, referred to as geometry and rendering. The geometry stage is responsible for managing all polygon activities and for converting 3D spatial data into pixels. The rendering stage is responsible for managing all memory and pixel activities. It renders data from the geometry stage into the final composition of the 3D image for painting on the CRT screen.
Before considering how a scene is broken down to allow the computer to reconstruct it, one has to start with a scene which consists of shapes. The modeling process creates this information. Designers use specialized 3D graphics software tools, such as 3D Studio, to build polygonal models destined to be manipulated by a computer.
3D Graphics Pipeline
The first stage of the pipeline involves translating the model from its native coordinate space, or model coordinates, to the coordinate space of the application, or world coordinates. At this stage, the separate and perhaps unrelated coordinate systems defining objects in a scene are combined in a single coordinate system referred to as world space (World Space Co-ordinates). Translating objects into world space may involve clipping, or discarding elements of an object that fall outside the viewport or display window.
Interactive 3D graphics seeks to convey an illusion of movement by changing the scene in response to the user's input. The technical term for changing the database of geometry that defines objects in a scene is transformation. The operations involve moving an object in the X, Y, or Z direction, rotating it in relation to the viewer (camera), or scaling it to change the size. (The X coordinate is moving left to right; Y is moving from top to bottom; Z is moving from "in front" to behind.)
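The transformation operations described above can be illustrated with a minimal sketch. The function names, the choice of the Y axis for rotation, and the sign convention of the rotation are assumptions made for illustration only; they are not taken from any particular system.

```python
import math

def translate(point, dx, dy, dz):
    """Move a point along the X, Y, and Z axes."""
    x, y, z = point
    return (x + dx, y + dy, z + dz)

def scale(point, factor):
    """Change the size of an object by scaling a point about the origin."""
    x, y, z = point
    return (x * factor, y * factor, z * factor)

def rotate_about_y(point, angle_radians):
    """Rotate a point about the Y axis (sign convention is an assumption)."""
    x, y, z = point
    c, s = math.cos(angle_radians), math.sin(angle_radians)
    return (c * x + s * z, y, -s * x + c * z)

def model_to_world(vertices, scale_factor, angle_radians, position):
    """Scale, rotate, then translate model-space vertices into world space."""
    return [
        translate(rotate_about_y(scale(v, scale_factor), angle_radians), *position)
        for v in vertices
    ]
```

Under these assumptions, every vertex of every object passes through such a function each time the object or the camera moves, which is why geometry processing grows with polygon count.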
When any change in the orientation or position of the camera is desired, every object in a scene must be recalculated relative to the new view. As can be imagined, a fast-paced game needing to maintain a high frame rate will demand a great deal of geometry processing. As scene complexity increases (more polygons) the computational requirements increase as well.
The setup stage is the point in the 3D pipeline where the host CPU typically hands off processing tasks to the hardware accelerator. Setup is the last stage before rasterization, or drawing, and can consume considerable processing time. The computational demands of the setup process depend on the number of parameters required to define each polygon as well as the needs of the pixel drawing engine.
The Rendering Subsystem: Pixel Drawing
While the geometry stages of the 3D pipeline are traditionally left to the host CPU with its powerful computational capabilities, the actual drawing of pixels to the 2D display is called rendering. Rendering is best performed by specialized hardware, the pixel engine, also called the 3D hardware accelerator. At the top of the 3D graphics pipeline, the bottleneck is how fast the calculations can be performed. At the rendering stage, the bottleneck is memory access, that is, how fast pixel reads and writes to the frame buffer (display memory) and other special-purpose memory blocks can be performed. The renderer must be able to process thousands of polygons for each frame which, as mentioned above, must further be updated many times each second in order to sustain an illusion of motion.
Texture Mapping
There are a couple of different ways to add complexity to a 3D scene. Creating more and more detailed models, consisting of a greater number of polygons, is one way to add visual interest to a scene. However, adding polygons necessitates paying the price of having to manipulate more geometry. 3D systems have what is known as a "polygon budget," an approximate number of polygons that can be manipulated without unacceptable performance degradation. In general, fewer polygons yield higher frame rates.
Another important technique that is regularly used to make a scene more appealing is called texture mapping. A texture map is a pattern or picture that is applied to the surface of an object, just like wallpaper is stuck to a wall. Motion video can even be used as a texture map in multimedia 3D. Texture mapping is very important because satisfying environments can be created with few polygons when those polygons are nicely decorated.
Awkward side-effects of texture mapping occur unless the renderer can apply texture maps with correct perspective. Perspective-corrected texture mapping involves an algorithm that translates texels, or pixels from the bitmap texture image, into display pixels in accordance with the spatial orientation of the surface. An efficient 3D system needs to dedicate memory to the storage of texture maps, or bitmaps to be used as textures.
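The difference between uncorrected and perspective-corrected mapping can be sketched for a single texture coordinate along one span. The sketch below uses one common formulation (linear interpolation of u/w and 1/w in screen space, followed by a divide); it is offered as an assumption for illustration and is not necessarily the algorithm used by any given renderer. The names `affine_u` and `perspective_u` are hypothetical.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t

def affine_u(u0, u1, t):
    """Uncorrected mapping: interpolate the texture coordinate directly."""
    return lerp(u0, u1, t)

def perspective_u(u0, w0, u1, w1, t):
    """Perspective-corrected mapping along a span.

    u0, u1: texture coordinates at the span endpoints.
    w0, w1: distances of the endpoints from the viewer (must be non-zero).
    t:      fraction of the way across the span in screen space.
    """
    u_over_w = lerp(u0 / w0, u1 / w1, t)
    one_over_w = lerp(1.0 / w0, 1.0 / w1, t)
    return u_over_w / one_over_w
```

For a span whose far end is three times as distant as its near end, the screen-space midpoint maps to one quarter of the way through the texture rather than one half, which is the effect the uncorrected version misses.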
Bilinear Filtering
Texture mapping is used so prevalently that several additional techniques to enhance its effect are often built into a renderer. Bilinear filtering improves the appearance of texture-mapped surfaces by considering the values of four adjacent texels in order to determine the value of the displayed pixel. When drawing a given pixel on the edge of an object, for example, the bilinear filtering process will conventionally use the weighted average of each of the RGB values of the four neighboring pixels to compute the value for the given pixel. Therefore, instead of a left-to-right sequence of pixels at the edge of an object which progress as red, red, white, for example, the sequence might filter to be red, pink (half red, half white), white. In this case, the edge of the object would have passed through the center of the second pixel. The pixel is the smallest color unit, and it cannot actually be red on the left half and white on the right half, so pink is used instead. When viewed, the eye interprets the object as intended, with a smooth edge passing through the second pixel.
This process serves to give the appearance of much smoother color transitions between pixels. Even when the object to be displayed should show relatively sharp edges in the final image, as it might have in the example above, the bilinear filtering process will give a remarkable improvement in the final image, by allowing the viewer to see what appears to be sub-pixel resolutions when an edge should only cover part of a pixel, and to smooth transitions between texels in a texture map. This is particularly important when a texture is magnified.
Bilinear filtering avoids the `blockiness` that results from simple point sampling where adjacent display pixel values may be defined by a single texel. Point sampling requires much less memory bandwidth than bilinear filtering and is generally faster, but leaves distracting artifacts at the edges of objects.
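The contrast between point sampling and bilinear filtering can be sketched as follows. The texture is assumed to be a list of rows of RGB tuples with texel centers at integer coordinates and clamping at the borders; these conventions, and the function names, are assumptions chosen for illustration.

```python
import math

def clamp(i, lo, hi):
    return max(lo, min(i, hi))

def point_sample(texture, u, v):
    """Return the single texel nearest to (u, v)."""
    h, w = len(texture), len(texture[0])
    x = clamp(math.floor(u + 0.5), 0, w - 1)
    y = clamp(math.floor(v + 0.5), 0, h - 1)
    return texture[y][x]

def mix(a, b, t):
    """Blend two RGB tuples channel by channel."""
    return tuple(ca + (cb - ca) * t for ca, cb in zip(a, b))

def bilinear_sample(texture, u, v):
    """Weighted average of the four texels surrounding (u, v)."""
    h, w = len(texture), len(texture[0])
    x0, y0 = math.floor(u), math.floor(v)
    fx, fy = u - x0, v - y0
    xa, xb = clamp(x0, 0, w - 1), clamp(x0 + 1, 0, w - 1)
    ya, yb = clamp(y0, 0, h - 1), clamp(y0 + 1, 0, h - 1)
    top = mix(texture[ya][xa], texture[ya][xb], fx)
    bottom = mix(texture[yb][xa], texture[yb][xb], fx)
    return mix(top, bottom, fy)
```

Sampling halfway between a red texel and a white texel yields pink with the bilinear version, whereas point sampling returns one texel unchanged. The bilinear version reads four texels per pixel rather than one, which is the source of its higher memory bandwidth requirement.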
Mip Mapping
Mip mapping involves storing multiple copies of texture maps (generally two or three), digitized at different resolutions. When a texture mapped polygon is smaller than the texture image itself, undesirable effects result. Mip mapping can provide a large version of a texture map for use when the object is close to the viewer, and a small version of the texture map for use when the object shrinks from view. Trilinear filtering is frequently employed to smooth out edges of mip mapped polygons and prevent moving objects from displaying a distracting `sparkle` resulting from mismatched texture intersections.
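A minimal sketch of building and selecting among reduced-resolution copies follows. It assumes a square, power-of-two, single-channel texture, builds each smaller copy by averaging 2x2 blocks, and continues down to a 1x1 image, whereas the text above notes that only two or three copies are generally stored. The level selection rule and all names are assumptions made for illustration.

```python
import math

def downsample(image):
    """Halve a square image by averaging each 2x2 block of texels."""
    size = len(image) // 2
    return [
        [
            (image[2 * y][2 * x] + image[2 * y][2 * x + 1]
             + image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(size)
        ]
        for y in range(size)
    ]

def build_mip_chain(image):
    """Return [full size, half size, quarter size, ...] down to 1x1."""
    chain = [image]
    while len(chain[-1]) > 1:
        chain.append(downsample(chain[-1]))
    return chain

def select_level(chain, texels_per_pixel):
    """Pick the copy whose resolution best matches the on-screen size.

    texels_per_pixel: how many full-size texels span one display pixel.
    """
    if texels_per_pixel <= 1.0:
        return 0  # object is close: use the large version
    level = int(math.log2(texels_per_pixel) + 0.5)  # round to nearest level
    return min(level, len(chain) - 1)
```

Trilinear filtering, mentioned above, would additionally blend between the two nearest levels instead of choosing only one; that step is omitted here.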
Alpha Blending
Alpha blending is a technique that controls the transparency of an object, allowing realistic rendering of translucent surfaces such as glass or water. For this purpose, an additional value, called the "alpha value," is included with the Red-Green-Blue color data of each pixel. Thus, the color data is often referred to as the RGBA data. The RGBA data for each pixel therefore includes both the color data for the pixel and an alpha value indicating the transparency or blending factor. Additional atmospheric effects that are found in rendering engines include fogging and depth cuing. Both of these techniques obscure an object as it moves away from the viewer. The fog effect blends the color data for the pixel with the fog color, which is usually white; the degree of blending is related to the distance from the eye of the object being drawn. Similarly, depth cuing blends the pixel color with the depth cue color, usually black, depending on the distance from the eye.
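The blending operations described above can be sketched as a weighted average of two colors. The linear fog ramp between `fog_start` and `fog_end` is an assumption for illustration; the text states only that the degree of blending is related to distance. All function and parameter names are hypothetical.

```python
def blend(src_rgb, dst_rgb, alpha):
    """Alpha blend: alpha = 1 shows only src, alpha = 0 shows only dst."""
    return tuple(alpha * s + (1.0 - alpha) * d for s, d in zip(src_rgb, dst_rgb))

def fog_factor(distance, fog_start, fog_end):
    """Assumed linear ramp: 0 at fog_start (no fog), 1 at fog_end (full fog)."""
    t = (distance - fog_start) / (fog_end - fog_start)
    return max(0.0, min(1.0, t))

def apply_fog(rgb, fog_rgb, distance, fog_start, fog_end):
    """Blend a pixel color toward the fog color according to distance."""
    return blend(fog_rgb, rgb, fog_factor(distance, fog_start, fog_end))
```

Depth cuing follows the same pattern under these assumptions, with black passed as `fog_rgb` in place of white.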
Antialiasing
One common problem, inherent in a raster display system, is that of jagged or "aliased" edges. Aliasing is especially disconcerting at the edges of texture maps. Antialiasing, or minimizing the appearance of jagged edges, is important in order to avoid this distraction. The effect is accomplished by reducing the contrast between the edge of an object and the color behind it by adjusting pixel values at the edge.
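One way to adjust pixel values at an edge is to estimate how much of each pixel the object covers and to blend accordingly. The sketch below uses supersampling on a regular sub-pixel grid to estimate coverage; this is an assumed technique chosen for illustration, since the text does not specify how the adjustment is computed. The `inside` callback and the other names are hypothetical.

```python
def coverage(inside, px, py, n=4):
    """Fraction of n*n sub-samples within pixel (px, py) that the object covers.

    inside: function (x, y) -> bool reporting whether a point is in the object.
    """
    hits = 0
    for j in range(n):
        for i in range(n):
            x = px + (i + 0.5) / n
            y = py + (j + 0.5) / n
            if inside(x, y):
                hits += 1
    return hits / (n * n)

def antialiased_pixel(object_rgb, background_rgb, cov):
    """Blend object and background colors in proportion to coverage."""
    return tuple(cov * o + (1.0 - cov) * b for o, b in zip(object_rgb, background_rgb))
```

A pixel that is half covered by a red object over a white background is drawn at an intermediate color, reducing the contrast across the edge.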
Double Buffering
All of the preceding calculations and rendering steps must occur on hundreds to thousands of polygons for each frame of an interactive program that needs to update the display at a rate of between 15 and 30 times each second. Double buffering gives the system a little breathing room by providing an opportunity to render the next frame of a sequence into off-screen memory. The off-screen memory is then switched to the display, while the memory containing the formerly displayed frame can be cleared and re-painted with the next frame to be displayed and so on. Display systems that lack a double buffer capability may present distracting transitional artifacts to the viewer.
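The buffer exchange described above can be sketched with two in-memory pixel arrays. The class and method names are hypothetical, and the swap is modeled as an index exchange rather than any particular hardware mechanism.

```python
class DoubleBuffer:
    """Two equally sized pixel buffers: one displayed, one being drawn."""

    def __init__(self, width, height, clear_value=0):
        self._width = width
        self._height = height
        self._clear_value = clear_value
        self._buffers = [self._blank(), self._blank()]
        self._front_index = 0

    def _blank(self):
        return [[self._clear_value] * self._width for _ in range(self._height)]

    @property
    def front(self):
        """The buffer currently shown on the display."""
        return self._buffers[self._front_index]

    @property
    def back(self):
        """The off-screen buffer into which the next frame is rendered."""
        return self._buffers[1 - self._front_index]

    def clear_back(self):
        """Erase the off-screen buffer before painting the next frame."""
        self._buffers[1 - self._front_index] = self._blank()

    def swap(self):
        """Switch the off-screen buffer to the display."""
        self._front_index = 1 - self._front_index
```

A typical frame under this sketch would call `clear_back()`, draw into `back`, and then call `swap()`, so the viewer never sees a partially drawn frame.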
Image Copying and Scaling
One common operation in computer graphics is to copy a rectangular image to the screen, but only draw certain parts of it. For example, a texture image may be stored on an otherwise blank page; when the texture image is desired to be inserted into a display, the blank background page is obviously unneeded. The parts of the source image not to be copied are defined by setting them to a specific color, called the "key" color. During the copy, a test is made for the existence of this key color, and any pixels of this key color are rejected and therefore not copied. This technique allows an image of any shape to be copied onto a background, since the unwanted pixels are automatically excluded. For example, this could be used to show an explosion, where the flames are represented by an image.
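The key color test can be sketched as a copy loop that skips any source pixel equal to the key. Images are assumed to be lists of rows of comparable pixel values; the function name and parameters are hypothetical.

```python
def keyed_copy(source, dest, dest_x, dest_y, key):
    """Copy source onto dest at (dest_x, dest_y), rejecting key-colored pixels."""
    for y, row in enumerate(source):
        for x, pixel in enumerate(row):
            if pixel == key:
                continue  # key color: leave the background untouched
            ty, tx = dest_y + y, dest_x + x
            if 0 <= ty < len(dest) and 0 <= tx < len(dest[0]):
                dest[ty][tx] = pixel
```

Because only non-key pixels are written, the copied image can have any outline even though the source is rectangular.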
As the explosion continues, or as the viewer moves closer to it, its size increases. This effect is produced by scaling the image during the copy. Magnifying the image produces unwanted side effects, however, and the final image may appear blocky and unconvincing. An example of this technique is shown in FIGS. 1A and 1B. In these figures, the gray area represents the desired image, and the black area represents the key color. FIG. 1A shows the original texture, and FIG. 1B shows the same image copied and scaled. Note that the unwanted key color area has been removed cleanly, but the staircase effect on the edge is magnified. When a texture has more than one color on the interior of the object, as is usually the case, the interior of the scaled texture will also be blocky and unattractive, since there will be no smooth transition between blocks of different color.
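The magnified staircase can be reproduced with a scaled copy that uses point sampling and the key test together. The sketch assumes an integer magnification factor and a destination whose origin coincides with the scaled image; both are simplifications made for illustration, and the function name is hypothetical.

```python
def scaled_keyed_copy(source, dest, factor, key):
    """Magnify source by an integer factor with point sampling,
    skipping every destination pixel whose source texel is the key color."""
    for ty in range(len(source) * factor):
        for tx in range(len(source[0]) * factor):
            texel = source[ty // factor][tx // factor]
            if texel != key and ty < len(dest) and tx < len(dest[0]):
                dest[ty][tx] = texel
```

Each source texel becomes a `factor` by `factor` block in the destination, so a one-texel step on the original edge becomes a step `factor` pixels high after scaling.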
The normal way to deal with this is to bilinear-filter the image during the copy so that pixels in the source image are blended with their neighbors to remove the blocky effect. As described above, this procedure blends the color of a given pixel with the colors of that pixel's nearest neighbors, to produce a smoother image overall. This works within the valid parts of the image, but leaves extremely blocky edges. FIGS. 1C and 1D show an original texture, and the same texture after it has been filtered, copied, and scaled, respectively. Note that in this case, the cut out edge is as blocky as the original example, but in addition the edge pixels have the black (key color) background blended with the correct color, giving a dark border.
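The dark border can be reproduced in one dimension. The sketch below assumes that the key test is applied to the nearest source texel while the output color comes from the filter, which is one plausible reading of the behavior described and is not confirmed by the text. Pixels are single grayscale values, the key color is black (0), and all names are hypothetical.

```python
import math

KEY = 0  # black key color (assumed grayscale value)

def filtered_keyed_row(source, factor, background):
    """Magnify one row by `factor`, filtering linearly between neighbors.

    A destination pixel is rejected (background kept) when its nearest
    source texel is the key color; otherwise it receives the filtered value,
    which may include a contribution from a neighboring key-colored texel.
    """
    n = len(source)
    out = []
    for tx in range(n * factor):
        u = (tx + 0.5) / factor - 0.5  # destination pixel center in texel units
        x0 = math.floor(u)
        f = u - x0
        left = source[min(max(x0, 0), n - 1)]
        right = source[min(max(x0 + 1, 0), n - 1)]
        nearest = source[min(max(math.floor(u + 0.5), 0), n - 1)]
        if nearest == KEY:
            out.append(background)
        else:
            out.append(left + (right - left) * f)
    return out
```

With a source row of one key texel followed by gray texels of value 200, the first surviving pixel after doubling is 150 rather than 200, because a quarter of its value was drawn from the black key texel. The position of the cut out edge is still decided per source texel, so it remains as blocky as before.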
There are three primary artifacts, or defects in the resulting image, caused by bilinear filtering and magnification of the image during copy. Each of these defects reduces the quality of the resultant image, but is typically unavoidable in present systems.