With the widespread use of computers in all aspects of modern life, there is an increasing demand to improve the human-machine interface through the use of visual information. Advances in graphical software and hardware have already improved the human-machine interface drastically. Interactive graphics such as windowing environments for desk-top computers, for example, have greatly improved the ease of use and interactivity of computers and are commonplace today. As the price-performance ratio of hardware drops, the use of computer generated graphics and animation will become even more pervasive. Unfortunately, the cost of producing truly interactive and realistic effects has limited their application. There is a need, therefore, for new graphics processing techniques and architectures that provide more interactive and realistic effects at a lower cost.
Although there are numerous ways to categorize graphics processing, one common approach is to describe an image in terms of the dimensions of the objects that it seeks to represent. For example, a graphics system may represent objects in two dimensions (e.g., having x and y coordinates), in which case the graphics are said to be "two-dimensional", or in three dimensions (e.g., having x, y, and z coordinates), in which case the graphics are said to be "three-dimensional" ("3-D").
Since display devices such as cathode ray tubes (CRTs) are two-dimensional ("2-D"), the images displayed by computer graphic systems are generally 2-D. As discussed in greater detail below, however, if the computer maintains a graphical model representing the imaged object in three-dimensional space, the computer can alter the displayed image to illustrate a different perspective of the object in 3-D space. In contrast, although a 2-D graphic image can be transformed prior to display (e.g., scaled, translated, or rotated), the computer cannot readily depict the object's appearance from a different perspective in 3-D space.
The increasing ability of modern computers to efficiently handle 2-D and, particularly, 3-D graphics has resulted in a growing variety of applications for computers, as well as fundamental changes in the user interface (UI) between computers and their users. The availability of 3-D graphics is becoming increasingly important to the growth of entertainment-related applications, including production-quality film animation tools as well as lower-resolution games and multimedia products for the home. A few of the many other areas touched by 3-D graphics include education, video conferencing, video editing, interactive user interfaces, computer-aided design and computer-aided manufacturing (CAD/CAM), scientific and medical imaging, business applications, and electronic publishing.
A graphics processing system may be thought of as including an application model, an application program, and a graphics subsystem, as well as the conventional hardware and software components of a computer and its peripherals.
The application model represents the data or objects to be displayed, assuming of course that the image processing is based upon a model. The model includes information concerning primitives such as points, lines, and polygons that define the objects' shapes, as well as the attributes of the objects (e.g., color). The application program controls inputs to, and outputs from, the application model, effectively acting as a translator between the application model and the graphics subsystem. Finally, the graphics subsystem is responsible for passing user inputs to the application model and for producing the image from the detailed descriptions stored by the application model.
The typical graphics processing system includes a physical output device which is responsible for the output or display of the images. Although other forms of display devices have been developed, the predominant technology today is referred to as raster graphics. A raster display device includes an array of individual points or picture elements (i.e., pixels), arranged in rows and columns, to produce the image. In a CRT, these pixels correspond to a phosphor array provided on the glass faceplate of the CRT. The emission of light from each phosphor in the array is independently controlled by an electron beam that "scans" the array sequentially, one row at a time, in response to stored information representative of each pixel in the image. Interleaved scanning of alternate rows of the array is also a common technique in, for example, the television environment. The array of pixel values that map to the screen is often referred to as a bitmap or pixmap.
One problem associated with raster graphics devices is the memory required to store the bitmap for even a single image. For example, the system may require 3.75 megabytes (MB) of random access memory to support a display resolution of 1280×1024 (i.e., number of pixel columns and rows) and 24 bits of color information per pixel. This information, which again represents the image of a single screen, is stored in a portion of the computer's display memory known as a frame buffer.
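The memory figure quoted above can be checked with a short calculation. The following is a minimal sketch; the function name `frame_buffer_bytes` is hypothetical and chosen only for illustration, and a megabyte is assumed to mean 2^20 bytes.

```python
def frame_buffer_bytes(columns, rows, bits_per_pixel):
    """Bytes needed to hold one full-screen pixmap (illustrative helper)."""
    total_bits = columns * rows * bits_per_pixel
    # Round up so a partial trailing byte is still counted.
    return (total_bits + 7) // 8


if __name__ == "__main__":
    size = frame_buffer_bytes(1280, 1024, 24)
    # Assumption: one megabyte is taken as 2**20 bytes.
    print(size, "bytes =", size / 2**20, "megabytes")
```

Under these assumptions, a 1280 by 1024 display at 24 bits per pixel works out to 3,932,160 bytes, which is the 3.75 megabytes stated in the text.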
Another problem with conventional raster graphics devices such as CRTs is the relatively quick decay of light emitted by the device. As a result, the display must typically be "refreshed" (i.e., the raster rescanned) at a rate approaching 60 Hz or more to avoid "flickering" of the image. This places a rigorous demand on the image generation system to supply image data at a fixed rate. Some systems address this problem by employing two frame buffers, with one of the buffers being updated with pixmap information corresponding to the subsequent image frame, while the other buffer is being used to refresh the screen with the pixmap for the current image frame.
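The two-buffer arrangement described above can be sketched in software as follows. This is an illustration only, not a description of any particular hardware: the class name `DoubleBuffer` and its members are hypothetical, and each pixmap is modeled as a plain list of rows.

```python
class DoubleBuffer:
    """Two pixmaps: one is scanned out while the other is drawn into."""

    def __init__(self, columns, rows, clear_value=0):
        # Each buffer is a list of rows; each row is a list of pixel values.
        self._buffers = [
            [[clear_value] * columns for _ in range(rows)] for _ in range(2)
        ]
        self._front_index = 0  # buffer currently used to refresh the screen

    @property
    def front(self):
        """Pixmap for the current frame (read by the display refresh)."""
        return self._buffers[self._front_index]

    @property
    def back(self):
        """Pixmap for the subsequent frame (written by the renderer)."""
        return self._buffers[1 - self._front_index]

    def swap(self):
        """Exchange roles once the subsequent frame is complete."""
        self._front_index = 1 - self._front_index


if __name__ == "__main__":
    buffers = DoubleBuffer(columns=4, rows=2)
    buffers.back[0][0] = 255      # render into the hidden buffer
    print(buffers.front[0][0])    # displayed frame is unaffected until swap
    buffers.swap()
    print(buffers.front[0][0])    # the newly rendered frame is now displayed
```

The point of the sketch is that the renderer never writes into the pixmap that is being scanned out, at the cost of holding two full pixmaps in memory.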
The demands placed upon the system are further exacerbated by the complexity of the information that often must be processed to render an image from the object stored by the application model. For example, the modeling of a three-dimensional surface is, in itself, a complex task. Surface modeling is performed by the application model and may involve the use of polygon meshes, parametric surfaces, or quadric surfaces. While a curved surface can be represented by a mesh of planar polygons, the "smoothness" of its appearance in the rendered image will depend both upon the resolution of the display and the number of individual polygons that are used to model the surface. The computations associated with high resolution modeling of complex surfaces based upon polygon meshes can be extremely resource intensive.
As intimated above, there is a demand to produce more realistic and interactive images. The term, "real-time," is commonly used to describe interactive and realistic image processing systems. In a "real-time" system, the user should perceive a continuous motion of objects in a scene. In a video game having real-time capabilities, the active characters and view point should respond with minimal delay to a user's inputs, and should move smoothly.
To produce such real-time effects, an image rendering system has to generate a new image at a sufficiently high rate such that the user perceives continuous motion of objects in a scene. The rate at which a new image is computed for display is referred to as the "computational" rate or the "computational frame" rate. The computational rate needed to achieve realistic effects can vary depending on how quickly objects move about the scene and how rapidly the viewing perspective changes. For a typical application, a real-time graphics system recomputes a new image at least twelve times a second to generate a series of images that simulate continuous motion. For high-quality animation applications, however, the computational rate must be significantly higher.
Another critical issue for real-time systems is transport delay. Transport delay is the time required to compute and display an image in response to input from the user, e.g., motion of a joystick to move a character in a scene. To the extent transport delay time is noticeable to a user, "real-time" interactivity is impaired. Ideally, the user should not perceive any transport delay. However, in practice there is always some delay attributed to rendering objects in a scene in response to new inputs and generating a display image. Improvements in real-time interactivity that do not require discarding data, which can interfere with image quality, are highly desirable.
As introduced above, conventional graphics systems typically include a frame buffer. To generate an image, the graphic system renders all of the objects in a scene and stores the resulting image in this frame buffer. The system then transfers the rendered image data to a display. In a conventional graphics architecture, the entire frame buffer is erased and the scene is re-rendered to create a next frame's image. In this type of system, every object must be redrawn for each frame because the frame buffer is cleared between frames. Every object therefore is updated at the same rate, regardless of its actual motion in the scene or its importance to the particular application.
This conventional architecture presents several hurdles to producing highly realistic and interactive graphics. First, every object in a scene for a particular frame is rendered with the same priority at the same update rate. As such, objects in the background that have little detail and are not moving are re-rendered at the same rate as objects in the foreground that are moving more rapidly and have more surface detail. As a result, processing and memory resources are consumed in re-rendering background objects even though these background objects do not change significantly from frame to frame.
Another drawback in this conventional architecture is that every object in the scene is rendered at the same resolution. In effect, the rendering resources consumed in this type of approach are related to the size of the screen area that the object occupies rather than the importance of the object to the overall scene. An example will help illustrate this problem. In a typical video game, there are active characters in the foreground that can change every frame, and a background that rarely changes from frame to frame. The cost in terms of memory usage for generating the background is much greater than generating the active characters because the background takes up more area on the screen. Image data must be stored for each pixel location that the background objects cover. For the smaller, active characters however, pixel data is generated and saved for only the pixels covered by the smaller characters. As a result, the background occupies more memory even though it has lesser importance in the scene. Moreover, in a conventional architecture the entire background has to be re-rendered for every frame, consuming valuable processing resources.
One principal strength of the frame buffer approach is that it can be used to build an arbitrary image on an output device with an arbitrary number of primitive objects, subject only to the limit of spatial and intensity resolution of the output device. However, there are several weaknesses for a graphics system using a frame buffer.
A frame buffer uses a large amount (e.g., 64-128 MB) of expensive memory. Normal random access memory (RAM) is not adequate for frame buffers because of its slow access speeds. For example, clearing the million pixels on a 1024×1024 screen takes about 1/4 of a second, assuming each memory cycle requires 250 nanoseconds. Therefore, higher speed and more expensive video RAM (VRAM) or dynamic RAM (DRAM) is typically used for frame buffers. High-performance systems often contain two expensive frame buffers: one frame buffer is used to display the current frame, while the other is used to render the next frame. This large amount of specialized memory dramatically increases the cost of the graphics system.
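The clearing-time estimate above follows from simple arithmetic. The sketch below assumes, as the example appears to, that clearing each pixel costs exactly one memory cycle; the function name `clear_time_seconds` is hypothetical.

```python
def clear_time_seconds(pixel_count, cycle_time_ns):
    """Time to clear a frame buffer, assuming one memory cycle per pixel."""
    total_ns = pixel_count * cycle_time_ns
    return total_ns / 1_000_000_000  # nanoseconds to seconds


if __name__ == "__main__":
    pixels = 1024 * 1024  # roughly one million pixels
    print(clear_time_seconds(pixels, 250), "seconds")
```

With 1,048,576 pixels and a 250 nanosecond cycle, the result is roughly 0.26 seconds, consistent with the quarter-second figure in the text. At that speed the clear alone would take longer than an entire frame period at any interactive frame rate.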
Memory bandwidth for frame buffers is also a problem. Processing a graphics image with texturing, color, and depth information stored for each pixel requires a bandwidth of about 1.7 gigabytes per second when images are processed at 30 Hz. Since a typical DRAM has a bandwidth of 50 MB per second, a frame buffer must be built from a large number of DRAMs that are accessed with parallel processing techniques to achieve the desired bandwidth.
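A rough lower bound on the number of DRAMs implied by these figures can be computed as follows. This is a sketch under stated assumptions: both bandwidth figures are treated as decimal megabytes per second, bandwidth is assumed to scale perfectly with the number of devices, and the function name `min_devices_for_bandwidth` is hypothetical.

```python
def min_devices_for_bandwidth(required_mb_per_s, device_mb_per_s):
    """Smallest device count whose combined bandwidth meets the requirement.

    Assumes ideal scaling with no arbitration or refresh overhead.
    """
    # Integer ceiling division avoids floating-point rounding.
    return -(-required_mb_per_s // device_mb_per_s)


if __name__ == "__main__":
    # Assumption: 1.7 gigabytes per second is taken as 1700 megabytes per second.
    print(min_devices_for_bandwidth(1700, 50), "DRAMs at minimum")
```

Under these idealized assumptions at least 34 DRAMs would have to be accessed in parallel; a real design would need more to cover overheads that this estimate ignores.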
To achieve real-time, interactive effects, high-end graphics systems use parallel rendering engines. Three basic parallel strategies have been developed to handle the problems with large frame buffers: (1) pipelining the rendering process over multiple processors; (2) dividing frame buffer memory into groups of memory chips each with its own processor; and (3) combining processing circuitry on the frame buffer memory chips with dense memory circuits. These techniques have improved the processing of graphics systems using large frame buffers, but have also dramatically increased the cost of these systems.
Even with expensive parallel processing techniques, it is very difficult to support sophisticated anti-aliasing techniques. Anti-aliasing refers to processes for reducing artifacts in a rendered image caused by representing continuous surfaces with discrete pixels. In typical frame buffer architectures, pixel values for an entire frame are computed in arbitrary order. Therefore, to perform sophisticated anti-aliasing, pixel data must be generated for the entire frame before anti-aliasing can begin. In a real-time system, there is not enough time to perform anti-aliasing on the pixel data without incurring additional transport delay. Moreover, anti-aliasing requires additional memory to store pixel fragments. Since a frame buffer already includes a large amount of expensive memory, the additional specialized memory needed to support anti-aliasing makes the frame buffer system even more expensive.
Image compression techniques also cannot easily be used during image processing on a graphics system using a frame buffer. The parallel processing techniques used to accelerate processing in a graphics system with a frame buffer create hurdles for incorporating compression techniques. During parallel processing, any portion of the frame buffer can be accessed at random at any instant of time. Most image compression techniques require that image data not change during the compression processing so the image data can be decompressed at a later time.
In frame buffer architectures, the expensive memory and parallel processing hardware are always under-utilized because only a small fraction of the frame buffer memory or parallel processing units is actively being used at any point in time. Thus, even though a frame buffer architecture includes a large amount of expensive memory and processing hardware, this hardware is not fully utilized.
One technique that has been suggested to address the problem of under-utilized, expensive memory in a system with a large frame buffer is to use a virtual buffer of a smaller size. The virtual buffer concept is described generally in Foley, van Dam, Feiner, and Hughes, "Computer Graphics: Principles and Practice", 2nd ed. (Addison-Wesley, 1990). In a virtual buffer system, the display screen is divided into a number of fixed regions of uniform size, and a parallel rasterization buffer the size of a region is used to compute the image one region at a time. Because the regions are rendered separately and processed only once to render an entire scene, geometric primitives of all the graphical objects in the scene must be sorted among the fixed display regions.
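The sorting step described above can be sketched as follows. This is an illustrative approximation and not the method of the cited reference: primitives are assumed to be already transformed to screen space and given as lists of (x, y) vertices, each primitive is assigned to every region its bounding box touches, and the name `sort_primitives_into_regions` is hypothetical.

```python
from collections import defaultdict


def sort_primitives_into_regions(primitives, region_size, screen_w, screen_h):
    """Map (column, row) of each fixed region to indices of overlapping primitives.

    Overlap is tested conservatively with the primitive's bounding box.
    """
    regions = defaultdict(list)
    last_col = (screen_w - 1) // region_size
    last_row = (screen_h - 1) // region_size
    for index, vertices in enumerate(primitives):
        xs = [x for x, _ in vertices]
        ys = [y for _, y in vertices]
        # Clamp the bounding box to the regions that exist on screen.
        col0 = max(0, int(min(xs)) // region_size)
        col1 = min(last_col, int(max(xs)) // region_size)
        row0 = max(0, int(min(ys)) // region_size)
        row1 = min(last_row, int(max(ys)) // region_size)
        for row in range(row0, row1 + 1):
            for col in range(col0, col1 + 1):
                regions[(col, row)].append(index)
    return dict(regions)


if __name__ == "__main__":
    triangles = [
        [(1, 1), (10, 1), (1, 10)],       # fits inside a single region
        [(20, 20), (40, 20), (20, 40)],   # straddles four regions
    ]
    for region, members in sorted(
        sort_primitives_into_regions(triangles, 32, 64, 64).items()
    ):
        print(region, members)
```

Note that the region lists must be complete before any region can be rendered, and that a primitive spanning several regions is listed once per region.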
As noted in Foley, there are a couple of major disadvantages to the virtual buffer approach. First, additional memory is required to sort the primitives in a scene. This can effectively double the memory requirements if primitives transformed to screen space require the same amount of memory as primitives in object coordinates. Another disadvantage is the latency added to the display process. Sorting for one frame must be entirely completed before any primitives for that scene can be rendered.
One of the ways to deal with the disadvantages of the virtual buffer approach is to use parallel processing techniques. Using parallel processing, however, gives rise to many of the problems highlighted above.
Low-cost, high-quality, real-time processing of 3-D graphics images without using a large, expensive frame buffer or parallel processing techniques has been an elusive quest for the last three decades. As is apparent from the issues outlined above, there is a need for an improved architecture capable of generating high-quality images at a much lower cost.