1. Field of the Invention
This invention relates generally to the field of computer graphics and, more particularly, to high performance computer graphics systems.
2. Description of the Related Art
A computer system typically relies upon its graphics system for producing visual output on the computer screen or display device. Early graphics systems were only responsible for taking what the processor produced as output and displaying it on the screen. In essence, they acted as simple translators or interfaces. Modern graphics systems, however, incorporate graphics processors with a great deal of processing power. They now act more like coprocessors than simple translators. This change is due to the recent increase in both the complexity and amount of data being sent to the display device. For example, modern computer displays have many more pixels and greater color depth, and are able to display more complex images with higher refresh rates than earlier models. Similarly, the images displayed are now more complex and may involve advanced techniques such as anti-aliasing and texture mapping.
As a result, without considerable processing power in the graphics system, the CPU would spend a great deal of time performing graphics calculations. This could rob the computer system of the processing power needed for performing other tasks associated with program execution and thereby dramatically reduce overall system performance. With a powerful graphics system, however, when the CPU is instructed to draw a box on the screen, the CPU is freed from having to compute the position and color of each pixel. Instead, the CPU may send a request to the video card to draw a box at specified coordinates. The graphics system then draws the box, freeing the processor to perform other tasks.
Generally, a graphics system in a computer is a type of video adapter that contains its own processor to boost performance levels. These processors are specialized for computing graphical transformations, so they tend to achieve better results than the general-purpose CPU used by the computer system. In addition, they free up the computer's CPU to execute other commands while the graphics system is handling graphics computations. The popularity of graphical applications, and especially multimedia applications, has made high performance graphics systems a common feature of computer systems. Most computer manufacturers now bundle a high performance graphics system with their computers.
Since graphics systems typically perform only a limited set of functions, they may be customized and therefore far more efficient at graphics operations than the computer's general-purpose central processor. While early graphics systems were limited to performing two-dimensional (2D) graphics, their functionality has increased to support three-dimensional (3D) wire-frame graphics, 3D solids, and now 3D graphics with textures and special effects such as advanced shading, fogging, alpha-blending, and specular highlighting.
While the number of pixels is an important factor in determining graphics system performance, another factor of equal import is the quality of the image. For example, an image with a high pixel density may still appear unrealistic if edges within the image are too sharp or jagged (also referred to as “aliased”). One well-known technique to overcome these problems is anti-aliasing. Anti-aliasing involves smoothing the edges of objects by shading pixels along the borders of graphical elements. More specifically, anti-aliasing entails removing higher frequency components from an image before they cause disturbing visual artifacts. For example, anti-aliasing may soften or smooth high contrast edges in an image by forcing certain pixels to intermediate values (e.g., around the silhouette of a bright object superimposed against a dark background).
Another visual effect used to increase the realism of computer images is alpha blending. Alpha blending is a technique that controls the transparency of an object, allowing realistic rendering of translucent surfaces such as water or glass. Another effect used to improve realism is fogging. Fogging obscures an object as it moves away from the viewer. Simple fogging is a special case of alpha blending in which the degree of alpha changes with distance so that the object appears to vanish into a haze as the object moves away from the viewer. This simple fogging may also be referred to as “depth cueing” or atmospheric attenuation, i.e., lowering the contrast of an object so that it appears less prominent as it recedes. More complex types of fogging go beyond a simple linear function to provide more complex relationships between the level of translucence and an object's distance from the viewer. Current state of the art software systems go even further by utilizing atmospheric models to provide low-lying fog with improved realism.
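As an illustration only, the relationship between alpha blending and simple linear fogging described above can be sketched as follows. The function names, the `fog_start` and `fog_end` distances, and the choice of blending toward a constant fog color are assumptions made for this sketch and are not taken from any particular graphics system.

```python
def alpha_blend(src, dst, alpha):
    """Blend color src over color dst.

    alpha = 1.0 means src is fully opaque; alpha = 0.0 means fully transparent.
    """
    return tuple(alpha * s + (1.0 - alpha) * d for s, d in zip(src, dst))


def linear_fog_alpha(distance, fog_start, fog_end):
    """Opacity of an object under simple linear fog (hypothetical parameters).

    The object is fully visible nearer than fog_start and fully hidden
    beyond fog_end; in between, alpha falls off linearly with distance.
    """
    if distance <= fog_start:
        return 1.0
    if distance >= fog_end:
        return 0.0
    return (fog_end - distance) / (fog_end - fog_start)


def fogged_color(obj_color, fog_color, distance, fog_start=10.0, fog_end=50.0):
    """Simple fogging expressed as alpha blending with a distance-dependent alpha."""
    alpha = linear_fog_alpha(distance, fog_start, fog_end)
    return alpha_blend(obj_color, fog_color, alpha)
```

With these assumed defaults, a red object at distance 30 is blended half-and-half with the fog color, and the same object beyond distance 50 takes on the fog color entirely.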
While the techniques listed above may dramatically improve the appearance of computer graphics images, they also have certain limitations. In particular, they may introduce their own aberrations and are typically limited by the density of pixels displayed on the display device.
As a result, a graphics system is desired which is capable of utilizing increased performance levels to increase not only the number of pixels rendered but also the quality of the image rendered. In addition, a graphics system is desired which is capable of utilizing increases in processing power to improve the results of graphics effects such as anti-aliasing.
Prior art graphics systems have generally fallen short of these goals. Prior art graphics systems use a conventional frame buffer for refreshing pixel/video data on the display. The frame buffer stores rows and columns of pixels that exactly correspond to respective row and column locations on the display. Prior art graphics systems render 2D and/or 3D images or objects into the frame buffer in pixel form, and then read the pixels from the frame buffer during a screen refresh to refresh the display. Thus, the frame buffer stores the output pixels that are provided to the display. To reduce visual artifacts that may be created by refreshing the screen at the same time the frame buffer is being updated, most graphics systems' frame buffers are double-buffered.
To obtain more realistic images, some prior art graphics systems have gone further by generating more than one sample per pixel. As used herein, the term “sample” refers to calculated color information that indicates the color, depth (z), transparency, and potentially other information, of a particular point on an object or image. For example, a sample may comprise the following component values: a red value, a green value, a blue value, a z-depth value, and an alpha value (e.g., representing the transparency of the sample). A sample may also comprise other information, e.g., a blur value, an intensity value, or brighter-than-bright information. By calculating more samples than pixels (i.e., super-sampling), a more detailed image is calculated than can be displayed on the display device. For example, a graphics system may calculate four samples for each pixel to be output to the display device. After the samples are calculated, they are then combined or filtered to form the pixels that are stored in the frame buffer and then conveyed to the display device. Using pixels formed in this manner may create a more realistic final image because the filtering process may smooth overly abrupt changes in the image. Details of one type of super-sampling graphics system can be found in co-pending U.S. patent application Ser. No. 09/251,840, filed Feb. 17, 1999, by Michael F. Deering entitled “A Graphics System With A Variable-Resolution Sample Buffer,” which is incorporated by reference in its entirety.
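The four-samples-per-pixel example above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the samples lie on a regular grid twice as dense as the pixel grid in each direction and that they are combined by an unweighted average (a box filter); the function name and data layout are hypothetical.

```python
def box_filter_downsample(samples, factor=2):
    """Average each factor x factor block of samples into one output pixel.

    samples is a 2D list (rows of (r, g, b) tuples) whose height and width
    are multiples of factor. With factor=2, four samples form each pixel.
    """
    height = len(samples) // factor
    width = len(samples[0]) // factor
    count = factor * factor
    pixels = []
    for py in range(height):
        row = []
        for px in range(width):
            # Gather the block of samples that covers this pixel.
            block = [
                samples[py * factor + dy][px * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            # Average each color channel over the block.
            row.append(tuple(sum(s[c] for s in block) / count for c in range(3)))
        pixels.append(row)
    return pixels
```

A pixel whose four samples straddle a black-to-white edge comes out as an intermediate gray, which is the smoothing of overly abrupt changes referred to above.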
Super-sampling has been used for the last decade as a method to blend the information contained in many samples clustered about a pixel location to achieve a more visually acceptable rendering of the original objects. In early versions of super-sampling, samples were processed off-line for a single frame and then reassembled in sequence later for real time viewing. Later versions of super-sampling process a pixel's worth of samples in-line to calculate new data, read the old data from the frame buffer for the pixel, compare old and new data to determine if a blend or replacement is required, and then write the updated data back to the frame buffer. This multi-step process involves many read and write operations to the frame buffer. The time required for the process is therefore related to the clock speed of the frame buffer. Many sample points may be included in more than one pixel's sample region, thus further contributing to the inefficiency of this process. To meet the demands for more realistic graphic displays (more filtering of more samples) and increased resolution (more pixels), a faster and more efficient method of super-sampling is needed.
The problems set forth above may at least in part be solved by a high-speed graphics system that utilizes, in one embodiment, a graphics processor, a sample buffer, a sample cache (for storing a selected sub-set of the sample buffer), a sample-to-pixel calculation (filtering) unit, and a frame buffer (for storing the calculated pixel values).
The graphics processor generates a plurality of samples and stores them into the sample buffer. In one embodiment, the graphics processor typically generates and stores a total number of samples far greater than the number of pixel locations on the display.
The sample-to-pixel calculation unit is configured in one embodiment to read the sub-set of samples from the sample buffer, store them in the sample cache, and filter or convolve the samples into a respective output pixel. The output pixel is then stored in the frame buffer and used to refresh a display. Note that, as used herein, the terms “filter” and “convolve” are used interchangeably and refer to mathematically manipulating one or more samples to generate a pixel (e.g., by averaging, by applying a convolution function, by summing, by applying a filtering function, by weighting the samples and then manipulating them, by applying a randomized function, or by combinations of these and other contemplated examples). The sample-to-pixel calculation unit selects one or more samples and filters them to generate an output pixel. Note that the number of samples selected and/or filtered by the sample-to-pixel calculation unit may be one or, in another embodiment, greater than one.
In some embodiments, the number of samples used to form each pixel may vary. For example, the underlying average sample density in the sample buffer may vary, the extent of the filter may vary, or the number of samples for a particular pixel may vary due to stochastic variations in the sample density. In some embodiments the number may vary on a per-pixel basis, on a per-scan line basis, on a per-region basis, on a per-frame basis, or the number may remain constant.
In some embodiments, the graphics processor is further configurable to vary the positioning of the samples generated. For example, the samples may be positioned according to a regular grid, a perturbed regular grid, or in regions of higher or lower sample density. In one embodiment, the sample positions may be stored as offsets rather than absolute addresses or coordinates. In one embodiment, the graphics processor is operable to programmatically configure or vary the sample positions on a frame-by-frame basis.
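One possible way to picture a perturbed regular grid stored as offsets is sketched below. It assumes, for illustration only, that offsets are fractions of a unit-sized region and are measured from that region's origin; the text above does not specify what the offsets are relative to, and the function names, the `jitter` parameter, and the use of a seeded pseudo-random generator are all hypothetical.

```python
import random


def perturbed_grid_offsets(samples_per_side, jitter=0.5, seed=0):
    """Return (dx, dy) sample offsets within a unit square region.

    Each sample starts at the center of its cell on a regular grid and is
    displaced by up to jitter * (half a cell) in x and y. jitter=0.0 yields
    the unperturbed regular grid; jitter=1.0 lets a sample land anywhere
    in its own cell.
    """
    rng = random.Random(seed)  # seeded so the pattern is reproducible
    cell = 1.0 / samples_per_side
    offsets = []
    for j in range(samples_per_side):
        for i in range(samples_per_side):
            dx = (i + 0.5) * cell + rng.uniform(-0.5, 0.5) * jitter * cell
            dy = (j + 0.5) * cell + rng.uniform(-0.5, 0.5) * jitter * cell
            offsets.append((dx, dy))
    return offsets


def absolute_positions(origin_x, origin_y, offsets):
    """Convert stored offsets to absolute coordinates for one region."""
    return [(origin_x + dx, origin_y + dy) for dx, dy in offsets]
```

Under these assumptions, one small table of offsets can be reused for every region of the image, and changing the seed between frames gives a different sample pattern per frame.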
A software program embodied on a computer medium and a method for operating a graphics subsystem are also contemplated. In one embodiment, the method comprises first calculating a plurality of sample locations and corresponding sample values (color, transparency, and others). The samples may then be stored into a sample buffer. The sample locations may be specified according to any number of positioning or spacing schemes, e.g., a regular grid, a perturbed regular grid, or a stochastic grid. Subsets of the stored samples may then be selected and filtered to form output pixels, which are stored in a traditional frame buffer. The samples may be selected according to their distance from the center of the convolution kernel (which may correspond to the estimated center of the output pixel). The selected samples may be multiplied by a weighting factor and summed. The output pixel is also normalized (e.g., through the use of pre-normalized weighting factors that are looked up, or by dividing the summed sample values by a calculated or pre-calculated normalization factor). In some embodiments, the selection process, weighting process, and normalization process are each programmable and changeable for each particular frame or window.
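The select, weight, sum, and normalize steps of the method described above can be sketched as follows. This is a minimal illustration under stated assumptions: the circular selection region, the tent-shaped weighting function, and all names are hypothetical, and normalization is done here by dividing by the sum of the weights rather than by looking up pre-normalized factors.

```python
import math


def tent_weight(distance, radius):
    """Hypothetical weighting function: 1.0 at the kernel center, 0.0 at the edge."""
    return 1.0 - distance / radius


def filter_pixel(samples, center, radius, weight_fn=tent_weight):
    """Filter samples near center into one normalized output color.

    samples is a list of ((x, y), (r, g, b)) pairs. Samples farther than
    radius from the kernel center are not selected. Returns None if no
    selected sample carries any weight.
    """
    cx, cy = center
    total = [0.0, 0.0, 0.0]
    weight_sum = 0.0
    for (x, y), color in samples:
        distance = math.hypot(x - cx, y - cy)
        if distance > radius:
            continue  # selection step: outside the filter extent
        w = weight_fn(distance, radius)
        weight_sum += w
        for c in range(3):
            total[c] += w * color[c]  # weighting and summing step
    if weight_sum == 0.0:
        return None
    # Normalization step: divide by the accumulated weight.
    return tuple(t / weight_sum for t in total)
```

Because the weighting function is passed in as a parameter, a different filter could be substituted per frame or per window without changing the surrounding loop.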
An increase in speed in some embodiments of the computer graphics system may be achieved in part by use of the sample cache for temporary storage of the selected sub-set of the sample buffer. In some embodiments, a faster clock rate may also be used by the sample cache and by the sample-to-pixel calculation unit. In some embodiments, the time required to access data in the sample cache may be approximately ⅙ the time required to access data from the sample buffer. The use of a sample cache may also reduce the number of reads required from the sample buffer in some embodiments. The sample cache may also allow samples to be reused in the calculations for more than one pixel without additional sample buffer reads. The sample cache memory may also be configured to allow the replacement of samples no longer needed with new samples from the sample buffer while pixel values are being calculated.
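The potential reduction in sample buffer reads can be illustrated with a toy model. It assumes a single row of pixels in which each pixel's filter covers its own group of samples plus the neighboring groups on either side, and a cache that keeps a group until it falls outside the current filter window. The model, its names, and its eviction rule are assumptions for illustration and do not describe the actual cache organization.

```python
def count_buffer_reads(num_pixels, filter_radius=1, use_cache=True):
    """Count sample-group reads from the sample buffer for one row of pixels.

    Each pixel needs the sample groups in [px - filter_radius, px + filter_radius],
    clipped to the row. With the cache enabled, a group already held in the
    cache is reused instead of being read again.
    """
    cache = set()
    reads = 0
    for px in range(num_pixels):
        lo = max(0, px - filter_radius)
        hi = min(num_pixels - 1, px + filter_radius)
        if use_cache:
            # Discard groups no longer needed, making room for new ones.
            cache = {group for group in cache if group >= lo}
        for group in range(lo, hi + 1):
            if use_cache and group in cache:
                continue  # reuse without another sample buffer read
            reads += 1
            if use_cache:
                cache.add(group)
    return reads
```

In this model, an 8-pixel row with overlapping three-group filters needs 22 reads without the cache and 8 with it, since each group is fetched only once.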