1. Field of the Invention
This invention relates generally to the field of computer graphics and, more particularly, to high performance graphics systems.
2. Description of the Related Art
A computer system typically relies upon its graphics system for producing visual output on a computer screen or display device. Early graphics systems were only responsible for taking what the processor produced as output and displaying it on the screen. In essence, they acted as simple translators or interfaces. Modern graphics systems, however, incorporate graphics processors with a great deal of processing power. The graphics systems now act more like coprocessors than simple translators. This change is due to the recent increase in both the complexity and amount of data being sent to the display device. For example, modern computer displays have many more pixels, greater color depth, and are able to display images with higher refresh rates than earlier models. Similarly, the images displayed are now more complex and may involve advanced rendering and visual techniques such as anti-aliasing and texture mapping.
As a result, without considerable processing power in the graphics system, the computer system's CPU would spend a great deal of time performing graphics calculations. This could rob the computer system of the processing power needed for performing other tasks associated with program execution and thereby dramatically reduce overall system performance. With a powerful graphics system, however, when the CPU is instructed to draw a box on the screen, the CPU is freed from having to compute the position and color of each pixel. Instead, the CPU may send a request to the video card stating: "draw a box at these coordinates". The graphics system then draws the box, freeing the CPU to perform other tasks.
Generally, a graphics system in a computer is a type of video adapter that contains its own processor to boost performance levels. These processors are specialized for computing graphical transformations, so they tend to achieve better results than the general-purpose CPU used by the computer system. In addition, they free up the computer's CPU to execute other commands while the graphics system is handling graphics computations. The popularity of graphical applications, and especially multimedia applications, has made high performance graphics systems a common feature of computer systems. Most computer manufacturers now bundle a high performance graphics system with their systems.
Since graphics systems typically perform only a limited set of functions, they may be customized and therefore far more efficient at graphics operations than the computer's general-purpose microprocessor. While early graphics systems were limited to performing two-dimensional (2D) graphics, their functionality has increased to support three-dimensional (3D) wire-frame graphics, 3D solids, and now includes support for textures and special effects such as advanced shading, fogging, alpha-blending, and specular highlighting.
The rendering ability of 3D graphics systems has been improving at a breakneck pace. A few years ago, shaded images of simple objects could only be rendered at a few frames per second, but today's systems support rendering of complex objects at 60 Hz or higher. At this rate of increase, in the not too distant future, graphics systems will literally be able to render more pixels in "real-time" than a single human's visual system can perceive. While this extra performance may be usable in multiple-viewer environments, it may be wasted in the more common single-viewer environments. Thus, a graphics system is desired which is capable of utilizing the increased graphics processing power to generate images that are more realistic.
While the number of pixels and the frame rate are important in determining graphics system performance, another factor of equal or greater importance is the visual quality of the image generated. For example, an image with a high pixel density may still appear unrealistic if edges within the image are too sharp or jagged (also referred to as "aliased"). One well-known technique to overcome these problems is anti-aliasing. Anti-aliasing involves smoothing the edges of objects by shading pixels along the borders of graphical elements. More specifically, anti-aliasing entails removing higher frequency components from an image before they cause disturbing visual artifacts. For example, anti-aliasing may soften or smooth high contrast edges in an image by forcing certain pixels to intermediate values (e.g., around the silhouette of a bright object superimposed against a dark background).
Another visual effect used to increase the realism of computer images is alpha blending. Alpha blending is a technique that controls the transparency of an object, allowing realistic rendering of translucent surfaces such as water or glass. Another effect used to improve realism is fogging. Fogging obscures an object as it moves away from the viewer. Simple fogging is a special case of alpha blending in which the degree of alpha changes with distance so that the object appears to vanish into a haze as the object moves away from the viewer. This simple fogging may also be referred to as "depth cueing" or atmospheric attenuation, i.e., lowering the contrast of an object so that it appears less prominent as it recedes. More complex types of fogging go beyond a simple linear function to provide more complex relationships between the level of translucence and an object's distance from the viewer. Current state of the art software systems go even further by utilizing atmospheric models to provide low-lying fog with improved realism.
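The relationship between simple fogging and alpha blending described above may be illustrated with a minimal sketch. The function names, the fog_start and fog_end parameters, and the linear ramp are assumptions chosen for illustration; they are not taken from any particular graphics system.

```python
def alpha_blend(src, dst, alpha):
    """Blend src over dst; alpha = 1.0 leaves src fully opaque."""
    return tuple(alpha * s + (1.0 - alpha) * d for s, d in zip(src, dst))


def linear_fog_alpha(distance, fog_start, fog_end):
    """Opacity of the object: 1.0 at fog_start, falling linearly to 0.0 at fog_end.

    fog_start and fog_end are hypothetical parameters for this sketch.
    """
    if fog_end <= fog_start:
        raise ValueError("fog_end must be greater than fog_start")
    t = (fog_end - distance) / (fog_end - fog_start)
    return max(0.0, min(1.0, t))  # clamp outside the fog range


def fog_color(obj_rgb, fog_rgb, distance, fog_start=10.0, fog_end=50.0):
    """Simple fogging expressed as alpha blending with a distance-dependent alpha."""
    alpha = linear_fog_alpha(distance, fog_start, fog_end)
    return alpha_blend(obj_rgb, fog_rgb, alpha)


if __name__ == "__main__":
    red, haze = (1.0, 0.0, 0.0), (0.5, 0.5, 0.5)
    for d in (10.0, 30.0, 50.0):
        print(d, fog_color(red, haze, d))
```

A non-linear fog model would replace only linear_fog_alpha; the blending step would remain the same.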
While the techniques listed above may dramatically improve the appearance of computer graphics images, they also have certain limitations. In particular, they may introduce their own aberrations and are typically limited by the density of pixels displayed on the display device.
As a result, a graphics system is desired which is capable of utilizing increased performance levels to increase not only the number of pixels rendered but also the quality of the image rendered. In addition, a graphics system is desired which is capable of utilizing increases in processing power to improve the results of graphics effects such as anti-aliasing.
Prior art graphics systems have generally fallen short of these goals. Prior art graphics systems use a conventional frame buffer for refreshing pixel/video data on the display. The frame buffer stores rows and columns of pixels that correspond to respective row and column locations on the display. Prior art graphics systems render 2D and/or 3D images or objects into the frame buffer in pixel form, and then read the pixels from the frame buffer during a screen refresh to refresh the display. Thus, the frame buffer stores the output pixels that are provided to the display. To reduce visual artifacts that may be created by refreshing the screen at the same time the frame buffer is being updated, most graphics systems' frame buffers are double-buffered.
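The double-buffering arrangement mentioned above may be sketched as follows. The class and method names are hypothetical and the model is deliberately simplified: rendering writes only to the back buffer, refresh reads only from the front buffer, and a swap exchanges their roles.

```python
class DoubleBufferedFrameBuffer:
    """Simplified model of a double-buffered frame buffer (illustrative only)."""

    def __init__(self, width, height, fill=(0, 0, 0)):
        self.width, self.height = width, height
        # Two independent pixel arrays, indexed as buffer[row][column].
        self._buffers = [
            [[fill] * width for _ in range(height)] for _ in range(2)
        ]
        self._front = 0  # index of the buffer currently used for refresh

    def write_pixel(self, x, y, color):
        """Rendering updates the back buffer, never the one being displayed."""
        self._buffers[1 - self._front][y][x] = color

    def read_for_refresh(self):
        """Screen refresh reads the front buffer."""
        return self._buffers[self._front]

    def swap(self):
        """Exchange front and back once a frame has been fully rendered."""
        self._front = 1 - self._front


if __name__ == "__main__":
    fb = DoubleBufferedFrameBuffer(2, 2)
    fb.write_pixel(0, 0, (255, 0, 0))
    print(fb.read_for_refresh()[0][0])  # still the old pixel
    fb.swap()
    print(fb.read_for_refresh()[0][0])  # the newly rendered pixel
```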
To obtain images that are more realistic, some prior art graphics systems have gone further by generating more than one sample per pixel. As used herein, the term "sample" refers to calculated color information that indicates the color, depth (z), and potentially other information, of a particular point on an object or image. For example, a sample may comprise the following component values: a red value, a green value, a blue value, a z value, and an alpha value (e.g., representing the transparency of the sample). A sample may also comprise other information, e.g., a z-depth value, a blur value, an intensity value, and an indicator that the sample consists partially or completely of control information rather than color information (i.e., "sample control information"). By calculating more samples than pixels (i.e., super-sampling), a more detailed image is calculated than can be displayed on the display device. For example, a graphics system may calculate four samples for each pixel to be output to the display device. After the samples are calculated, they may then be combined or filtered to form the pixels that are stored in the frame buffer and then conveyed to the display device. Using pixels formed in this manner may create a more realistic final image because overly abrupt changes in the image may be smoothed by the filtering process.
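The four-samples-per-pixel example above may be illustrated with a minimal sketch. The text says only that samples are "combined or filtered"; the equal-weight box filter used here is an assumption, as are the function name and the representation of a sample as an (r, g, b) tuple.

```python
def box_filter(samples, factor=2):
    """Average each factor x factor block of samples into one output pixel.

    samples is a list of rows, each row a list of (r, g, b) tuples.
    The box filter is an assumed choice; other filters are possible.
    """
    height = len(samples) // factor
    width = len(samples[0]) // factor
    count = factor * factor
    pixels = []
    for py in range(height):
        row = []
        for px in range(width):
            # Gather the samples that fall inside this pixel.
            block = [
                samples[py * factor + dy][px * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            # Average each color component independently.
            row.append(tuple(
                sum(s[c] for s in block) / count
                for c in range(len(block[0]))
            ))
        pixels.append(row)
    return pixels


if __name__ == "__main__":
    white, black = (1.0, 1.0, 1.0), (0.0, 0.0, 0.0)
    # A hard horizontal edge passing through a single pixel (2 x 2 samples).
    print(box_filter([[white, white], [black, black]]))
```

In this example the edge pixel receives an intermediate gray value, which is the smoothing effect attributed to the filtering step.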
These prior art super-sampling systems typically generate a number of samples that is far greater than the number of pixel locations on the display. These prior art systems typically have rendering processors that calculate the samples and store them into a render buffer. Filtering hardware then reads the samples from the render buffer, filters the samples to create pixels, and then stores the pixels in a traditional frame buffer that is used to refresh the display device. The traditional frame buffer is typically double-buffered, with one side being used for refreshing the display device while the other side is updated by the filtering hardware. These systems, however, have generally suffered from limitations imposed by the conventional frame buffer and by the added latency caused by the render buffer and filtering. Therefore, an improved graphics system is desired which includes the benefits of pixel super-sampling while avoiding the drawbacks of the conventional frame buffer.
Memory devices are reaching a level of complexity where they may be programmed to operate on input data and/or output data in a programmably determined fashion. Exemplary of such memory devices is the 3DRAM family of devices manufactured by Mitsubishi Electric Corporation. Because of their flexibility, graphics designers are encouraged to incorporate them into graphics systems. Separate processes and/or hardware devices writing to the memory devices or reading from the memory devices may require different types of behavior from the memory devices. Thus, before reading from or writing to such a memory device, an input processor or output processor may need to reprogram the memory context (the set of state registers internal to the memory device that determine the memory device's behavior). This context switch incurs a nontrivial time delay. Thus, there exists a need for a graphics system and method which can control the context switching for one or more input processes and/or output processes.
In one set of embodiments, a graphics system may comprise a programmable sample buffer and a sample buffer interface. The sample buffer interface may receive and buffer N streams of samples in N corresponding input buffers, where N is an integer greater than or equal to two. The sample buffer interface may include a context memory which stores N sets of context values corresponding to the N input buffers respectively. The sample buffer interface may be configured to (1) terminate transfer of samples from a first of the input buffers to the programmable sample buffer, (2) selectively update a subset of state registers in the programmable sample buffer with context values corresponding to a next input buffer of the input buffers, and (3) initiate transfer of samples from the next input buffer to the programmable sample buffer. The context values stored in the state registers of the programmable sample buffer determine the operation of an arithmetic logic unit internal to the programmable sample buffer on sample data.
In another set of embodiments, a method for controlling the flow of multiple streams of data to a programmable memory (e.g. a sample buffer) may be arranged as follows. The programmable memory may include a memory array, an arithmetic logic unit and a set of state registers. The arithmetic logic unit may operate on the input data (i.e. data transferred to the programmable memory from an external source) and data previously stored in the memory array based on the contents of the state registers. The output of the arithmetic logic unit may be stored in the memory array. The programmable memory may be configured to bypass the arithmetic logic unit. Thus, input data may be written directly to the memory array without modification.
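The programmable memory described above may be modeled with a minimal sketch. The register names ("alu_op", "bypass") and the set of ALU operations are hypothetical, chosen only to show how state registers may select between combining input data with stored data and writing input data directly.

```python
# Hypothetical ALU operations; each takes (input_value, stored_value).
ALU_OPS = {
    "replace": lambda new, old: new,
    "add": lambda new, old: new + old,
    "min": lambda new, old: min(new, old),
    "max": lambda new, old: max(new, old),
}


class ProgrammableMemory:
    """Illustrative model: memory array + ALU + state registers."""

    def __init__(self, size):
        self.array = [0] * size
        # State registers determine how writes behave (names are assumptions).
        self.state = {"alu_op": "replace", "bypass": False}

    def write(self, addr, value):
        if self.state["bypass"]:
            # Bypass the ALU: input data is stored without modification.
            self.array[addr] = value
            return
        # The ALU combines input data with previously stored data,
        # and its output is stored back into the memory array.
        op = ALU_OPS[self.state["alu_op"]]
        self.array[addr] = op(value, self.array[addr])


if __name__ == "__main__":
    mem = ProgrammableMemory(4)
    mem.state["alu_op"] = "add"
    mem.write(0, 5)
    mem.write(0, 3)
    print(mem.array[0])  # accumulated through the ALU
    mem.state["bypass"] = True
    mem.write(0, 1)
    print(mem.array[0])  # written directly
```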
An interface unit (e.g. the sample buffer interface) may buffer N streams of sample data in N corresponding input buffers, where N is an integer greater than or equal to two. Upon terminating the transfer of samples from a current one of the input buffers to the programmable memory, the interface unit may selectively update a subset of the state registers in the programmable memory with context values corresponding to a next input buffer of the input buffers. In some cases, the subset of state registers to be updated may be an empty subset if there are no state registers that need to be updated, i.e. if the set of context values for the current input buffer and the set of context values for the next input buffer are identical. After updating the subset of state registers, the interface unit may initiate transfer of samples from the next input buffer to the programmable memory.
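The selective context update described above may be sketched as follows. This is a simplified software model under stated assumptions: the class name, the register names, and the representation of a context as a dictionary are hypothetical, and all contexts are assumed to define the same set of registers. Only registers whose values differ from the next context are rewritten, so the updated subset is empty when the two contexts are identical.

```python
from collections import deque


class SampleBufferInterface:
    """Illustrative model of an interface unit with N input buffers."""

    def __init__(self, contexts):
        if len(contexts) < 2:
            raise ValueError("N must be greater than or equal to two")
        # Context memory: one set of context values per input buffer.
        self.context_memory = [dict(c) for c in contexts]
        self.input_buffers = [deque() for _ in contexts]
        # Models the state registers inside the programmable memory.
        self.state_registers = dict(contexts[0])
        self.current = 0
        self.register_writes = 0  # counts register updates actually issued
        self.stored = []          # samples delivered to the memory

    def push(self, stream, sample):
        self.input_buffers[stream].append(sample)

    def switch_to(self, nxt):
        """Update only the registers that differ, then select the next buffer."""
        target = self.context_memory[nxt]
        changed = {
            name: value
            for name, value in target.items()
            if self.state_registers.get(name) != value
        }
        self.state_registers.update(changed)
        self.register_writes += len(changed)
        self.current = nxt
        return set(changed)

    def transfer(self, max_samples):
        """Move up to max_samples from the current input buffer to the memory."""
        buf = self.input_buffers[self.current]
        moved = 0
        while buf and moved < max_samples:
            self.stored.append((self.current, buf.popleft()))
            moved += 1
        return moved


if __name__ == "__main__":
    iface = SampleBufferInterface([
        {"alu_op": "add", "mask": 0xFF},
        {"alu_op": "add", "mask": 0x0F},
        {"alu_op": "add", "mask": 0x0F},
    ])
    iface.push(1, "s0")
    print(iface.switch_to(1))  # only the differing register is rewritten
    print(iface.transfer(8))
    print(iface.switch_to(2))  # identical context: empty subset
```

Comparing against the current register contents rather than rewriting every register is one way to reduce the context-switch delay; how an actual interface unit tracks the differences is not specified here.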