1. The Field of the Invention
This invention relates generally to improving performance of a graphics system. Specifically, the cost of a pixel processor portion of a computer graphics system can be reduced by taking advantage of areal coherence to thereby reduce the amount of information that must be stored in each pixel. Merging is also utilized to ensure that pixel storage never grows beyond a set limit.
2. The State of the Art
The state of the art in pixel processing systems has progressed from an early state when a z-buffer engine stored a single polygon for each pixel. As each new polygon was processed, the engine first determined whether the old or the new polygon was closer to an observer's perspective. The polygon that was farther away from the observer was discarded, and the nearer polygon was saved for rendering. In addition, edges and interpenetrations were quantized to whole pixels.
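By way of illustration only, the single-polygon depth test described above can be sketched as follows. The function name, the list-of-lists buffers, and the convention that a smaller z value is nearer to the observer are assumptions made for this sketch.

```python
INF = float("inf")

def zbuffer_write(depth, color, x, y, new_z, new_color):
    """Keep whichever surface is nearer at pixel (x, y).

    Assumption: a smaller z value means nearer to the observer.
    Returns True if the new polygon replaced the stored one.
    """
    if new_z < depth[y][x]:
        depth[y][x] = new_z
        color[y][x] = new_color
        return True
    return False  # the new polygon is farther away and is discarded

# A 2x2 frame buffer holding one depth and one color per pixel.
depth = [[INF] * 2 for _ in range(2)]
color = [[(0, 0, 0)] * 2 for _ in range(2)]

zbuffer_write(depth, color, 0, 0, 5.0, (255, 0, 0))  # nearer, stored
zbuffer_write(depth, color, 0, 0, 9.0, (0, 255, 0))  # farther, discarded
```

Because the test is made once per whole pixel, a polygon either owns the pixel entirely or not at all, which is the source of the quantization noted above.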
The resulting images exhibited several anomalies. These anomalies included stair-cases, crawling, and edge scintillation. Small polygons were even displayed intermittently, while thin polygons were often broken into disjointed segments.
In an effort to improve rendering using the z-buffer engine, modifications were made to the z-buffering process. For example, a multi-sample z-buffer engine was implemented so as to divide each pixel into several sub-pixels, each sub-pixel having an associated z depth and color triple. As each polygon was processed, the sub-pixel samples that were "won" were loaded with the polygon's color (which is constant across the pixel), and with its individual z depths that are unique to each sub-pixel sample.
Occultation of pixels was then decided at the sub-pixel level by comparing new and stored z depths. The winning sub-pixel was then saved with its associated color. A display video was then determined by averaging the color values for the sub-pixels within each pixel.
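A minimal sketch of this multi-sample scheme is given below for illustration. The class name, the use of None to mark a sub-pixel sample that a polygon does not cover, and the smaller-z-is-nearer convention are assumptions of the sketch.

```python
class MultisamplePixel:
    """One pixel divided into n sub-pixel samples, each with a depth and a color."""

    def __init__(self, n_samples=4, background=(0, 0, 0)):
        self.z = [float("inf")] * n_samples
        self.rgb = [background] * n_samples

    def add_fragment(self, color, sample_depths):
        """sample_depths[i] is the polygon depth at sample i, or None if uncovered."""
        for i, z in enumerate(sample_depths):
            if z is not None and z < self.z[i]:  # this sample is "won"
                self.z[i] = z
                self.rgb[i] = color  # color is constant across the pixel

    def resolve(self):
        """Display color is the average of the sub-pixel sample colors."""
        n = len(self.rgb)
        return tuple(sum(c[k] for c in self.rgb) / n for k in range(3))

pixel = MultisamplePixel(4)
pixel.add_fragment((255, 0, 0), [1.0, 1.0, None, None])  # red covers two samples
pixel.add_fragment((0, 0, 255), [2.0, 2.0, 2.0, 2.0])    # blue lies behind red
print(pixel.resolve())  # (127.5, 0.0, 127.5)
```

Each sample stores its own depth and color, so the storage per pixel scales directly with the number of samples.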
Multi-sampled z-buffer engines provided improved image quality. Disadvantageously, however, the improved image quality came at the expense of greatly increased pixel data storage. The pixel data storage was particularly expensive because the frame buffer for the storage must be implemented in comparatively slow DRAM memory, which becomes a system throughput bottleneck.
A simple calculation can easily demonstrate the typical memory requirements. A minimal data record for a sub-pixel sample might be 8 bits each of red, green and blue color data, and 16 bits of depth data. A system utilizing 4 sub-pixel multisampling would therefore require 4*40 bits for the samples, plus 24 bits for the display video (which would be 48 bits if using double-buffering), plus allowance for overlay/underlay planes and some record keeping.
An even more robust approach that would be suitable for flight simulation would require 12 bit color components and 32 bits of depth. This might result in approximately 400 bits of data per pixel when fully implemented.
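The arithmetic behind these estimates can be reproduced with the short sketch below. The function name and parameters are illustrative; overlay/underlay planes and record keeping are left out, which accounts for the gap between the computed 344 bits and the approximate figure of 400 bits.

```python
def bits_per_pixel(samples, color_bits, depth_bits, double_buffered=True):
    """Sample storage plus display video, excluding overlay planes and bookkeeping."""
    sample_bits = 3 * color_bits + depth_bits  # red, green, blue and depth
    video_bits = 3 * color_bits * (2 if double_buffered else 1)
    return samples * sample_bits + video_bits

print(bits_per_pixel(4, 8, 16, double_buffered=False))  # 184 = 4*40 + 24
print(bits_per_pixel(4, 8, 16))                         # 208 = 4*40 + 48
print(bits_per_pixel(4, 12, 32))                        # 344 = 4*68 + 72
```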
These examples illustrate that a substantial quantity of memory is required using the four sub-pixel multisample z-buffer engine. However, four sub-pixels provide only marginal image quality. This is typically suitable for low-end applications and games. Higher-end systems can employ 16 sub-pixel multisampling, thus requiring about 2000 bits of memory for storing data.
A recent architectural innovation, the adaptive multisampler, stores fragment information as polygon records within a pixel, rather than sub-samples. This approach cuts in half the required memory space and greatly improves the handling of transparent polygons. However, this system requires a heap storage mechanism. Consequently, there is a non-deterministic amount of frame-buffer read/write activity.
It would be an advantage over the state of the art to employ fragment merging to thereby avoid the need for a heap mechanism. This would provide the advantage of making frame-buffer read/write activity deterministic so that it can be fully parallelized.
When examining the state of the art, it is also useful to examine a slightly different approach to pixel processing. This alternative method utilizes an A-buffer. The A-buffer is a software-only rendering method. Briefly, it solves pixels by accumulating polygon fragment data, sorting by scene depth, merging "like" fragments, and weighting final pixel color by visible polygon fragment areas. This is similar in concept to the ultimate goals of improved pixel processing. However, the A-buffer's implementation has many disadvantages.
First, the A-buffer clips polygons to pixel boundaries and determines an area for each polygon/pixel fragment. This process is therefore computationally intensive. For example, the A-buffer uses both an area and a bit mask. This can also lead to subtle inconsistencies in overall pixel treatment. Furthermore, the A-buffer cannot accommodate overlapped/outrigger bit-mask strategies that fit more naturally into other strategies.
The A-buffer also utilizes a complex linked-list data structure to track polygon fragments within a pixel. The list must be traced to its conclusion, so processing is highly variable. Furthermore, the linked list can be randomly distributed throughout the memory address space, further hindering cache coherency and slowing memory access. It would be an advantage to utilize a data block of constant size, so that memory accesses are localized and deterministic.
The A-buffer also defers hidden surface removal until a final shade-resolve step. Accordingly, it must deal with many polygon fragments that may not ultimately contribute to the pixel color. It would be another advantage to erase sub-pixel portions of fragments as they are occulted by incoming new fragments. This would make the associated memory available for re-use as early as possible in the rendering process.
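The idea of erasing occulted sub-pixel portions as new fragments arrive can be illustrated with coverage bit masks, as in the simplified sketch below. The tuple layout (mask, z, color), the single depth per fragment, the opaque-only fragments, and the smaller-z-is-nearer convention are all assumptions of the sketch. Stored fragments are assumed to have disjoint masks, since each holds only its visible samples.

```python
def insert_fragment(fragments, new_mask, new_z, new_color):
    """Add an opaque fragment and erase the sub-pixel samples it occults.

    fragments: list of (mask, z, color) with pairwise disjoint masks.
    Returns a new list holding only fragments that still own visible samples.
    """
    survivors = []
    for mask, z, color in fragments:
        if new_z < z:
            mask &= ~new_mask   # stored samples now hidden by the new fragment
        else:
            new_mask &= ~mask   # new fragment is hidden where the stored one is nearer
        if mask:                # an empty mask frees the record immediately
            survivors.append((mask, z, color))
    if new_mask:
        survivors.append((new_mask, new_z, new_color))
    return survivors

frags = [(0b1111, 5.0, "far")]
frags = insert_fragment(frags, 0b0011, 2.0, "near")
print(frags)  # [(12, 5.0, 'far'), (3, 2.0, 'near')]
frags = insert_fragment(frags, 0b1111, 1.0, "front")
print(frags)  # [(15, 1.0, 'front')]
```

In the second call both earlier fragments lose all of their samples, so their storage is released at once rather than at a final resolve step.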
The A-buffer also sorts fragments by front-most Z, and contains no information about the orientation of the fragment. It would be another advantage to store Z-slope information that enables full reconstruction of the fragment geometry. This method would lend itself well to multiple-pixel rendering areas without degradation.
The A-buffer must resort to an intersect/merge process when fragments are close in Z, even when they don't actually intersect. Therefore, it would be another advantage to increase sub-pixel occultation to a higher resolution. This would prevent hidden surfaces from "bleeding through" as occurs with the A-buffer where many polygons meet at a vertex.
The A-buffer also requires a significant amount of memory space. For example, the A-buffer needs 64 bits for a simple surrounder or for the first list pointer, and 192 bits for each additional fragment linked to a pixel. Accordingly, for an average depth complexity of 4, the A-buffer requires 832 bits per pixel, plus the final RGB (red, green and blue) value. In contrast, it would be an advantage to reduce memory requirements to approximately 256 bits per pixel for the same depth complexity.
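The 832-bit figure follows from the stated record sizes, as the short sketch below shows. The function name is illustrative, and the final RGB value is excluded as in the text.

```python
def a_buffer_bits(linked_fragments, head_bits=64, fragment_bits=192):
    """Bits per pixel for an A-buffer pixel with the given number of linked fragments."""
    return head_bits + linked_fragments * fragment_bits

bits = a_buffer_bits(4)
print(bits)        # 832 = 64 + 4*192
print(bits / 256)  # 3.25, the ratio to the 256-bit target
```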
It would therefore be an improvement over the state of the art to take advantage of the improved image quality of the adaptive multisampler approach, while taking advantage of areal coherence to thereby reduce memory requirements.
It is an object of the present invention to provide a method for improved pixel processing by utilizing a span-based multisample z-buffer engine.
It is another object to utilize a span-based multisample z-buffer engine to thereby enable frame-buffer read/write activity to be deterministic.
It is another object to utilize a span-based multisample z-buffer engine which can be fully parallelized.
It is another object to utilize a span-based multisample z-buffer engine while taking advantage of areal coherence to thereby reduce memory requirements of the pixel processor.
It is another object to utilize fragment merging to thereby avoid the need for a heap mechanism.
The presently preferred embodiment of the present invention is a method for creating a span-based multisample Z-buffer pixel processor in a computer graphics system to thereby reduce a quantity of data that must be stored for each pixel in a frame buffer thereof. By taking advantage of areal coherence, the quantity of data that must be stored in each pixel is reduced. By employing merging, the method is also able to ensure that pixel storage requirements do not grow beyond a predetermined limit.
In a first aspect of the invention, each address within frame buffer memory is defined to correspond to a group of four pixels that are arranged as a contiguous 2×2 array. Data within each span record is associated with individual polygon fragments that are visible in one or more of the four pixels in a group.
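One possible mapping from a pixel coordinate to its 2×2 group is sketched below for illustration. The row-major ordering of groups, the function name, and the numbering of pixels within a group are assumptions of the sketch and are not specified above.

```python
def span_address(x, y, width):
    """Return (group address, index of the pixel inside its 2x2 group).

    Assumptions: groups are stored row-major, and pixels within a group
    are numbered 0..3 from left to right, then top to bottom.
    """
    groups_per_row = (width + 1) // 2  # round up for odd widths
    address = (y // 2) * groups_per_row + (x // 2)
    sub_index = (y % 2) * 2 + (x % 2)
    return address, sub_index

print(span_address(0, 0, 8))  # (0, 0)
print(span_address(1, 1, 8))  # (0, 3), same group as pixel (0, 0)
print(span_address(2, 0, 8))  # (1, 0), next group to the right
print(span_address(0, 2, 8))  # (4, 0), first group of the next group row
```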
In a second aspect of the invention, the pixel processor receives polygon fragments for each pixel. As new fragments are accumulated within a group of four pixels, it is possible that eventually there will be more data to write back to the frame buffer than there is space available. When this occurs, the system merges one or more pairs of polygon fragments until the data will fit, thus effectively limiting memory requirements.
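The merge-until-it-fits behavior can be sketched as follows. The passage above does not state which fragments are chosen or how their data are combined, so the policy here is purely an assumption for illustration: the two fragments closest in depth are merged, their coverage masks are combined, the nearer depth is kept, and the colors are averaged by covered sample count. All names are hypothetical.

```python
def popcount(mask):
    return bin(mask).count("1")

def merge_pair(a, b):
    """Combine two fragments into one (assumed policy, see lead-in)."""
    wa, wb = popcount(a["mask"]), popcount(b["mask"])
    total = wa + wb  # masks are assumed non-empty, so total > 0
    return {
        "mask": a["mask"] | b["mask"],
        "z": min(a["z"], b["z"]),
        "rgb": tuple((ca * wa + cb * wb) / total
                     for ca, cb in zip(a["rgb"], b["rgb"])),
    }

def fit_to_capacity(fragments, capacity):
    """Merge depth-adjacent pairs until at most `capacity` fragments remain."""
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    frags = sorted(fragments, key=lambda f: f["z"])
    while len(frags) > capacity:
        # Index of the adjacent pair with the smallest depth gap.
        i = min(range(len(frags) - 1),
                key=lambda k: frags[k + 1]["z"] - frags[k]["z"])
        frags[i:i + 2] = [merge_pair(frags[i], frags[i + 1])]
    return frags

record = [
    {"mask": 0b0001, "z": 1.0, "rgb": (255, 0, 0)},
    {"mask": 0b0010, "z": 1.1, "rgb": (0, 0, 255)},
    {"mask": 0b1100, "z": 5.0, "rgb": (0, 255, 0)},
]
fitted = fit_to_capacity(record, capacity=2)
print(len(fitted))  # 2
```

Whatever merge policy is used, the number of fragments written back never exceeds the fixed capacity, so the record size stays constant and no heap is needed.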
In a third aspect of the invention, the accumulated polygon fragments can be resolved into display video for each of the four pixels in a group at any instant in time.
These and other objects, features, advantages and alternative aspects of the present invention will become apparent to those skilled in the art from a consideration of the following detailed description taken in combination with the accompanying drawings.