1. Field of the Invention
The present invention relates generally to the field of graphics processing and more specifically to a system and method for structuring an A-Buffer.
2. Description of the Related Art
A typical computing system includes a central processing unit (CPU) and a graphics processing unit (GPU). Some GPUs are capable of very high performance using a relatively large number of small, parallel execution threads on dedicated programmable hardware processing units. The specialized design of such GPUs usually allows these GPUs to perform certain tasks, such as rendering 3-D scenes, much faster than a CPU. However, the specialized design of these GPUs also limits the types of tasks that the GPU can perform. The CPU is typically a more general-purpose processing unit and therefore can perform most tasks. Consequently, the CPU usually executes the overall structure of the software application and configures the GPU to perform specific tasks in the graphics pipeline (the collection of processing steps performed to transform 3-D images into 2-D images).
One task that may be performed when transforming 3-D images into 2-D images is to determine the visible color of each pixel in the image. To accurately determine the color at each pixel, the fragment of each pixel intersected by an object may be evaluated to create a portion of the overall color of the pixel that includes a variety of effects, such as transparency and depth of field. In some approaches, objects may need to be rendered in a specific order to ensure that the visible color of each pixel in the generated image is realistic. However, in other approaches the fragments corresponding to each pixel may be collected, sorted, and reduced to an image that accurately displays advanced effects irrespective of the order in which the objects are rendered. One structure that computing systems may implement in these various approaches is an A-Buffer, which is a memory structure that maintains a collection of fragments associated with each polygon that intersects the pixels in an image frame being rendered for display.
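By way of illustration only, the collect, sort, and reduce approach described above may be sketched for a single pixel as follows. The sketch is not taken from any particular implementation. It assumes that each fragment carries a depth, an RGB color, and an opacity, that a smaller depth is nearer the viewer, and that fragments are reduced by back-to-front blending. The names `resolve_pixel`, `red`, `green`, and `blue` are hypothetical.

```python
from typing import Iterable, Tuple

Color = Tuple[float, float, float]
# Assumed fragment layout: (depth, rgb, alpha); smaller depth is nearer.
Fragment = Tuple[float, Color, float]


def resolve_pixel(fragments: Iterable[Fragment],
                  background: Color = (0.0, 0.0, 0.0)) -> Color:
    """Sort one pixel's fragments by depth, then blend them back to front."""
    color = background
    # Farthest fragment first, so nearer fragments are blended over it.
    for _depth, rgb, alpha in sorted(fragments, key=lambda f: f[0], reverse=True):
        color = tuple(alpha * c + (1.0 - alpha) * b for c, b in zip(rgb, color))
    return color


red = (0.2, (1.0, 0.0, 0.0), 0.5)    # nearest
green = (0.5, (0.0, 1.0, 0.0), 0.5)
blue = (0.8, (0.0, 0.0, 1.0), 0.5)   # farthest

# The same fragments submitted in two different rendering orders.
first = resolve_pixel([red, green, blue])
second = resolve_pixel([blue, red, green])
print(first == second)
```

Because the fragments are sorted before they are reduced, both submission orders yield the same pixel color in this sketch, which is the order independence referred to above.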
In one approach to organizing an A-Buffer, the CPU is used to create and address, in software, a linked list of fragments per pixel. One drawback to this approach is that the A-Buffer structure is not conducive to some of the most frequent fragment access patterns. For example, when a polygon is rendered, many of the corresponding fragments are located in adjacent pixels at the same depth complexity. As is well known, memory accesses are usually most efficient if they can be grouped such that successive accesses target data that is closely located within the memory—known as memory locality. Thus, an efficient memory structure would enable fragments of this nature, located at the same depth complexity, to be simultaneously accessed. However, since each pixel in the A-Buffer has a separate linked list, fragments corresponding to adjacent pixels at the same depth complexity may be far apart in memory, leading to poor memory locality. Consequently, accesses to the A-Buffer may be inefficient.
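The locality problem described above may be illustrated with the following minimal sketch of a per-pixel linked-list A-Buffer. It is a simplified model rather than a description of any actual system. The names `LinkedListABuffer` and `layer_addresses` are hypothetical, the index into a single node pool stands in for a memory address, and the workload is assumed to generate fragments pixel by pixel. Other allocation orders would produce different spacings.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Fragment:
    color: Tuple[float, float, float, float]  # RGBA
    depth: float
    next: Optional[int]  # pool index of the previously added fragment, or None


class LinkedListABuffer:
    """Hypothetical A-Buffer: one linked list of fragments per pixel."""

    def __init__(self, num_pixels: int) -> None:
        self.pool: List[Fragment] = []  # pool index stands in for an address
        self.heads: List[Optional[int]] = [None] * num_pixels

    def add(self, pixel: int, color, depth: float) -> None:
        # Prepend: the new node links to the pixel's current head.
        self.pool.append(Fragment(color, depth, self.heads[pixel]))
        self.heads[pixel] = len(self.pool) - 1

    def addresses(self, pixel: int) -> List[int]:
        """Pool indices of a pixel's fragments, in the order they were added."""
        out = []
        i = self.heads[pixel]
        while i is not None:
            out.append(i)
            i = self.pool[i].next
        return out[::-1]


def layer_addresses(buf: LinkedListABuffer, layer: int) -> List[int]:
    """Pool index of fragment number `layer` for each pixel that has one."""
    result = []
    for pixel in range(len(buf.heads)):
        addrs = buf.addresses(pixel)
        if layer < len(addrs):
            result.append(addrs[layer])
    return result


# Assumed workload: four adjacent pixels, each covered by the same three
# polygons, with fragments generated one pixel at a time.
buf = LinkedListABuffer(num_pixels=4)
polygons = [((1.0, 0.0, 0.0, 0.5), 0.2),
            ((0.0, 1.0, 0.0, 0.5), 0.5),
            ((0.0, 0.0, 1.0, 0.5), 0.8)]
for pixel in range(4):
    for color, depth in polygons:
        buf.add(pixel, color, depth)

layer0 = layer_addresses(buf, 0)
strides = [b - a for a, b in zip(layer0, layer0[1:])]
print(layer0, strides)  # [0, 3, 6, 9] [3, 3, 3]
```

Under these assumptions, the first fragments of adjacent pixels sit three nodes apart in the pool, and the gap grows with the number of fragments per pixel, so reading all fragments at one depth complexity touches scattered locations rather than one contiguous run.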
Another drawback to this approach is that the CPU ends up performing many of the A-Buffer operations rather than the GPU. As is well known, graphics processing efficiency is increased when graphics processing operations are consolidated in the GPU. This problem can be addressed by having the GPU create and address the A-Buffer using the linked list approach detailed above, but such a solution does not address the depth complexity locality problem. Furthermore, linked list operations are not usually performed efficiently in hardware, such as the raster operations unit in the GPU, and, therefore, such a solution would not necessarily improve the overall efficiency of A-Buffer operations.
As the foregoing illustrates, what is needed in the art is a more effective technique for creating and accessing an A-Buffer.