Field of the Invention
Embodiments of the present invention relate generally to graphics processing and, more specifically, to higher accuracy z-culling in a tile-based architecture.
Description of the Related Art
Graphics processing subsystems typically write depth information to a depth buffer that stores depth values for samples and pixels. When the graphics processing subsystem processes fragments, the graphics processing subsystem compares depth values associated with those fragments with the depth values stored in the depth buffer. This comparison is referred to as a “z-test,” “depth test,” or “visibility test.” For any particular fragment, if the z-test is successful and the fragment is visible, then the fragment is written to the frame buffer or blended with color data already in the frame buffer. If, on the other hand, the fragment does not pass the visibility test, then the fragment is discarded.
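The z-test described above can be sketched in a few lines. This is a minimal illustration and not any particular hardware implementation. It assumes a "less than" comparison in which smaller depth values are closer to the viewer, and the names `make_buffers` and `z_test` are hypothetical.

```python
FAR_PLANE = 1.0  # assumed clear value: the farthest representable depth


def make_buffers(width, height):
    """Create a cleared depth buffer and an empty color buffer (hypothetical helper)."""
    depth = [[FAR_PLANE] * width for _ in range(height)]
    color = [[None] * width for _ in range(height)]
    return depth, color


def z_test(depth, color, x, y, frag_depth, frag_color):
    """Write the fragment if it is closer than the stored depth; otherwise discard it.

    Returns True when the fragment passes the test and is written.
    Assumes a "less than" depth comparison; blending is omitted.
    """
    if frag_depth < depth[y][x]:
        depth[y][x] = frag_depth   # record the new closest depth
        color[y][x] = frag_color   # write the fragment color to the frame buffer
        return True
    return False                   # fragment is hidden and is discarded
```

With a cleared buffer, a fragment at depth 0.5 passes and is written; a later fragment at depth 0.8 at the same location fails the test and leaves both buffers unchanged.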
Some graphics processing subsystems also implement a tile-based architecture, where one or more render targets, such as a frame buffer, are divided into screen space partitions referred to as “tiles.” In such an architecture, the graphics processing subsystem rearranges work such that the work associated with any particular tile remains on-chip, in a cache, for longer than it would in an architecture that does not rearrange work based on tiles. This rearrangement helps to reduce memory bandwidth consumption as compared with a non-tiling architecture.
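One way to picture the rearrangement of work by tile is a simple binning pass. The sketch below is an assumption-laden illustration: it treats each unit of work as an axis-aligned bounding box in pixel coordinates and uses a 16-pixel square tile. The function name `bin_by_tile` and the tile size are hypothetical and are not taken from any specific architecture.

```python
from collections import defaultdict


def bin_by_tile(boxes, tile_size=16):
    """Group work items by the screen-space tiles they overlap.

    boxes: list of (x0, y0, x1, y1) inclusive pixel bounds, one per work item.
    Returns a dict mapping (tile_x, tile_y) to the list of work-item indices
    that touch that tile, in submission order.
    """
    bins = defaultdict(list)
    for index, (x0, y0, x1, y1) in enumerate(boxes):
        for ty in range(y0 // tile_size, y1 // tile_size + 1):
            for tx in range(x0 // tile_size, x1 // tile_size + 1):
                bins[(tx, ty)].append(index)
    return dict(bins)
```

Processing the bins one tile at a time means that all work touching a given tile is handled together, which is what allows the tile's data to stay in an on-chip cache while that work executes.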
Some graphics processing subsystems additionally include a z-cull unit that is configured to perform depth-based culling operations on fragments prior to fragment shading. Such operations enable non-visible fragments to be discarded from the graphics pipeline prior to fragment shading, which saves the processing cycles and power associated with shading fragments that are non-visible and ultimately would be discarded during z-testing.
In tiling architectures, and in other architectures, the z-cull unit typically processes data at a fairly coarse level. More specifically, the z-cull unit stores a small amount of z-data for a large number of samples and performs a z-cull test that compares those samples to the z-data. The test performed by the z-cull unit is typically “conservative,” meaning that a z-cull test retains all fragments in a group of fragments for further processing if just one of those fragments passes the z-cull test. Therefore, in many situations, some fragments that are later found to be non-visible are nonetheless transmitted to the fragment shader for further processing instead of being discarded.
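The conservative behavior described above can be illustrated with a short sketch. It assumes that the coarse z-data for a region is a single value, the farthest depth stored for any sample in that region, and that smaller depth values are closer to the viewer. The name `z_cull_group` is hypothetical, and real z-cull units may store different or additional data.

```python
def z_cull_group(region_z_max, frag_depths):
    """Conservatively cull a group of fragments against one coarse depth value.

    region_z_max: assumed coarse z-data, the farthest stored depth in the region.
    frag_depths: depths of the incoming fragments that cover the region.
    Returns the fragments forwarded to fragment shading.
    """
    # Discard the group only when every fragment is at or behind the farthest
    # stored depth, so none of them could pass a "less than" z-test.
    if min(frag_depths) >= region_z_max:
        return []
    # At least one fragment might be visible, so the whole group is retained,
    # including fragments that the later per-sample z-test will reject.
    return list(frag_depths)
```

For a region whose coarse depth is 0.5, the group `[0.6, 0.7]` is culled entirely, while the group `[0.4, 0.9]` is forwarded whole even though the fragment at depth 0.9 cannot be visible.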
The “resolution” or accuracy of the z-cull test can substantially influence how many non-visible fragments are transmitted to the fragment shader for further processing. Increasing the resolution of the z-cull test typically decreases the number of fragments found to be non-visible that are nonetheless transmitted to the fragment shader for processing. However, increasing the resolution of z-cull testing typically causes such testing to take longer and increases the memory requirements of the z-cull unit, which can negatively impact overall system performance.
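The trade-off between z-cull resolution and storage can be made concrete with a small one-dimensional model. This is only a sketch under stated assumptions: each z-cull region covers `region` consecutive samples, stores one farthest-depth value, and uses a "less than" comparison. The function names are hypothetical.

```python
def forwarded_nonvisible(stored, incoming, region):
    """Count fragments that fail the exact per-sample z-test but are still
    forwarded because their region survives the coarse z-cull test.

    stored: per-sample depths already in the depth buffer.
    incoming: per-sample depths of the incoming fragments (same length).
    region: number of samples covered by one coarse z-cull entry.
    """
    count = 0
    for start in range(0, len(stored), region):
        s = stored[start:start + region]
        f = incoming[start:start + region]
        if min(f) < max(s):  # the region survives the coarse test
            count += sum(1 for fz, sz in zip(f, s) if fz >= sz)
    return count


def z_cull_entries(num_samples, region):
    """Number of coarse depth values the z-cull unit must store (ceiling division)."""
    return -(-num_samples // region)
```

With stored depths `[0.2, 0.2, 0.9, 0.9]` and incoming fragments all at depth 0.5, a region size of 4 forwards two non-visible fragments while storing one entry. A region size of 2 forwards none but stores two entries, which reflects the higher memory cost of finer resolution in this simplified model.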
As the foregoing illustrates, what is needed in the art are more effective approaches to z-culling, especially in tile-based architectures.