Current parallel graphics data processing includes systems and methods developed to perform specific operations on graphics data, such as linear interpolation, tessellation, rasterization, texture mapping, and depth testing. Traditionally, graphics processors used fixed-function computational units to process graphics data. More recently, however, portions of graphics processors have been made programmable, enabling such processors to support a wider variety of operations for processing vertex and fragment data.
During graphics processing, a rasterization process is performed to convert an image from a vector graphics format into a raster image (e.g., pixels). Subsequently, depth buffering (or z-buffering) is performed to determine which elements of a rendered scene are visible and which are hidden. In particular, a Hierarchical Z-buffer feature may be implemented so that a pixel being rendered can be checked against the z-buffer before the pixel actually arrives in the rendering pipelines. However, the volume of data that has to be transferred between the rasterization and Hierarchical Z-buffer blocks often slows graphics processor performance because of the limited wiring and power available to the buses that couple the blocks.
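As an illustration only, the early check performed by a Hierarchical Z-buffer can be sketched in software as follows. This is a minimal sketch under stated assumptions, not the hardware described here: the class name `HiZBuffer`, the 4x4 tile size, the "smaller depth is nearer" convention, and the strict less-than depth test are all hypothetical choices made for the example.

```python
TILE = 4  # hypothetical tile edge length in pixels (assumption)


class HiZBuffer:
    """Toy depth buffer with one coarse 'farthest depth' value per tile."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        # Per-pixel depth, cleared to the far plane (1.0); smaller means nearer.
        self.depth = [[1.0] * width for _ in range(height)]
        tiles_x = (width + TILE - 1) // TILE
        tiles_y = (height + TILE - 1) // TILE
        # Coarse level: the farthest depth currently stored in each tile.
        self.tile_max = [[1.0] * tiles_x for _ in range(tiles_y)]
        self.coarse_rejects = 0  # fragments discarded without a per-pixel read

    def _refresh_tile(self, tx, ty):
        # Recompute the farthest depth in one tile after a pixel changed.
        x0, y0 = tx * TILE, ty * TILE
        rows = self.depth[y0:y0 + TILE]
        self.tile_max[ty][tx] = max(max(row[x0:x0 + TILE]) for row in rows)

    def test_and_write(self, x, y, z):
        """Return True if the fragment at (x, y) with depth z is visible."""
        tx, ty = x // TILE, y // TILE
        # Coarse test: a fragment at or behind the farthest depth in its tile
        # cannot pass the per-pixel test, so it is rejected early.
        if z >= self.tile_max[ty][tx]:
            self.coarse_rejects += 1
            return False
        # Fine test against the full-resolution z-buffer.
        if z >= self.depth[y][x]:
            return False
        self.depth[y][x] = z
        self._refresh_tile(tx, ty)
        return True
```

In this sketch, once every pixel of a tile has been covered by near geometry, any later fragment that lies behind that geometry is discarded by the single coarse comparison, without reading the per-pixel depth. The sketch does not model the bus traffic between the rasterization and Hierarchical Z-buffer blocks; it only shows the kind of per-tile and per-pixel data the two stages exchange.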