The present invention relates in general to graphics processing devices and methods, and in particular to a method and apparatus to ensure consistency of depth values computed in different sections of a graphics processor.
Graphics processors are used to render images in many computer systems. In a typical rendering process, the graphics processor receives primitives (e.g., points, lines, and/or triangles) representing objects in a scene. In accordance with instructions provided by an application program, the graphics processor transforms each primitive to a viewing space, then determines which pixels (or samples, where there may be one or more samples per pixel) of the image are covered by the primitive. For each pixel that is covered, the graphics processor computes a color and depth (or z) value for the fragment of the primitive that covers (or partially covers) the pixel. The color and depth values computed for each fragment are provided to a raster operations (ROP) unit, which builds up the rendered image by storing per-pixel (or per-sample) color and depth information in a frame buffer. As the ROP unit receives new data for newly rendered fragments, it compares each new depth value for a pixel to a previous depth value stored in the frame buffer for that pixel. Based on the comparison, the ROP unit determines whether to write the new data to the frame buffer. If new data is to be written, the ROP unit updates the depth and color values in the frame buffer based on the new data.
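The depth comparison performed by the ROP unit can be illustrated with a minimal sketch. The names `Fragment`, `FrameBuffer`, and `rop_write` are hypothetical, and the "nearer wins" (less-than) comparison and the far-plane clear value of 1.0 are assumptions; the text above says only that the write decision is based on the comparison.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    x: int        # pixel column covered by the fragment
    y: int        # pixel row covered by the fragment
    z: float      # depth value computed for the fragment
    color: tuple  # color value computed for the fragment


class FrameBuffer:
    """Per-pixel color and depth storage (hypothetical layout)."""

    def __init__(self, width, height, far=1.0, clear_color=(0, 0, 0)):
        # Assumption: depth is cleared to the far value before rendering.
        self.depth = [[far] * width for _ in range(height)]
        self.color = [[clear_color] * width for _ in range(height)]


def rop_write(fb, frag):
    """Compare the new depth to the stored depth; write only if it passes.

    Returns True if the frame buffer was updated.
    """
    old_z = fb.depth[frag.y][frag.x]
    if frag.z < old_z:  # assumed comparison: smaller Z is nearer
        fb.depth[frag.y][frag.x] = frag.z
        fb.color[frag.y][frag.x] = frag.color
        return True
    return False
```

In this sketch, a fragment that lies behind the stored depth leaves both the depth and color entries for that pixel unchanged.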
In the course of generating an image, the ROP unit typically generates a large number of requests to transfer data to and from the frame buffer. Images may include a large number of fragments, and each time the ROP unit receives new fragment data, it reads one or more old pixels (at least the Z information) from the frame buffer. For each pixel that is to be changed, the ROP unit also reads the current color from the frame buffer. Modified colors and Z values are then written back to the frame buffer. In some graphics systems, bandwidth between the ROP unit and the frame buffer can become a bottleneck, limiting system performance.
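To make the bandwidth concern concrete, the read and write pattern described above can be tallied for a single pixel. This is a rough sketch only: the 4-byte sizes for Z and color, the less-than depth comparison, and the function name `rop_traffic` are all assumptions chosen for illustration, not figures from any particular graphics system.

```python
Z_BYTES = 4      # assumed size of one stored Z value
COLOR_BYTES = 4  # assumed size of one stored color value


def rop_traffic(fragment_depths, stored_depth=1.0):
    """Tally frame buffer traffic for fragments arriving at one pixel.

    Returns (bytes_read, bytes_written).
    """
    bytes_read = 0
    bytes_written = 0
    for z in fragment_depths:
        bytes_read += Z_BYTES  # old Z is read for every new fragment
        if z < stored_depth:   # assumed comparison: smaller Z is nearer
            bytes_read += COLOR_BYTES               # current color is read
            bytes_written += Z_BYTES + COLOR_BYTES  # modified values written back
            stored_depth = z
    return bytes_read, bytes_written
```

Under these assumptions, every fragment costs at least one Z read even when it is ultimately discarded, which is why reducing the size of Z transfers is attractive.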
In some graphics processors, demand for bandwidth between the ROP unit and the frame buffer is reduced by storing Z data in the frame buffer in a compressed form. The ROP unit compresses the Z data prior to writing it to the frame buffer and decompresses the Z data after reading it from the frame buffer in order to perform Z comparisons.
A related patent application, above-referenced application Ser. No. 10/878,460, describes a technique for compressing Z data by storing a planar representation of the Z coordinate for a fragment that covers all or part of a “tile” (a region of pixels, e.g., 16×16), rather than a separate Z value for each pixel in the tile. In one embodiment described therein, Z is represented using “tile-relative” coordinates, where the tile-relative coordinates (x, y) for a sample location define that location relative to a tile center (Xc, Yc) in screen space. In one example, the planar Z representation for a fragment is an ordered triple of coefficients (At, Bt, Ct) such that:

Z = At*x + Bt*y + Ct  (Eq. 1)

for any sample location (x, y) within the tile, where coordinates (x, y) are defined relative to the center of the tile. For each tile, a frame buffer stores one or more triples (At, Bt, Ct), depending on how many fragments at least partially cover the tile. Also stored is coverage information indicating which portion of the tile is covered by which triples (At, Bt, Ct). In one example described in application Ser. No. 10/878,460, tiles are 16×16 pixels, and up to six planar Z representations (representing fragments of up to six different surfaces that are at least partially visible in the tile) can be stored per tile. If more than six Z surfaces are needed for a particular tile, data for that tile is stored in uncompressed form. Using this technique, compression factors of 8:1 can be achieved for tiles that are covered by visible fragments of one or two Z surfaces. In some cases, the same amount of frame buffer space is allocated per tile regardless of whether the data for that tile is compressed or not. Where Z information for a tile is stored in compressed form, read and write operations for the tile require less bandwidth than for tiles where uncompressed Z data is stored.
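As an illustration of how per-pixel Z values might be recovered from the planar representation of Eq. 1, the following sketch evaluates stored triples over a 16×16 tile. The function names, the per-pixel coverage layout (one plane index per pixel), and the choice of sampling at pixel centers are assumptions made for this example; the referenced application may encode coverage and sample locations differently.

```python
TILE = 16        # tile width and height in pixels (from the example above)
MAX_PLANES = 6   # maximum planar Z representations per tile (from the example above)


def plane_z(plane, x, y):
    """Evaluate Eq. 1 for a triple (At, Bt, Ct) at tile-relative (x, y)."""
    a_t, b_t, c_t = plane
    return a_t * x + b_t * y + c_t


def decompress_tile(planes, coverage):
    """Expand planar Z data into one Z value per pixel of the tile.

    planes:   list of (At, Bt, Ct) triples stored for the tile
    coverage: TILE x TILE grid; coverage[row][col] is the index of the
              triple covering that pixel (hypothetical encoding)
    """
    if len(planes) > MAX_PLANES:
        raise ValueError("too many Z surfaces; tile would be stored uncompressed")
    half = TILE / 2
    return [
        [
            # Assumption: one sample per pixel, at the pixel center,
            # expressed relative to the tile center.
            plane_z(planes[coverage[row][col]], col + 0.5 - half, row + 0.5 - half)
            for col in range(TILE)
        ]
        for row in range(TILE)
    ]
```

For instance, a tile split between a flat surface on its left half and a sloped surface on its right half needs only two triples plus the coverage grid, rather than 256 separate Z values.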