Graphics processing systems can be configured to receive graphics data, e.g. from an application such as a game, and to process the graphics data in order to generate an image representing a view of a three dimensional (3D) scene. The graphics data may comprise data defining primitives (e.g. the location and appearance of the primitives) which form objects in the 3D scene. Typical operations performed by a graphics processing system include: (i) texturing and/or shading the primitives according to the data defining the primitives, and (ii) determining which parts of the primitives are visible from the viewpoint that the scene is being rendered from, and removing surfaces which are determined not to be visible (which is known in the art as “hidden surface removal” or HSR). Some graphics processing systems (e.g. immediate mode rendering systems) perform the texturing and/or shading operations before the hidden surface removal operations. However, to avoid the processing cost of unnecessarily texturing and shading surfaces which are not visible in the final rendered image, some other graphics processing systems (e.g. deferred rendering systems) perform the hidden surface removal operations before the texturing and/or shading operations.
The HSR and texturing/shading operations are performed by a graphics processing system on-chip and typically involve reading large quantities of primitive data from an off-chip memory, and writing rendered pixel data back to that memory. The rendering process may generate large quantities of intermediate data such as depth data and fragment data which may also need to be stored. Passing data between the off-chip memory and the on-chip graphics processing system may incur delays and use significant amounts of power, so it can be beneficial to store some of the data on-chip whilst the graphics processing system acts on the primitive data. Typically, there will not be enough on-chip memory to store all of the data for rendering a whole image. Therefore, some graphics processing systems have a rendering space which is sub-divided into a plurality of tiles, wherein each tile can be processed separately, such that data for a particular tile can be stored on-chip while the graphics processing systems acts on the primitives that are present within that particular tile. For example, depth data for a tile is stored on-chip while HSR processing is performed for the tile. This helps to reduce the amount of data transferred to and from the off-chip memory, and to reduce latency when the graphics processing system processes the primitive data.
Systems which have a rendering space that is sub-divided into a plurality of tiles may include a tiling unit (which may be referred to as a “tile accelerator,” or simply a “TA”). The tiling unit processes the input graphics data (which includes data defining primitives in the 3D scene to be rendered) and performs tiling calculations to thereby determine, for each of the primitives, which of the tiles the primitive is present in. A primitive is determined to be present in a tile if the primitive, when projected into the rendering space, is determined to overlap wholly or partially with the tile.
The tiling unit generates tile control streams for the tiles, whereby the tile control stream for a tile includes indicators of primitives which are present in that tile. For example a tile control stream for a tile may include state information for the tile, and a list of object pointers to indicate the primitives that were determined to be present in the tile. The object pointers may refer to primitive data stored elsewhere, such that a primitive may be referenced by more than one tile's control stream. That is, the primitive data may be stored (e.g. in primitive blocks) separately from the control streams of the tiles, such that the control streams are per-tile, whereas the primitive data does not need to be stored per-tile. Collectively, the control streams and the primitive data may be referred to as a “display list.” In other examples the primitive data for the indicated primitives may be included in the control stream. The tile control streams are stored, and subsequently passed to a HSR unit (which may be referred to as an “image synthesis processor,” or simply an “ISP”) which is configured to perform the hidden surface removal on a tile-by-tile basis on the fragments of the primitives which are present in a tile. The “fragments” of the primitives are the portions of the primitives which overlap with respective sample positions of the final image to be rendered. The “sample positions” represent the discrete positions of the final image at which the graphics processing system operates to determine the appearance of the scene. For example, the sample positions may correspond to the pixel positions of the final image, but in order to provide for greater accuracy in the final image, each pixel position of the final image may be represented by a block of more than one sample position.
As is known in the art, the HSR unit may perform the HSR for the fragments of primitives which are present in a tile using a technique known as “Z-buffering” in which the depth values of each primitive in the tile are calculated at each sample position and are compared with a previously stored depth. The primitives are processed in a sequential manner and, in a simple example in which all of the primitives are opaque, a depth buffer stores, for each sample position, the depth of the closest fragment (i.e. closest to the viewpoint of the scene) which has been processed by the HSR unit, and a tag buffer stores, for each sample position, a tag (i.e. an identifier) of the primitive which has the closest depth at that sample position. When a new primitive is processed in the HSR unit, the depth values of the fragments of the primitive are compared with the depth values in the depth buffer in accordance with a depth compare mode, and in a usual depth compare mode if the fragments of the new primitive are closer than the fragments whose depth values are stored in the depth buffer at the appropriate positions, then the depth values of the fragments of the new primitive are stored in the depth buffer and a tag of the new primitive is stored in the tag buffer, to replace any existing depth and tag values at the appropriate positions. The HSR unit sends the visible surface information to a texturing and shading unit where the fragments are textured and/or shaded before being sent to a frame buffer for display.
The functions of the tiling unit and the HSR unit are different. That is, the tiling unit performs the tiling calculations and the HSR unit performs the depth testing for hidden surface removal. Typically, the tiling unit does not perform depth testing.
The system described above works well for processing opaque primitives. However, some primitives may have an object type allowing for translucency and/or punch through. The translucency or punch through may be represented by the textures to be applied to the fragments of the primitives (e.g. as represented by an alpha value if the texture includes RGBA values—Red, Green, Blue and Alpha values). A punch through primitive includes some “holes” in the sense that not all of the sample positions within the edges of the primitive generate visible fragments. Punch through fragments are created initially for all sample positions within the primitive, and a test (which may be referred to as an “alpha test”) indicates whether a fragment should be included in the subsequent processing, or whether it should be discarded, i.e. it is part of a hole in the primitive. For a punch through primitive, the depth buffer is updated and the fragment rendered only for those fragments that pass both the depth and alpha tests. Since, in deferred rendering systems, the texturing of the fragments of the primitives (which determines whether the fragments are punch through fragments) occurs after the hidden surface removal, the HSR unit might not be able to determine which fragments are visible until after the texturing has been applied to the fragments. This may result in a feedback loop from the texturing/shading unit to the HSR unit to indicate the alpha state of the fragments so that the HSR unit can properly update the depth buffer. A feedback loop such as this can reduce the efficiency of the graphics processing system since the HSR unit may process some aspects of the same fragment twice.
Inefficiencies may also occur when processing primitives with an object type that allows translucency and/or punch through, if fragments from those primitives are covered by opaque fragments. If for example, a fragment (“fragment A”) which is currently the closest fragment at a particular sample position is subsequently covered by a translucent fragment (“fragment B”), then the tag of fragment A may need to be flushed from the tag buffer. Fragment A is then processed by the texturing/shading unit and the tag for fragment B is stored in the tag buffer. The texturing/shading unit can subsequently perform a blend operation of fragments A and B. However, if fragment B is then covered by an opaque fragment (“fragment C”) then both fragments A and B will be hidden in the final image and the processing performed on both of those fragments has been wasted.