The technology described herein relates to the processing of computer graphics, and in particular to hidden surface removal in graphics processing.
Graphics processing is normally carried out by first dividing the graphics processing (render) output, such as a frame to be displayed, into a number of similar basic components (so-called “primitives”) to allow the graphics processing operations to be more easily carried out. These “primitives” are usually in the form of simple polygons, such as triangles.
Once the primitives have been generated and defined, they can be processed by the graphics processing system, in order, e.g., to display the frame.
This process basically involves determining which sampling points of an array of sampling points covering the output area to be processed are covered by a primitive, and then determining the appearance each sampling point should have (e.g. in terms of its colour, etc.) to represent the primitive at that sampling point. These processes are commonly referred to as rasterising and rendering, respectively.
The rasterising process determines the sampling points that should be used for a primitive (i.e. the (x, y) positions of the sample points to be used to represent the primitive in the render output, e.g. frame to be displayed). This is typically done using the positions of the vertices of a primitive.
The rendering process then derives the data, such as red, green and blue (RGB) colour values and an “Alpha” (transparency) value, necessary to represent the primitive at the sample points (i.e. “shades” each sample point). This can involve performing fragment shading, applying textures, blending sample point data values, etc.
These processes are typically carried out by testing sets of one, or of more than one, sampling point, and then generating for each set of sampling points found to include a sample point that is inside (covered by) the primitive in question (being tested), a discrete graphical entity usually referred to as a “fragment” on which the graphics processing operations (such as rendering) are carried out. Covered sampling points are thus, in effect, processed as fragments that will be used to render the primitive at the sampling points in question. The “fragments” are the graphical entities that pass through the rendering process (the rendering pipeline). Each fragment that is generated and processed may, e.g., represent a single sampling point or a set of plural sampling points, depending upon how the graphics processing system is configured.
(Correspondingly, each graphics fragment may typically be the same size and location as a “pixel” of the output (e.g. output frame), but it can be the case that there is not a one-to-one correspondence between a fragment and a display pixel, for example where particular forms of post-processing, such as downsampling, are carried out on the rendered image prior to displaying the final image.)
One drawback of current graphics processing systems is that because primitives are processed sequentially, and typically not in perfect front-to-back order, a given sampling point (and hence fragment and pixel) may be shaded multiple-times as an output is processed, e.g. for display. This occurs when a first received and rendered primitive is subsequently covered by a later primitive, such that the rendered first primitive is not in fact seen at the pixel(s) (and sampling point(s)) in question. Primitives can be overwritten many times in this manner and this typically leads to multiple, ultimately redundant, rendering operations being carried out for each render output, e.g. frame, being rendered. This phenomenon is commonly referred to as “overdraw”.
The consequences of performing such ultimately redundant operations include reduced frame rates and increased memory bandwidth requirements (e.g. as a consequence of fetching data for primitives that will be overwritten by later primitives). Both of these things are undesirable and reduce the overall performance of a graphics processing system. These problems will tend to be exacerbated as render outputs, such as frames to be rendered, become larger and more complex (as there will be more surfaces in the potentially-visible view), and as the use of programmable fragment shading increases (as the cost of shading a given fragment using programmable fragment shading is relatively greater).
The problem of “overdraw” could be significantly reduced by sending primitives for rendering in front-to-back order. However, other graphics processing requirements, such as the need for coherent access to resources such as textures, and the need to minimise the number of API calls per frame, generally mandate other preferred ordering requirements for primitives. Also, a full front-to-back sort of primitives prior to rendering may not be practical while still maintaining a sufficient throughput of primitives to the graphics processing unit. These and other factors mean that front-to-back ordering of primitives for a given render output, e.g., frame, is generally not possible or desirable in practice.
A number of other techniques have therefore been proposed to try to reduce the amount of “overdraw” (the amount of redundant processing of hidden surfaces) that is performed when processing a render output, such as a frame for display (i.e. to avoid rendering non-visible primitives and/or fragments, etc.).
One such technique is to carry out forms of hidden surface removal before a primitive and/or fragment is sent for rendering, to see if the primitive or fragment etc. will be obscured by a primitive that has already been rendered (in which case the new fragment and/or primitive need not be rendered). Such so-called “early” hidden surface removal may comprise, for example, early occlusion culling, such as early-Z (depth) and/or stencil, testing processes (and is in addition to the “late” hidden surface removal, such as late depth testing that will take place after the rendering process).
These arrangements try to identify, e.g., fragments that will be occluded by already processed primitives (and therefore that do not need processing) before the later fragments are issued to the rendering pipeline. In these arrangements, the depth value, e.g., of a new fragment to be processed is compared to the current depth value for that fragment position in the depth buffer to see if the new fragment is occluded or not. This can help to avoid sending fragments that are occluded by already processed primitives through the rendering pipeline.
However, these “early” (prior to rendering) hidden surface removal techniques can still suffer from inefficiencies.
For example, a later graphics fragment for a given sampling position in the render output being generated may only be able to be tested (e.g. depth tested) when an earlier graphics fragment (that is already being processed) for that position in the render output has completed its processing (so as to allow all the required information for testing the later graphics fragment to be available in the, e.g., depth buffer). When such a “dependency” occurs, the later graphics fragment could either be stalled at the early hidden surface removal test stage until the earlier graphics fragment or fragments (that preceded it into the graphics processing pipeline) have completed their processing, or the early hidden surface removal (e.g. depth) test could be skipped for the later graphics fragment, and that fragment simply issued to the rendering pipeline regardless (and then tested at the late hidden surface removal stage when it reaches the end of the rendering pipeline).
However, both of these arrangements can lead to inefficiencies. For example, in the former case, there may be a delay in processing and throughput of the graphics fragments. In the latter case, graphics fragments that would in fact have been occluded will be issued to the rendering pipeline and processed.
The Applicants believe therefore that there remains scope for improved techniques for hidden surface removal in graphics processing systems.
Like reference numerals are used for like components where appropriate in the drawings.