The technology described herein relates to data processing systems, and in particular to arrangements for the execution of graphics processing operations in a graphics processing unit of a graphics processing system.
Graphics processing is typically carried out in a pipelined fashion, with one or more pipeline stages operating on the data to generate the final render output, e.g. a frame that is displayed. Many graphics processing pipelines now include one or more so-called “shading” stages, commonly referred to as “shaders”. For example, a graphics processing pipeline may include one or more of, and typically all of, a geometry shader, a vertex shader and a fragment (pixel) shader. These shaders are processing stages that execute shader programs on input data values to generate a desired set of output data (e.g. appropriately shaded and rendered fragment data in the case of a fragment shader) for processing by the rest of the graphics pipeline and/or for output.
A graphics “shader” thus performs graphics processing by running small programs for each graphics item in a graphics output to be generated, such as a render target, e.g. frame (an “item” in this regard is usually a vertex or a sampling position (e.g. in the case of a fragment shader)). This generally enables a high degree of parallelism, in that a typical render output, e.g. frame, features a rather large number of vertices and fragments, each of which can be processed independently.
In graphics shader operation, each “item” will be processed by means of an execution thread which will execute the shader program in question for the graphics “item” in question.
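By way of illustration only, the per-item execution model described above can be sketched in software as follows. This is a minimal sketch under stated assumptions: the names `shade_fragment` and `run_shader` are hypothetical, and operating system threads merely stand in for the execution threads of a graphics processing unit.

```python
from concurrent.futures import ThreadPoolExecutor


def shade_fragment(position):
    """Hypothetical fragment shader: maps a sampling position to an RGB colour."""
    x, y = position
    return (x / 4.0, y / 4.0, 0.5)


def run_shader(shader, items, max_workers=4):
    """Run one shader invocation per item; results keep the order of `items`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(shader, items))


# Each sampling position is an independent "item", so the invocations
# can run in any order or in parallel without affecting the result.
positions = [(x, y) for y in range(2) for x in range(2)]
colours = run_shader(shade_fragment, positions)
```

Because no invocation reads the result of another, the number of workers can be varied freely, which is the property that allows a high degree of parallelism.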
Modern graphics processing units (graphics processors) accordingly typically comprise one or more programmable execution units that can execute shader programs to perform graphics processing operations, together with one or more graphics-specific accelerators (processing units), such as a varying interpolator, a texture mapper and a blender. These graphics-specific accelerators perform specific graphics processing operations, such as varying interpolation, texture mapping and blending, under the control of the programmable execution unit.
FIG. 1 shows schematically such an arrangement of a graphics processing unit 101. As shown in FIG. 1, the graphics processing unit 101 includes a rasteriser 102, a thread spawner 103, a programmable execution unit 104, a varying interpolator 105, a texture mapper 106, and a blender 107.
The programmable execution unit 104 executes graphics shading programs, such as fragment shading programs, to perform graphics processing operations, such as (and in particular) shading (rendering) fragments generated by the rasteriser 102.
As part of this processing, and as shown in FIG. 1, the programmable execution unit 104 can call upon the varying interpolator 105, the texture mapper 106 and the blender 107 to perform specific graphics processing operations. To do this, the programmable execution unit will send appropriate messages to the relevant accelerator (and receive the appropriate response therefrom), e.g. in response to specific instructions in a shader program that it is executing.
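The message exchange between the programmable execution unit and an accelerator may be pictured, purely as an illustrative sketch, as a request and response dispatch. The `Message` and `ExecutionUnit` classes and the stand-in accelerator function below are hypothetical, and the exchange is shown as synchronous for simplicity; an actual graphics processing unit need not operate in this way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical request sent from the execution unit to an accelerator."""
    target: str
    payload: tuple


class ExecutionUnit:
    """Toy execution unit that forwards messages to named accelerators."""

    def __init__(self, accelerators):
        self.accelerators = accelerators
        self.log = []  # record of (target, response) pairs, for inspection

    def send(self, message):
        # Look up the accelerator, let it process the payload,
        # and hand the response back to the caller.
        handler = self.accelerators[message.target]
        response = handler(*message.payload)
        self.log.append((message.target, response))
        return response


# Stand-in accelerator: linear interpolation between two values.
accelerators = {"varying": lambda a, b, t: a + (b - a) * t}
unit = ExecutionUnit(accelerators)
result = unit.send(Message("varying", (0.0, 2.0, 0.25)))
```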
The varying interpolator 105 operates to interpolate values across graphics primitives, and, as part of this operation, often creates texture coordinates to be used for sampling graphics textures.
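As a hedged illustration of interpolating a value across a primitive, the following sketch computes barycentric weights for a point inside a two-dimensional triangle and uses them to interpolate per-vertex texture coordinates. The function names are hypothetical, and the interpolation shown is plain linear (not perspective-correct) interpolation, which is an assumption made here for brevity.

```python
def barycentric_weights(p, a, b, c):
    """Return the barycentric weights of 2D point p in triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    area = (bx - ax) * (cy - ay) - (cx - ax) * (by - ay)
    if area == 0:
        raise ValueError("degenerate triangle")
    # Each weight is the signed area of the sub-triangle opposite the vertex,
    # divided by the signed area of the whole triangle.
    w_a = ((bx - px) * (cy - py) - (cx - px) * (by - py)) / area
    w_b = ((cx - px) * (ay - py) - (ax - px) * (cy - py)) / area
    return w_a, w_b, 1.0 - w_a - w_b


def interpolate_varying(weights, value_a, value_b, value_c):
    """Blend per-vertex values (tuples of equal length) using the weights."""
    w_a, w_b, w_c = weights
    return tuple(
        w_a * x + w_b * y + w_c * z
        for x, y, z in zip(value_a, value_b, value_c)
    )


weights = barycentric_weights((1.0, 1.0), (0.0, 0.0), (4.0, 0.0), (0.0, 4.0))
# Texture coordinates assigned to the three vertices (illustrative values).
uv = interpolate_varying(weights, (0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
```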
The texture mapper 106 operates to sample graphics textures using texture coordinates, e.g. generated by the varying interpolator 105, and produces therefrom a filtered texture sample result (which it can then return to the programmable execution unit 104 for use, e.g. when shading sampling points).
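One common form of filtered texture sampling is bilinear filtering, sketched below for illustration. The sketch assumes a single-channel texture stored as a list of rows, normalised coordinates in the range 0 to 1, texel centres at half-integer positions and clamp-to-edge addressing; the function name `sample_bilinear` is hypothetical, and a texture mapper may of course support other filtering modes.

```python
import math


def sample_bilinear(texture, u, v):
    """Return a bilinearly filtered sample of `texture` at normalised (u, v)."""
    height, width = len(texture), len(texture[0])
    # Convert to texel space, with texel centres at half-integer positions.
    x = u * width - 0.5
    y = v * height - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0

    def texel(ix, iy):
        # Clamp-to-edge addressing for coordinates outside the texture.
        ix = min(max(ix, 0), width - 1)
        iy = min(max(iy, 0), height - 1)
        return texture[iy][ix]

    # Blend horizontally along two rows, then vertically between the rows.
    top = texel(x0, y0) * (1 - fx) + texel(x0 + 1, y0) * fx
    bottom = texel(x0, y0 + 1) * (1 - fx) + texel(x0 + 1, y0 + 1) * fx
    return top * (1 - fy) + bottom * fy


texture = [[0.0, 1.0],
           [2.0, 3.0]]
centre_sample = sample_bilinear(texture, 0.5, 0.5)
texel_sample = sample_bilinear(texture, 0.25, 0.25)
```

Sampling at the centre of the texture averages all four texels, whereas sampling exactly at a texel centre returns that texel unchanged.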
The blender 107 operates to blend, e.g., fragment shading results generated by the programmable execution unit with previously generated fragment shader results, such as results that are already stored in the tile buffer (in the case of a tile-based graphics processing unit) and/or the frame buffer.
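A blending operation of this kind can be illustrated with conventional "source over" alpha blending, in which a newly shaded fragment is combined with the value already stored for the same sampling position. This is only one possible blend mode; the function name and the dictionary used as a stand-in for a tile buffer are assumptions made for the sketch.

```python
def blend_source_over(src_rgba, dst_rgb):
    """Blend a new RGBA fragment colour over a previously stored RGB colour."""
    r, g, b, alpha = src_rgba
    return tuple(
        s * alpha + d * (1.0 - alpha)
        for s, d in zip((r, g, b), dst_rgb)
    )


# Hypothetical tile buffer keyed by sampling position, holding earlier results.
tile_buffer = {(0, 0): (0.0, 0.0, 1.0)}

# A half-transparent red fragment is blended with the stored blue value.
new_fragment = (1.0, 0.0, 0.0, 0.5)
tile_buffer[(0, 0)] = blend_source_over(new_fragment, tile_buffer[(0, 0)])
```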
In operation of the graphics processing unit 101 shown in FIG. 1, the rasteriser 102 will rasterise graphics primitives to be processed to produce graphics fragments to be rendered (shaded).
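For illustration, rasterisation of a single triangle into fragments can be sketched using edge functions evaluated at pixel centres. The sketch is deliberately simplified: it tests every pixel of a small render target, treats samples lying exactly on an edge as covered (no tie-breaking fill rule is applied), and uses hypothetical function names.

```python
def edge(a, b, p):
    """Signed area test: positive when p lies to the left of the edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterise(triangle, width, height):
    """Return the (x, y) pixels whose centres are covered by the triangle."""
    a, b, c = triangle
    if edge(a, b, c) < 0:
        b, c = c, b  # normalise the winding so that inside tests are >= 0
    fragments = []
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel centre
            if edge(a, b, p) >= 0 and edge(b, c, p) >= 0 and edge(c, a, p) >= 0:
                fragments.append((x, y))
    return fragments


fragments = rasterise(((0.0, 0.0), (4.0, 0.0), (0.0, 4.0)), width=4, height=4)
```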
The graphics fragments generated by the rasteriser 102 are then provided to the thread spawner 103. (The fragments may also be subject to various tests, such as depth and stencil tests, before being provided to the thread spawner 103, with only those fragments that pass all the relevant tests being provided to the thread spawner 103.)
The thread spawner 103 operates to spawn execution threads for execution by the programmable execution unit 104 for the fragments that it receives. It will, e.g., determine which shader program is to be executed for a fragment, and then spawn a thread or threads to execute it.
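The testing of fragments followed by thread spawning may be sketched, under stated assumptions, as follows. The `Fragment` and `Thread` types, the "less than" depth comparison, the `shader_id` field and the program names are all hypothetical and serve only to make the sequence of operations concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    x: int
    y: int
    depth: float
    shader_id: str  # hypothetical key selecting the shader program


@dataclass(frozen=True)
class Thread:
    fragment: Fragment
    program: str


def depth_test(fragments, depth_buffer):
    """Keep only fragments nearer than the stored depth, updating the buffer."""
    passed = []
    for frag in fragments:
        key = (frag.x, frag.y)
        if frag.depth < depth_buffer.get(key, float("inf")):
            depth_buffer[key] = frag.depth
            passed.append(frag)
    return passed


def spawn_threads(fragments, programs):
    """Determine the shader program for each fragment and spawn a thread for it."""
    return [Thread(frag, programs[frag.shader_id]) for frag in fragments]


programs = {"opaque": "opaque_fs", "glass": "glass_fs"}
incoming = [
    Fragment(0, 0, 0.5, "opaque"),
    Fragment(0, 0, 0.8, "opaque"),  # behind the first fragment, so it is culled
    Fragment(1, 0, 0.2, "glass"),
]
threads = spawn_threads(depth_test(incoming, {}), programs)
```

Only the fragments that survive the test reach the spawner, so no thread is created for the occluded fragment.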
The programmable execution unit 104 then executes the appropriate shader programs for each thread that it receives from the thread spawner 103 to process the fragments generated by the rasteriser (and that pass the necessary tests) to produce the desired render output.
Once the programmable execution unit has finished its processing for a given fragment (including using the responses, if any, from the varying interpolator 105, texture mapper 106 and blender 107), the resulting shaded fragment (sampling position) values can be written out to memory, e.g. for output.
The Applicants believe that there remains scope for improved arrangements for graphics processing units that include both a programmable execution unit and graphics-specific accelerators.
Like reference numerals are used for like components where appropriate in the drawings.