This relates generally to graphics processing.
In graphics processing, sort-middle or tiling architectures may be advantageous with respect to power consumption.
A sort-middle architecture works by first performing vertex processing so that the screen space positions of each processed triangle are known. The screen space pixels are divided into non-overlapping tiles, which usually are rectangles of pixels. Each tile has a list of triangle pointers that point to the triangles overlapping the tile. These tile lists are created by processing all triangles and, for each triangle, adding a triangle pointer to the list of every tile that the triangle overlaps.
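The tile list construction described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name `bin_triangles`, the 16x16 tile size, and the use of a clamped bounding box as the overlap test are all assumptions made for the example. A bounding-box test is conservative, so it may list a triangle in a tile that the triangle does not actually touch.

```python
import math

def bin_triangles(triangles, width, height, tile_w=16, tile_h=16):
    """Return {(tile_x, tile_y): [triangle indices]} for a width x height screen.

    Each triangle is three (x, y) screen-space positions, as produced by
    vertex processing. The stored indices stand in for triangle pointers.
    """
    tile_lists = {}
    for index, tri in enumerate(triangles):
        xs = [x for x, _ in tri]
        ys = [y for _, y in tri]
        # Clamp the screen-space bounding box to the screen.
        min_x, max_x = max(min(xs), 0), min(max(xs), width - 1)
        min_y, max_y = max(min(ys), 0), min(max(ys), height - 1)
        if min_x > max_x or min_y > max_y:
            continue  # triangle lies entirely off screen
        first_ty, last_ty = math.floor(min_y) // tile_h, math.floor(max_y) // tile_h
        first_tx, last_tx = math.floor(min_x) // tile_w, math.floor(max_x) // tile_w
        # Add a pointer to this triangle to every tile the box overlaps.
        for ty in range(first_ty, last_ty + 1):
            for tx in range(first_tx, last_tx + 1):
                tile_lists.setdefault((tx, ty), []).append(index)
    return tile_lists
```

Because triangles are visited in submission order, each tile list preserves that order, which the backend needs when later triangles overwrite earlier ones.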
When all of the triangles have been processed and all the tile lists have been fully constructed, each tile can then be backend processed. This means that the triangles in a tile's list are rasterized to that tile. One advantage of tiled processing arises from the fact that a processing core can work on a tile independently of other cores and tiles, enabling relatively straightforward parallel processing in the backend. In addition, since a core only works on one tile at a time, the frame buffer memory for that tile can be relatively small and relatively fast local memory. In general, this may save memory bandwidth because all frame buffer accesses can be performed on this local memory. Each tile is loaded from and stored to external memory only once.
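A minimal sketch of this backend step is shown below, under several stated assumptions: the names `process_tile` and `render` are hypothetical, coverage is tested with edge functions at pixel centers, later triangles simply overwrite earlier ones (no depth test), and the tiles are visited sequentially even though each `process_tile` call is independent and could run on its own core. The returned `stores` counter illustrates that each tile is written to the external frame buffer once.

```python
def edge(a, b, p):
    """Signed area term: which side of edge a->b the point p lies on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def covers(tri, p):
    """True if p is inside tri, for either winding order."""
    a, b, c = tri
    w0, w1, w2 = edge(a, b, p), edge(b, c, p), edge(c, a, p)
    return (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0)

def process_tile(tile, tri_ids, triangles, colors, tile_w, tile_h):
    """Rasterize one tile's list into a small tile-local buffer."""
    tx, ty = tile
    local = [[0] * tile_w for _ in range(tile_h)]  # stands in for fast local memory
    for tid in tri_ids:
        for y in range(tile_h):
            for x in range(tile_w):
                center = (tx * tile_w + x + 0.5, ty * tile_h + y + 0.5)
                if covers(triangles[tid], center):
                    local[y][x] = colors[tid]
    return local

def render(tile_lists, triangles, colors, width, height, tile_w, tile_h):
    """Backend-process every listed tile; return (framebuffer, store count)."""
    framebuffer = [[0] * width for _ in range(height)]  # external memory
    stores = 0
    for (tx, ty), tri_ids in tile_lists.items():
        local = process_tile((tx, ty), tri_ids, triangles, colors, tile_w, tile_h)
        # Single store of the finished tile to the external frame buffer.
        for y in range(tile_h):
            for x in range(tile_w):
                px, py = tx * tile_w + x, ty * tile_h + y
                if px < width and py < height:
                    framebuffer[py][px] = local[y][x]
        stores += 1
    return framebuffer, stores
```

All per-pixel writes land in `local`; the external `framebuffer` is touched only in the final copy, which is where the bandwidth saving described above comes from.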
One disadvantage involves the need to create tile lists and the associated increase in memory bandwidth usage. Also, when a tile is rasterized, its tile list needs to be read again. In addition, all the tile lists are kept in memory at the same time, which means there is a risk of running out of memory. In that case, the architecture needs to detect that it has run out of memory, rasterize a first part of the triangles for all tiles, and spill the current frame buffer content to external memory. The remaining triangles are then processed in the same manner, except that the spilled frame buffer content must be read back in from external memory before rasterization can begin. The spilling process is cumbersome and consumes memory bandwidth, slowing down processing.
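The cost of spilling can be illustrated with a simplified model. Everything here is an assumption made for the example: the names `bin_with_budget` and `frame_buffer_traffic` are hypothetical, the memory limit is expressed as a maximum number of triangle pointers, the tiles overlapped by each triangle are taken as precomputed input, and a tile is assumed to need a reload only if an earlier batch already wrote it.

```python
def bin_with_budget(triangle_tiles, max_pointers):
    """Split binning into batches that each fit the pointer budget.

    triangle_tiles[i] lists the tiles overlapped by triangle i.
    Returns a list of batches, each a dict {tile: [triangle indices]}.
    """
    batches, current, used = [], {}, 0
    for index, tiles in enumerate(triangle_tiles):
        if used > 0 and used + len(tiles) > max_pointers:
            # Out of memory detected: close this batch so it can be
            # rasterized and its frame buffer content spilled.
            batches.append(current)
            current, used = {}, 0
        for tile in tiles:
            current.setdefault(tile, []).append(index)
        used += len(tiles)
    if current:
        batches.append(current)
    return batches

def frame_buffer_traffic(batches):
    """Count tile-sized loads and stores against external memory."""
    loads = stores = 0
    written = set()
    for batch in batches:
        for tile in batch:
            if tile in written:
                loads += 1  # spilled content must be read back first
            stores += 1
            written.add(tile)
    return loads, stores
```

With a budget large enough to hold every pointer there is a single batch, no loads, and one store per tile. Each additional batch adds one load and one store for every tile it revisits, which is the extra bandwidth the paragraph above refers to.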