Tessellation is increasingly used by game companies and in certain graphics processing units (GPUs). In some applications, a single geometry and setup fixed-function pipeline (GSP) is sufficient. To obtain higher performance and to extract more parallelism in geometry processing, larger architectures generally have several GSPs working in parallel, because increasing the performance and throughput of a single GSP may quickly become infeasible. Applications that rely on high-performance geometry processing include, for example, those that use tessellation, or shadow mapping where the pixel shader is disabled. These architectures maintain strict in-order processing of the triangles or patches even when using multiple GSPs and multiple rasterization pipelines (RPs).
Such an arrangement can cause problems, however, because a high-bandwidth rasterization crossbar is used to sort the geometry to the different RPs, and the GSPs must output data to the crossbar in the same sequence as it was submitted to the GPU. Because the GSPs generally work in parallel, a significant buffer is required after each GSP. The greater the number of GSPs, the larger the buffers need to be. If the buffers are not large enough, a GSP may become idle. A GSP can process individual triangles and lines, but also patches that are to be tessellated. Each GSP handles a single patch at a time, and a patch may generate anywhere from zero triangles up to several thousand triangles. This significant data expansion tends to aggravate the buffering problem described above.
Tessellation rates will continue to increase, and when a user zooms in on a character or object in a displayed graphical image, multiple patches with high tessellation rates may be visible and hence processed. In this case, each GSP will process one patch at a time, and since strict ordering is to be honored between patches, the buffer needs to be able to hold all the primitives, for example triangles, that a GSP will generate during tessellation. This approach therefore leads to very large buffers. In addition, if one patch has a very high tessellation rate and a large set of other patches have low tessellation rates, then one GSP may be busy for a long period of time, whereas another GSP might stall or need to queue up results from several patches.
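The buffering effect described above can be illustrated with a small simulation. The following Python sketch is a toy model, not a description of any actual hardware: it assumes each GSP generates one triangle per cycle, that patches are dispatched in submission order to the next idle GSP, that the oldest unfinished patch streams its triangles straight to the crossbar, and that buffers are unbounded so that the peak occupancy can be measured. The function name `simulate_in_order` and all parameters are hypothetical.

```python
def simulate_in_order(tess_counts, num_gsps):
    """Toy model of strict in-order output from parallel GSPs.

    tess_counts[i] is the number of triangles patch i generates.
    Returns (peak_buffered_triangles, total_cycles).
    Assumptions (illustrative only): one triangle per GSP per cycle,
    unbounded buffers, and a crossbar that drains the oldest patch
    as fast as its triangles are produced.
    """
    n = len(tess_counts)
    produced = [0] * n            # triangles generated so far, per patch
    running = [None] * num_gsps   # patch currently assigned to each GSP
    next_patch = 0                # next patch to dispatch
    head = 0                      # oldest patch not yet fully output
    peak = 0
    cycles = 0

    while head < n:
        # Dispatch waiting patches, in submission order, to idle GSPs.
        for g in range(num_gsps):
            if running[g] is None and next_patch < n:
                running[g] = next_patch
                next_patch += 1

        # Every busy GSP generates one triangle this cycle.
        for g in range(num_gsps):
            p = running[g]
            if p is not None:
                if produced[p] < tess_counts[p]:
                    produced[p] += 1
                if produced[p] >= tess_counts[p]:
                    running[g] = None   # patch done, GSP becomes idle
        cycles += 1

        # Retire finished patches from the head of the submission order.
        while head < next_patch and produced[head] >= tess_counts[head]:
            head += 1

        # Triangles of patches younger than the head must wait in buffers.
        buffered = sum(produced[head + 1:next_patch])
        peak = max(peak, buffered)

    return peak, cycles


if __name__ == "__main__":
    # One heavily tessellated patch followed by many light ones.
    print(simulate_in_order([1000] + [10] * 30, num_gsps=4))  # (300, 1000)
    # The same number of patches, all lightly tessellated.
    print(simulate_in_order([10] * 31, num_gsps=4))           # (27, 80)
```

In this model, a single patch producing 1000 triangles at the front of the queue forces the results of all 30 later patches (300 triangles) to be held back until it completes, whereas the uniform workload never buffers more than 27 triangles. With bounded buffers, the GSPs handling the small patches would instead stall once their buffers filled.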