Modern graphics processing units (GPUs) implement a programmable hardware pipeline, referred to herein as a “graphics pipeline” or a “GPU pipeline,” for rendering real-time 3D graphics. Applications invoke high-level graphics application programming interfaces (APIs), such as Direct3D and OpenGL, to configure this pipeline and to provide shaders, which are programs for performing application-specific graphics or compute operations (e.g., per-vertex processing, per-pixel processing, etc.). Drivers implementing the graphics APIs translate the application-provided API calls and shaders into instructions that are executed by GPU hardware.
In a 3D graphics pipeline, the geometry to be rendered is provided in the form of vertex coordinates. These vertices are assembled into primitives (e.g., points, lines, or triangles). However, due to changes made to the graphics pipeline over time, assembling primitives has become a difficult and time-intensive task. These changes include the introduction of strips, new primitives (e.g., adjacency primitives), and index buffers with cut indices, as well as the removal of some primitives (e.g., triangle fans). Further complicating the assembly of primitives, these changes were made alongside the requirement to correctly preserve winding information between the encoded primitives.
Strips are a method of encoding geometry using less vertex data, which reduces memory requirements. For example, 6 vertices in a triangle strip encode 4 triangles, whereas a triangle list would need 12 vertices to encode the same 4 triangles. However, strips need to be decoded before the primitives can be assembled, which increases the complexity of assembling the primitives. The concept of winding corresponds to the ordering of the vertices (e.g., clockwise or counterclockwise) and is used to determine whether a primitive is front-facing or back-facing. In strips, the vertex order of every other triangle is flipped in order to preserve the correct front/back facing of the primitives. This flip must be accounted for when assembling the primitives, adding further complexity.
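By way of a non-limiting illustration, the decoding of a triangle strip into a triangle list may be sketched as follows. The function name `strip_to_triangles` is hypothetical, and the choice of swapping the first two vertices of every odd-numbered triangle is one common convention assumed here for illustration; other conventions swap a different pair of vertices.

```python
def strip_to_triangles(strip):
    """Decode a triangle strip into a list of (a, b, c) vertex-index triples.

    Hypothetical helper for illustration only. Assumes the convention in
    which the first two vertices of every odd-numbered triangle are swapped
    so that all triangles keep the winding of the first one.
    """
    triangles = []
    for i in range(len(strip) - 2):
        a, b, c = strip[i], strip[i + 1], strip[i + 2]
        if i % 2 == 1:
            # Without this swap, odd-numbered triangles would be wound in
            # the opposite direction from even-numbered ones.
            a, b = b, a
        triangles.append((a, b, c))
    return triangles


# 6 strip vertices decode to 4 triangles (12 vertices as a triangle list).
print(strip_to_triangles([0, 1, 2, 3, 4, 5]))
# [(0, 1, 2), (2, 1, 3), (2, 3, 4), (4, 3, 5)]
```

In this sketch, a strip of n vertices yields n − 2 triangles, which is consistent with the 6-vertex, 4-triangle example above.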
Removing triangle fans created the requirement that the corresponding primitives be decomposed before entering the graphics API. Further, adjacency primitives, which extend lines and triangles with data of vertices adjacent to them but not part of them, have to be decomposed into regular points, lines, or triangles for rendering.
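By way of a non-limiting illustration, both kinds of decomposition may be sketched as follows. The function names are hypothetical. The sketch assumes a fan layout in which the first vertex is shared by every triangle, and a triangle-list-with-adjacency layout of 6 vertices per primitive in which positions 0, 2, and 4 hold the triangle itself and positions 1, 3, and 5 hold the adjacent vertices.

```python
def fan_to_triangles(fan):
    """Decompose a triangle fan into a triangle list (hypothetical helper).

    Assumes the first vertex of the fan is shared by every triangle.
    """
    return [(fan[0], fan[i], fan[i + 1]) for i in range(1, len(fan) - 1)]


def triangles_adjacency_to_triangles(indices):
    """Strip adjacency data from a triangle list with adjacency.

    Assumes 6 indices per primitive, with the triangle at positions 0, 2, 4
    and the adjacent (non-rendered) vertices at positions 1, 3, 5.
    Incomplete trailing primitives are ignored.
    """
    return [
        (indices[i], indices[i + 2], indices[i + 4])
        for i in range(0, len(indices) - 5, 6)
    ]


print(fan_to_triangles([0, 1, 2, 3, 4]))
# [(0, 1, 2), (0, 2, 3), (0, 3, 4)]
print(triangles_adjacency_to_triangles([0, 10, 1, 11, 2, 12]))
# [(0, 1, 2)]
```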
Further, to allow content creators to use multiple strips in a single draw operation, the concept of a “cut” or “restart” index was introduced. However, this increased the complexity of assembling primitives by introducing additional steps to be completed. In particular, when rendering indexed geometry, an index buffer is used. The index buffer is a buffer containing indices into the vertex buffer, allowing the same vertex data to be referenced multiple times. Cut (restart) indices are indices whose value is equal to the maximum representable value for the index type of the current buffer. For example, for a buffer of 16-bit or 32-bit integers, the value may be 0xffff or 0xffffffff, respectively. Further, a cut index may be represented as −1 (the same all-ones bit pattern interpreted as a signed integer). A cut index indicates the end of a strip and that subsequent indices specify a new strip.
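By way of a non-limiting illustration, splitting an index buffer into separate strips at each cut index may be sketched as follows. The function name `split_at_cuts` and its `index_bits` parameter are hypothetical, and dropping empty strips produced by consecutive cut indices is a choice made for this sketch.

```python
def split_at_cuts(indices, index_bits=16):
    """Split an index buffer into strips at each cut (restart) index.

    Hypothetical helper for illustration only. The cut value is taken to be
    the maximum value representable in `index_bits` bits.
    """
    cut = (1 << index_bits) - 1  # 0xffff for 16-bit, 0xffffffff for 32-bit
    strips, current = [], []
    for idx in indices:
        if idx == cut:
            # End of the current strip; subsequent indices start a new one.
            if current:
                strips.append(current)
            current = []
        else:
            current.append(idx)
    if current:
        strips.append(current)
    return strips


print(split_at_cuts([0, 1, 2, 3, 0xFFFF, 4, 5, 6]))
# [[0, 1, 2, 3], [4, 5, 6]]
```

Note that in this sketch the same value 0xffff is an ordinary index when the buffer holds 32-bit integers, so the index width must be known before cut indices can be identified.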
The difficulty of decomposition is further increased because vertex data is fundamentally linear. Thus, it is not possible to look at a specific vertex and determine which triangle, if any, the vertex is a part of without also knowing whether the index buffer includes cut indices before the vertex.
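By way of a non-limiting illustration, this dependence on earlier cut indices may be sketched as follows. The function name `triangles_at`, the 16-bit cut value, and the winding convention (swapping the first two vertices of odd-numbered triangles within a strip) are assumptions made for this sketch. The function scans the index buffer from its beginning, because the start of the current strip is only known after all preceding cut indices have been seen.

```python
CUT = 0xFFFF  # assumed cut value for a buffer of 16-bit indices


def triangles_at(indices, pos):
    """Return the triangles, as tuples of buffer positions, that use indices[pos].

    Hypothetical helper for illustration only. The buffer is treated as one
    or more triangle strips separated by cut indices.
    """
    start = 0  # buffer position at which the current strip begins
    found = []
    for i, idx in enumerate(indices):
        if idx == CUT:
            start = i + 1  # the next index begins a new strip
            continue
        k = i - start  # position of this index within its strip
        if k >= 2:
            tri = (i - 2, i - 1, i)
            if k % 2 == 1:
                # Odd-numbered triangle in the strip: swap to keep winding.
                tri = (i - 1, i - 2, i)
            if pos in tri:
                found.append(tri)
    return found


# The same buffer position belongs to different triangles depending on
# whether a cut index precedes it.
print(triangles_at([0, 1, 2, 9, 3, 4, 5], 4))
# [(2, 3, 4), (4, 3, 5), (4, 5, 6)]
print(triangles_at([0, 1, 2, CUT, 3, 4, 5], 4))
# [(4, 5, 6)]
```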
In view of this complexity, various graphics hardware and API approaches have removed support for the more complex aspects of the graphics pipeline. However, virtualized graphics APIs are required to support all aspects of the graphics pipeline. The ability to correctly virtualize graphics APIs on top of other APIs hinges on the ability of the virtualized graphics API to efficiently perform primitive (or index) assembly. Thus, there is a need to increase the efficiency of primitive assembly.