Tessellation is used in computer graphics to convert low-detail subdivision surfaces into higher-detail primitives. Tessellation breaks up high-order surfaces into structures suitable for rendering. This approach allows a graphics pipeline to evaluate lower-detail (lower polygon count) models and render them in higher detail. That is to say, a surface defined by a high-order equation (e.g. cubic or quadratic) is divided into a plurality of flat primitives, typically triangles, for rendering.
High-order surfaces are well known within the computer graphics industry and are often referred to as “patches”. These patches are polynomial functions of a set of control points, and describe the shape of a curve in terms of a parametric variable ‘t’ (for a curve that is plotted in two dimensions) or two variables ‘u’ and ‘v’ (for a curved surface in three dimensions). FIG. 1 illustrates a Bezier patch, which is an example of a high-order surface type commonly used within 3D computer graphics. A point ‘P’ 100 on Bezier surface 110 is defined by a function of the domain co-ordinates u,v 120 (also known as the parametric co-ordinates) and the corresponding control points ki,j 130.
$$P(u,v) = \sum_{i=0}^{n} \sum_{j=0}^{m} A\,u^{i}(1-u)^{n-i}\; B\,v^{j}(1-v)^{m-j}\; k_{i,j}$$

Where A and B are constants defined as: $A = \dfrac{n!}{i!\,(n-i)!}$ and $B = \dfrac{m!}{j!\,(m-j)!}$
It should be noted that values of P(u,v) lie within the volume 140, also known as the convex hull which is described by control points ki,j 130. It should also be noted that the Bezier patch is only an example of one possible surface formulation and that there are many other possibilities which are used in computer graphic systems.
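By way of illustration only, the Bezier patch formulation above may be evaluated as in the following minimal sketch. The function names and the 4 x 4 grid of control points are hypothetical and are not taken from FIG. 1; the coefficients A and B are computed as binomial coefficients.

```python
from math import comb

def bernstein(n, i, t):
    """One basis term: n!/(i!(n-i)!) * t^i * (1 - t)^(n - i)."""
    return comb(n, i) * t**i * (1 - t)**(n - i)

def bezier_patch_point(control_points, u, v):
    """Evaluate P(u, v) for an (n+1) x (m+1) grid of 3D control points k[i][j]."""
    n = len(control_points) - 1
    m = len(control_points[0]) - 1
    point = [0.0, 0.0, 0.0]
    for i in range(n + 1):
        weight_u = bernstein(n, i, u)
        for j in range(m + 1):
            weight = weight_u * bernstein(m, j, v)
            for axis in range(3):
                point[axis] += weight * control_points[i][j][axis]
    return tuple(point)

# Hypothetical bicubic patch: a flat 4 x 4 grid with the four interior
# control points raised to a height of 1.
grid = [[(float(i), float(j), 1.0 if 0 < i < 3 and 0 < j < 3 else 0.0)
         for j in range(4)] for i in range(4)]

corner = bezier_patch_point(grid, 0.0, 0.0)  # coincides with control point k[0][0]
centre = bezier_patch_point(grid, 0.5, 0.5)  # lies between the lowest and highest control points
```

In this sketch the evaluated height at the centre of the domain stays between the lowest and highest control point heights, which is consistent with the convex hull property noted above.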
Tessellation is a well known technique that subdivides high order surfaces/patches such as that shown in FIG. 1 into a number of adjoined primitives lying on the plane of and within the boundaries of the original surface. The subdivision scheme of the tessellator is performed in the domain of the patch, typically using a normalized (zero-to-one) coordinate system. The consequence of this is that the tessellation process is independent of any curvature present in the final surface. The domain of the patch may be a quad, triangle or line and these domains are typically subdivided into many smaller primitives such as points, lines or triangles. These primitives are defined by the interconnection of domain points, whose locations are defined by the tessellation method and settings.
FIG. 2 illustrates tessellation of the domain points for a Bezier quad patch using a binary sub-division method. The domain 200 with 16 domain points and 0.25 intervals on each axis represents the minimum number of domain points within a tessellated patch, this being the same as the number of control points required to define a Bezier surface. One level of tessellation is applied at 210 resulting in a further set of domain points being generated at intervals that lie at the mid points between the original set of points. A second level of tessellation 220 introduces a further set of points at the midpoint between the points generated at 210. This process is repeated until a suitable level of tessellation is achieved.
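The midpoint insertion described above may be sketched as follows. This is an illustrative example only: the starting coordinates are an assumption chosen for brevity and do not reproduce the starting grid of FIG. 2, and the function names are hypothetical.

```python
def subdivide_axis(coords):
    """Insert the midpoint between each pair of neighbouring coordinates."""
    refined = []
    for a, b in zip(coords, coords[1:]):
        refined.extend([a, (a + b) / 2])
    refined.append(coords[-1])
    return refined

def domain_points(coords):
    """Form the 2D domain points of a quad domain from per-axis coordinates."""
    return [(u, v) for u in coords for v in coords]

level0 = [0.0, 0.5, 1.0]          # assumed starting coordinates on each axis
level1 = subdivide_axis(level0)   # one level of binary sub-division
level2 = subdivide_axis(level1)   # a second level
points_level1 = domain_points(level1)
```

Each level roughly doubles the number of intervals on each axis, so the number of domain points grows approximately fourfold per level for a quad domain.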
The level of tessellation is controlled by the tessellation application and may be determined by visual quality metrics such as how many polygons are required to give a smooth representation of a curved surface at a particular distance from the camera. Alternatively the level of tessellation applied may be determined by available computational power, with more powerful systems using higher levels of tessellation to improve visual quality. It should be noted that binary sub-division represents only one possible tessellation method and is presented here only as an example.
Microsoft's Direct3D11 (D3D11) application programming interface (API) introduces a programmable alternative to binary sub-division for tessellation within a graphics pipeline, and this API will be used for illustration in this document. Other APIs, such as OpenGL 4.0, provide similar functionality for the tessellation of high order surfaces. These programming interfaces are often accelerated by hardware. FIG. 3 illustrates the graphics pipeline required by the D3D11 API. A Vertex Shader stage 300 takes a set of individual control points for a patch and applies an arbitrary mathematical transform using programmable hardware in a manner which will be well known to those skilled in the art. The transformed control points are then passed to a Hull Shader 310 which calculates tessellation factors for the edges of the patch and applies further application-defined modifications to the control points.
The edge tessellation factors for the patch are passed to a Tessellation Unit 320. Tessellation occurs in two parts: domain tessellation and connectivity tessellation. Domain tessellation subdivides the patch into a number of points known as domain points. The location of these domain points is determined by the tessellation method and supplied tessellation parameters in a similar manner to that described for FIG. 2, but methods such as those prescribed by the D3D11 API do not require the domain points to be placed at regular (e.g. power of two) intervals. Connectivity tessellation determines how the resulting domain points are combined to produce tessellated primitives according to a fixed function algorithm whose operation is defined by the D3D11 API. Detailed descriptions of the fixed function domain and connectivity tessellation algorithms are beyond the scope of this document; the algorithms are fully defined by the APIs, as will be known to those skilled in the art.
The tessellated domain points are passed to a Domain Shader 330 which combines them with the control points of the patch produced by the Hull Shader in a programmable manner. Typically the Domain Shader would apply a well known curved surface formulation such as a Bezier patch (as described above for FIG. 1) resulting in the domain points being mapped onto the patch to create primitive vertices. The resulting vertices may then be further modified using well known techniques such as displacement mapping. Displacement mapping is a technique in which the results of high order surface tessellation are displaced by a mathematical function or a height that is sampled from a texture map. The introduction of displacement mapping of vertices from a patch surface introduces the possibility that the vertices no longer reside within the convex hull defined by the control points of the patch.
For illustrative purposes embodiments of the invention will be described in terms of tessellation within a tile based rendering system although the methods disclosed are equally applicable to non tile based systems. Tile based rendering systems are well-known. These architectures subdivide an image into a plurality of rectangular blocks or tiles. FIG. 4 illustrates the architecture of a typical tile based system including texturing and shading stages.
Tile based rendering is generally split into two phases, the first of which is known as the Geometry Processing Phase 490 which performs the following operations.
First, a Primitive/Command Fetch Unit 400 retrieves command and primitive data from an external memory and passes this to a Vertex Shader Unit 401. The primitive and command data is then transformed into screen space using well-known methods such as clip/cull and projection shown in blocks 402 and 403.
This data is then supplied to a Tiling Unit 410 which inserts object data from the screen space geometry into object lists for each of a set of defined rectangular regions or tiles. An object list for each tile contains primitives that exist wholly or partially in that tile and is stored in the Tiled Screen Space Geometry Buffer 420. An object list exists for every tile on the screen, although some object lists may contain no data. It is similarly possible to envisage a system where instead of transforming the object into the screen's coordinate system for tiling, the tile could be transformed into the object's coordinate system and tiling be performed in this domain. Any common coordinate system between the tile and the geometry could be used for this purpose.
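The construction of per-tile object lists may be sketched as follows. This is a simplified illustration under stated assumptions: each primitive is binned by its screen-space bounding box (a conservative test that may include a primitive in a tile it does not actually touch), and the function name, tile size and example triangles are hypothetical.

```python
from collections import defaultdict

def tile_primitives(primitives, tile_size, tiles_x, tiles_y):
    """Add each screen-space triangle to the object list of every tile
    that its bounding box overlaps."""
    object_lists = defaultdict(list)
    for prim_id, verts in enumerate(primitives):
        xs = [x for x, _ in verts]
        ys = [y for _, y in verts]
        # Range of tiles covered by the bounding box, clamped to the screen.
        tx0 = max(int(min(xs) // tile_size), 0)
        tx1 = min(int(max(xs) // tile_size), tiles_x - 1)
        ty0 = max(int(min(ys) // tile_size), 0)
        ty1 = min(int(max(ys) // tile_size), tiles_y - 1)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                object_lists[(tx, ty)].append(prim_id)
    return object_lists

# Hypothetical 64 x 64 pixel screen split into four 32 x 32 tiles.
triangles = [
    [(2, 2), (10, 4), (5, 12)],       # falls entirely inside tile (0, 0)
    [(20, 20), (40, 25), (30, 50)],   # bounding box spans all four tiles
]
object_lists = tile_primitives(triangles, tile_size=32, tiles_x=2, tiles_y=2)
```

A primitive that lies entirely off screen produces an empty tile range in this sketch and is therefore added to no object list.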
The second phase of tile based rendering is called the rasterization phase 491 which performs the following operations.
The object lists generated by the tiling unit are read from the Tiled Screen Space Geometry Buffer by a Tile Parameter Fetch Unit 430 which supplies them tile by tile to a Hidden Surface Removal (HSR) unit 440. The HSR unit removes surfaces which will not contribute to the final scene (usually because they are obscured by another surface) by processing each primitive in the tile and passing only data that is visible to a Shading Unit 450.
The Shading Unit takes the data from the HSR unit and uses it to fetch textures using the Texturing Unit 460 and applies shading to each pixel within the visible object using well-known techniques. The Shading Unit then feeds the textured and shaded data to an on-chip Tile Buffer 470. As operations are applied to an on-chip Tile Buffer, the amount of data traffic required external to the chip is minimized.
Once each tile has been completed, the resulting data is written to a Rendered Scene Buffer 480.
Conventional systems without tessellation typically approximate geometry, such as curved surfaces, with only a limited number of polygons in order to maintain acceptable performance. The most common application of tessellation is to use the patch as a compact representation of a larger number of polygons than would otherwise have been feasible. The more compact representation gives a substantial benefit in reducing bandwidth at the Primitive/Command Fetch (400) stage.
Tiling of a group of primitives, such as those generated from a patch, is often performed by tiling a bounding box for the group. This can require considerably less computation than would be required for tiling each primitive separately. The convex hull often forms the basis of a convenient bounding box for the primitives generated from a surface patch, however, as displacement mapping allows the primitive vertices to extend beyond the bounds of the patch's convex hull, this is not always an appropriate method for use with the D3D11 API.
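The limitation noted above can be illustrated with a minimal sketch. The control points, surface point, normal and displacement height below are hypothetical values chosen for illustration; an axis-aligned box around the control points is used as a simple stand-in for a bounding box derived from the convex hull.

```python
def bounding_box(points):
    """Axis-aligned bounding box (min corner, max corner) of a set of 3D points."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))

def contains(box, point):
    """True if the point lies inside or on the surface of the box."""
    lo, hi = box
    return all(l <= c <= h for l, c, h in zip(lo, point, hi))

def displace(point, normal, height):
    """Move a point along a direction by a height, e.g. one sampled from a texture."""
    return tuple(c + height * n for c, n in zip(point, normal))

control_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.5)]
box = bounding_box(control_points)

surface_point = (0.5, 0.5, 0.125)   # assumed to lie on the undisplaced patch
displaced_point = displace(surface_point, (0.0, 0.0, 1.0), 1.0)

inside_before = contains(box, surface_point)
inside_after = contains(box, displaced_point)
```

Here the undisplaced point lies inside the box while the displaced vertex does not, so tiling the box alone would risk omitting the primitive from tiles it actually covers.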
The conventional method of adding programmable tessellation of high order surfaces to a tile based rendering architecture is to first tessellate every patch in the scene into primitives and apply any displacement mapping. The resulting primitives can subsequently be tiled and processed with a conventional tile based rendering pipeline. While simple, this approach requires the expansion of patch data into a large number of primitives during the early stages of the graphics pipeline. This data expansion causes a significant increase in both memory and bandwidth requirements associated with the tiled screen space geometry buffer (420), negating any benefit achieved in the primitive/command fetch, and putting the tile based system at a disadvantage compared to a non-tiled architecture that does not use a tiled screen space geometry buffer. It is therefore desirable to avoid full data expansion prior to tiling and instead store patches and to utilise a method of generating primitives from those patches.
A more efficient method of adding tessellation of high order patches to a tile based rendering architecture is described in our British Patent application no. 1007348.3 and shown schematically in FIG. 5. In this improved two-pass system, the Geometry Processing Phase 590 of the rendering pipeline is expanded to determine whether polygons that are created during tessellation of a patch exist wholly or partially within a tile. A polygon is said to exist within a tile if its final position in the scene lands within the viewable region of that tile. A polygon exists wholly within the tile if its position in the scene falls entirely within the viewable region of a tile. A polygon exists partially within the tile if at least some of the polygon falls within the viewable region of a tile. Polygons that exist wholly or partially within a tile may be wholly or partially visible within the tile or may not be visible at all in the final scene if they are obscured by other polygons. Optionally, polygons that are obscured may be detected and removed from the list of polygons that exist wholly or partially within the tile. If one or more polygons from a patch exist wholly or partially within a tile, patches, rather than polygons are stored in the object lists for the tiles. Note: if only a small number of polygons are visible from the patch it may be desirable to store the polygons to minimize the amount of data. This process of allocating geometry that exists within a tile is generally called “tiling” or “binning”.
An initial tessellation pass is performed on the patch data and the locations of the tessellated primitives from each patch are determined by performing vertex shading and hull shading. If any of the primitives generated by the tessellation of a patch exist within a tile then that patch must be added to the object list for that tile in the Tiled Screen Space Geometry Buffer. By storing only the set of patch control points and the tessellation parameters required to redo the tessellation in the rasterization phase, the amount of data stored in the Tiled Screen Space Geometry Buffer is significantly reduced when compared to the amount of data that would have been required to store the post-tessellation primitive geometry. It should be noted that the patch control points may be stored after vertex and hull shading have been performed to avoid repeating those parts of the tessellation calculation during rasterization.
As complete tessellation has been performed during the tiling process it is also possible to store a list of which primitives from the patch will exist wholly or partially in the tile after tessellation. This list allows the possibility of recreating from each patch only those primitives that are required in a tile during later tessellation processing.
The Vertex Shader Unit 501 and the Hull Shader Unit 502 operate as described above in a standard D3D11 tessellation pipeline. The Hull Shader Unit passes the calculated edge tessellation factors to both the Domain Tessellation Unit 503 and the Connectivity Tessellation Unit 504, while also passing the processed control point data to the Domain Shader Unit 505.
The Domain Tessellation Unit generates domain points with associated domain point index values. The Connectivity Tessellation Unit specifies a list of domain point indices from the patch, which identify the primitive vertices, and the order in which they should be connected to generate the primitives.
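The division of work between domain tessellation and connectivity tessellation may be sketched as follows. This example uses a uniform grid with two triangles per cell purely for illustration; it is not the fixed function algorithm defined by the D3D11 API, and the function name is hypothetical.

```python
def tessellate_quad_domain(segments):
    """Return (domain_points, triangles) for a uniform grid on the unit quad.

    domain_points[k] is the (u, v) location of the domain point with index k;
    each triangle is a tuple of three domain point indices.
    """
    side = segments + 1
    # Domain tessellation: indexed (u, v) points, row by row.
    points = [(col / segments, row / segments)
              for row in range(side) for col in range(side)]
    # Connectivity tessellation: two triangles per grid cell, same winding.
    triangles = []
    for row in range(segments):
        for col in range(segments):
            i0 = row * side + col   # lower-left corner of the cell
            i1 = i0 + 1             # lower-right
            i2 = i0 + side          # upper-left
            i3 = i2 + 1             # upper-right
            triangles.append((i0, i1, i2))
            triangles.append((i1, i3, i2))
    return points, triangles

points, triangles = tessellate_quad_domain(2)
```

With two segments per axis this yields nine domain points and eight triangles, and most domain point indices are referenced by several triangles.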
The primitive vertices are passed to an optional Cache Unit 506 which caches primitive vertices previously generated by the Domain Shader Unit. It should be noted that the cache is not required, but the interconnected nature of the primitives that make up the tessellated patch means that the presence of a cache can significantly reduce the number of primitive vertices that are processed through the Domain Shader Unit. Where a primitive vertex is not present within the cache it is requested from the Domain Shader Unit.
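The benefit of such a cache may be illustrated with the following sketch, which counts how often a stand-in domain shader is invoked. The unbounded dictionary cache, the function names and the placeholder shader are assumptions made for illustration; a hardware cache would be of limited size.

```python
def shade_with_cache(primitives, domain_shader):
    """Shade the vertices of each primitive, invoking the domain shader only
    for domain point indices that have not been seen before."""
    cache = {}
    invocations = 0
    shaded = []
    for prim in primitives:
        verts = []
        for index in prim:
            if index not in cache:
                cache[index] = domain_shader(index)
                invocations += 1
            verts.append(cache[index])
        shaded.append(tuple(verts))
    return shaded, invocations

def example_domain_shader(index):
    """Placeholder for the programmable domain shader (position only)."""
    return (float(index), 0.0, 0.0)

# Four connected triangles expressed as domain point indices.
prims = [(0, 1, 2), (2, 1, 3), (2, 3, 4), (4, 3, 5)]
shaded, invocations = shade_with_cache(prims, example_domain_shader)
uncached_invocations = sum(len(p) for p in prims)
```

For these four connected triangles the shader runs six times with the cache, compared with twelve times if every vertex of every primitive were shaded independently.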
The Domain Shader Unit processes only the position part of the primitive vertex data as it is the only part required in order to tile tessellated geometry. The Cache Unit passes on the primitive vertices that make up the primitives to the Clipping and Culling Unit 510 which removes any primitives outside of the visible region of the screen and optionally removes back facing and/or very small primitives that fall between sample positions. It should be noted that clipping and culling operations disrupt the regular ordering of primitives generated from tessellated patches, and that this may affect the performance of subsequent compression operations. Therefore, clipping and culling may optionally be deferred until a later stage in the pipeline.
Any remaining primitives are passed to the Projection Unit 511 that transforms the remaining primitives/primitive vertices into screen space to be tiled by the Tiling Unit 512. The Tiling Unit determines which primitives exist wholly or partially in each tile and passes a list of the domain point indices describing the primitive vertices to an optional Index Compression Unit 513.
Embodiments of the present invention are described with reference to the method of compression used in the Index Compression Unit 513. This unit is responsible for efficiently compressing the list of domain point indices used to define the primitives that exist in each tile into a compressed primitive list. This compression reduces the amount of data written to the per-tile geometry lists, decreasing storage and bandwidth requirements.
A reference to the Hull Shader output and the compressed primitive lists are then written into the Tiled Screen Space Geometry Buffer 514.
Later phases in the pipeline can use the Hull Shader output to regenerate only those primitives that exist wholly or partially within the current tile. Primitives from a tessellated patch that do not exist wholly or partially within the current tile need not be created by the tessellation process.
The present invention presents an efficient compression scheme for the list of domain point indices that arises from the tiling of tessellated geometry.
Several compression schemes already exist to reduce the amount of data required to store geometry. The most common of these is the use of triangle strips. A triangle strip is an efficient method of describing connected triangle primitives that share vertices. After the first triangle is defined using three vertices, each new triangle can be defined by only one additional vertex, sharing the last two vertices defined for the previous triangle primitive. FIG. 6 illustrates an example triangle strip consisting of four connected triangle primitives. Without using triangle strips, the list of indices would have to be stored as four separate triangles: ABC, CBD, CDE, EDF. However, using a triangle strip, these can be stored as a sequence of vertices ABCDEF. This sequence would be decoded as a set of triangles ABC, BCD, CDE, DEF, and then every even-numbered triangle (with counting starting from one) would be reversed, restoring the consistent clockwise/anticlockwise ordering of the original triangle primitives.
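The decoding of the strip of FIG. 6 may be sketched as follows. The function name is hypothetical, and the reversal of every even-numbered triangle is implemented here by swapping its first two vertices, which is one of several equivalent ways of reversing the winding.

```python
def decode_triangle_strip(strip):
    """Expand a triangle strip into individual triangles with consistent winding."""
    triangles = []
    for n in range(len(strip) - 2):
        a, b, c = strip[n:n + 3]
        if n % 2 == 1:
            # Even-numbered triangle when counting from one: reverse its
            # winding by swapping the first two vertices.
            a, b = b, a
        triangles.append((a, b, c))
    return triangles

decoded = decode_triangle_strip("ABCDEF")
as_text = ["".join(t) for t in decoded]
```

Six vertices therefore describe four triangles, where storing the triangles separately would require twelve.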
Triangle strips are able to effectively compress long runs of adjacent triangles but are less effective with more complex geometry. Geometry generated by the tiling of tessellated patches contains structures that reduce the effectiveness of triangle strip compression, such as multiple triangles sharing a common vertex (known as a triangle fan), changes in the clockwise/anticlockwise vertex ordering of the triangle primitives, and missing triangles in the sequence resulting from the tiling process. FIG. 7 illustrates a typical low level tessellation of a triangular patch into triangular primitives. It can be seen that the tessellated geometry contains multiple triangle fans and multiple changes in triangle ordering that are unsuitable for effective compression using triangle strips. Removing some of the triangle primitives from the tessellated patch (as a result of tiling) could further reduce the effectiveness of triangle strips as a compression method.
Vertex Buffers and Index Buffers are also well known in computer graphics. An illustrative example of Index Buffer compression is shown in FIG. 23 as a well known method for compressing a stream of indices. Lists of indices from recently seen primitives are stored in an Index Buffer 2320. When a new primitive is added, the uncompressed Domain Point Indices of the primitive 2310 are compared 2330 with the contents of the Index Buffer. Any index value already present in the Index Buffer can be represented by a buffer position in place of storing the full index value. Assuming the position in the Index Buffer can be described by fewer bits than the index value itself, compression is achieved 2390.
Consider a worked example of FIG. 6 compressed using an index buffer. An initial primitive ABC is read in. None of these indices are present in the buffer before the primitive is encountered. These indices must therefore be stated fully and are used to initialize the buffer. After initialization buffer position 0 contains index A, buffer position 1 contains index B, buffer position 2 contains index C. A square bracket notation indicating the position in the buffer may be used: [0]=A, [1]=B, [2]=C. The second primitive CBD can now be defined as two index buffer positions and newly encountered index D: [2][1]D. Once the buffer is full, newly arriving indices may be introduced to the buffer with a replacement policy such as first-in first-out (FIFO) or least recently used (LRU). Vertex/Index Buffers present an effective compression method for previously encountered indices but each primitive typically requires at least one index that isn't present in the buffer from earlier primitives.
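The worked example above may be sketched as follows. This is an illustrative model only: the token format, the function names and the four-entry buffer are assumptions, and a FIFO replacement policy is modelled by writing new indices to successive buffer positions in a circular manner.

```python
def compress_indices(primitives, buffer_size):
    """Encode each index as ("pos", buffer position) if it is already in the
    buffer, otherwise as ("raw", index), and insert it with FIFO replacement."""
    slots = [None] * buffer_size
    next_slot = 0
    encoded = []
    for prim in primitives:
        tokens = []
        for index in prim:
            if index in slots:
                tokens.append(("pos", slots.index(index)))
            else:
                tokens.append(("raw", index))
                slots[next_slot] = index
                next_slot = (next_slot + 1) % buffer_size
        encoded.append(tokens)
    return encoded

def decompress_indices(encoded, buffer_size):
    """Mirror the buffer updates of the encoder to recover the original indices."""
    slots = [None] * buffer_size
    next_slot = 0
    primitives = []
    for tokens in encoded:
        prim = []
        for kind, value in tokens:
            if kind == "pos":
                prim.append(slots[value])
            else:
                prim.append(value)
                slots[next_slot] = value
                next_slot = (next_slot + 1) % buffer_size
        primitives.append(tuple(prim))
    return primitives

fig6 = [("A", "B", "C"), ("C", "B", "D"), ("C", "D", "E"), ("E", "D", "F")]
encoded = compress_indices(fig6, buffer_size=4)
restored = decompress_indices(encoded, buffer_size=4)
```

In this sketch the second primitive CBD is encoded as positions [2] and [1] followed by the new index D, matching the example above, and every primitive after the first still carries exactly one fully stated index.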
Specific to the compression of tessellated geometry, Jon Hasselgren, et al. proposed an alternative method of determining which polygons may exist within each tile using a bounding box approach (“Automatic Pre-Tessellation Culling”, ACM Trans. on Graph., Vol. 28, pp 19 (2009)). In this approach, interval mathematics is used to determine, for each patch, the range of possible displacements from the base patch that the tessellated surface could take if the tessellation were to be performed. Using this range of possible displacements a conservative bounding box can be created around a patch describing all of the possible locations of primitives from the patch. This bounding box can then be projected and tiled to determine in which tiles the patch may exist either wholly or partially after tessellation. While this method alleviates the need to store the expanded primitive geometry for the tessellated patch, the conservative nature of bounding boxes leads to patches being included in tiles where they will not exist either wholly or partially after tessellation. A bounding box for a patch can only indicate that the tessellated primitives subdivided from the patch might exist inside a tile after tessellation is performed. These unnecessary patches result in wasted computation and rendering later in the graphics pipeline. The bounding box method also cannot identify which primitives from the tessellated patch exist within each tile and therefore performs a less optimal tiling than a method that considers each primitive from the tessellated patch individually. The calculation of tight positional and normal bounds reduces the size of the bounding box of the tessellated primitives and therefore can be used to reduce the number of tiles that are included.
However, the calculation of bounds for a tessellated patch after displacement by a function and optional further displacement mapping processes is a computationally intensive process, and the result will tend towards infinite bounds for complex user programmable functions that include terms which are difficult to bound, such as random noise functions.