In a computer-generated image, there are typically a large number of individual polygons. Graphics rendering hardware, in particular 3D graphics hardware, often only has the capability to render triangle primitives or, occasionally, other convex polygons, that is to say, polygons in which all the internal angles are less than 180°. Such polygons are relatively straightforward to render. Vector graphics specifications such as SVG, by contrast, allow arbitrary polygons and include ‘fill rules’ which determine which parts of an arbitrary polygon are to be deemed interior and which exterior. SVG defines two such rules: ‘even-odd’ and ‘non-zero’. For brevity in this document, we will usually assume use of the ‘even-odd’ rule, but it will be clear to one skilled in the art that the techniques presented apply to other well-defined fill rules.
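By way of illustration only, the two fill rules can be sketched in software as follows. The function names, the ray-casting formulation and the half-open edge convention are assumptions made for this example and are not taken from any particular specification text.

```python
def crossings_and_winding(point, contour):
    """Cast a ray from `point` in the +x direction and return
    (number of edges crossed, signed winding number) for one closed contour."""
    px, py = point
    crossings = 0
    winding = 0
    n = len(contour)
    for i in range(n):
        x0, y0 = contour[i]
        x1, y1 = contour[(i + 1) % n]
        # Half-open test: the edge counts only if it straddles the line y = py.
        # Horizontal edges never satisfy this, so no division by zero occurs.
        if (y0 <= py) != (y1 <= py):
            x_hit = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
            if x_hit > px:
                crossings += 1
                winding += 1 if y1 > y0 else -1
    return crossings, winding


def is_interior(point, contours, rule="even-odd"):
    """Classify `point` against a polygon given as a list of contours."""
    total_crossings = 0
    total_winding = 0
    for contour in contours:
        c, w = crossings_and_winding(point, contour)
        total_crossings += c
        total_winding += w
    if rule == "even-odd":
        return total_crossings % 2 == 1
    if rule == "non-zero":
        return total_winding != 0
    raise ValueError("unknown fill rule: " + rule)
```

For two nested squares wound in the same direction, a point inside the inner square is exterior under ‘even-odd’ (two crossings) but interior under ‘non-zero’ (winding number of two), which is the kind of case in which the two rules differ.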
FIGS. 1a to 1e show some examples of ‘arbitrary’ polygons (including concave polygons, polygons with self-intersections and multiple contour polygons). FIGS. 1a and 1e each show an example of a concave polygon, that is to say, a polygon in which at least one of the internal angles is greater than 180°. FIG. 1b shows an example of a polygon with self-intersections, that is to say, a polygon in which not every part of each line segment between two vertices remains inside or on the boundary of the polygon. FIG. 1c shows an example of a polygon with multiple contours, that is to say, a polygon with a hole requiring an external and an internal contour to define the overall shape. FIG. 1d shows an example of a polygon including all these features.
The ability to render such arbitrary polygons, whilst also supporting convex polygons, is useful for a number of reasons, for example, to support vector graphics standards such as SVG (Scalable Vector Graphics) and OpenVG (Open Vector Graphics). SVG is a language for describing two-dimensional graphics and graphical applications in XML (Extensible Markup Language). OpenVG is a royalty-free application programming interface (API) designed for hardware-accelerated two-dimensional vector graphics. Naturally, any method able to render arbitrary polygons must also be able to handle convex polygons.
There are two families of methods with the capability of rendering arbitrary polygons on such hardware. The first family performs calculations in model space, and its members are generally referred to as triangulation algorithms. These take a polygon outline and produce a set of non-overlapping triangles that exactly cover the filled area of the original polygon. To avoid confusion with other uses of “triangulation” in this document, we will refer to such algorithms as “true triangulation”. An example of the possible results of such a process, as applied to the polygon of FIG. 1a, is shown in FIG. 2. The original shape can thus be constructed from the triangles: {[v2, v3, v5], [v3, v4, v5], [v5, v6, v2], [v6, v2, v7], [v7, v2, v1]}. Assuming a simple polygon with N sides and that no extra vertices are added (note that some algorithms do introduce additional vertices), we will obtain N−2 triangles. Once these triangles are generated, they can easily be rendered on any commodity graphics hardware.
Numerous algorithms for the “true triangulation” process have been published. Lamot and Zalik provide a survey of methods in “An overview of triangulation algorithms for simple polygons” (Information Visualization, 1999, pp 153-158). These documented methods are nearly always restricted to simple polygons such as FIGS. 1(a) and (e), i.e., polygons that contain neither self-intersections (including repeated vertices) nor multiple contours. Nevertheless, Meister's “ear cutting” (or ear clipping) algorithm and Seidel's method are of interest to this discussion.
Meister's method removes one vertex at a time from a (simple) polygon in such a way that the reduced polygon remains simple. It repeatedly ‘clips an ear’, formed by a triple of consecutive vertices, from the polygon. This algorithm runs in O(n^3) time and, although it has been subsequently improved to be O(n^2), it is not particularly attractive except for polygons with relatively few vertices.
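A minimal sketch of naive ear clipping is given below for illustration. It assumes a simple polygon supplied in counter-clockwise order with no degenerate (zero-area) corners; the function names are hypothetical and the code is not intended to reproduce Meister's published formulation or any of its optimised variants.

```python
def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b); positive if counter-clockwise."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def point_in_triangle(p, a, b, c):
    """True if p lies inside or on the boundary of counter-clockwise triangle (a, b, c)."""
    return cross(a, b, p) >= 0 and cross(b, c, p) >= 0 and cross(c, a, p) >= 0


def ear_clip(polygon):
    """Return index triples triangulating a simple counter-clockwise polygon."""
    remaining = list(range(len(polygon)))
    triangles = []
    while len(remaining) > 3:
        for k in range(len(remaining)):
            i0 = remaining[k - 1]
            i1 = remaining[k]
            i2 = remaining[(k + 1) % len(remaining)]
            a, b, c = polygon[i0], polygon[i1], polygon[i2]
            if cross(a, b, c) <= 0:
                continue  # reflex or degenerate corner, cannot be an ear
            others = (j for j in remaining if j not in (i0, i1, i2))
            if any(point_in_triangle(polygon[j], a, b, c) for j in others):
                continue  # another vertex lies inside, so this is not an ear
            triangles.append((i0, i1, i2))
            del remaining[k]  # clip the ear by removing its tip vertex
            break
        else:
            raise ValueError("no ear found; input may not be simple and counter-clockwise")
    triangles.append(tuple(remaining))
    return triangles
```

Each pass scans the remaining vertices and tests every candidate ear against every other vertex, which is where the cubic worst-case cost of the unoptimised approach comes from.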
Seidel's method, on the other hand, runs in O(n log* n) time for simple polygons, where log*(n) is defined as

$$\log^{*} n = \begin{cases} 0 & \text{if } n \le 1 \\ 1 + \log^{*}(\log n) & \text{if } n > 1 \end{cases}$$

We can thus consider O(n log* n) to be practically O(n) for any reasonable values of n.
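To give a feel for how slowly log* grows, the recursive definition can be evaluated directly. The sketch below assumes base-2 logarithms, which the text does not specify; the function name is chosen for this example only.

```python
import math


def log_star(n):
    """Iterated logarithm: how many times log2 must be applied before n <= 1."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count
```

Under this assumption, log_star(16) is 3 and log_star(65536) is only 4, so the log* factor behaves like a small constant for any polygon size likely to be met in practice.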
As stated above, very few ‘true triangulation’ algorithms have been published that handle arbitrary polygons. Held's method (“FIST: Fast Industrial-Strength Triangulation of Polygons” Algorithmica 30, 4, 563-596) is one of the few exceptions. Although based on ear clipping, additional structures are used to achieve a much better time complexity for simple polygons, but it is not clear to the inventor how it behaves in the presence of self-intersections etc.
The application's inventor has implemented a version of Seidel's algorithm that has been enhanced to support completely arbitrary polygons. This still achieves virtually linear performance (assuming the implicit vertices created by self-intersections are included in ‘n’). However, on a ~2 GHz CPU, the process still takes an average of 1 to 2 μs per polygon vertex. For polygons that will be drawn multiple times over numerous frames, the triangulation results can be cached, and so the pre-processing cost is amortised over the rendering process. For situations, however, where a polygon is only drawn a few times or is being dynamically altered on a frame-by-frame basis (which forces re-triangulation), the true triangulation process can be a very significant penalty. (Note that applying linear transformations to the model does not require re-triangulation.)
The second family of methods with the capability of rendering arbitrary polygons uses image space calculations. Here the rendering/sampling process itself is adapted to determine which pixels fall inside the arbitrary polygon and which are outside. Although this can be done with scan line rendering algorithms, we are primarily interested in those that break the arbitrary polygon into smaller polygons (usually triangles) for which the hardware has direct rendering support, render those smaller polygons and make use of the hardware stencil buffer to determine which of the rendered pixels are inside the original arbitrary polygon. It is well known in the art (“OpenGL programming guide: the official guide to learning OpenGL, version 1.4”, Shreiner et al, ISBN 0321173481) that arbitrary polygons can be drawn by using the stencil buffer. For example, one may implement the even-odd rule by applying XOR operations to the stencil. Similarly, provided triangle winding orders are taken into account, increments and decrements of the stencil can be used to implement the non-zero rule.
With either fill rule, one must first produce a set of triangles from the source polygon. The obvious approach is described in the “Drawing Filled, Concave Polygons Using the Stencil Buffer” section of chapter 13 of Shreiner et al (available at either http://fly.cc.fer.hr/~unreal/theredbook/chapter13.html or http://www.scribd.com/doc/7605395/Redbook). Here a triangle fan (refer to Chapter 2 of Shreiner et al or http://en.wikipedia.org/wiki/Trianglefan) is created by simply submitting the vertices in order, i.e. [v1, v2, v3, . . . vN], which implicitly creates the set of N−2 triangles with vertices {[v1, v2, v3], [v1, v3, v4], [v1, v4, v5], . . . [v1, vN-1, vN]}.
Borrowing the example (FIG. 1(a)) from Shreiner et al, part of this process is shown in FIG. 3. The seven-sided figure is rendered as a fan of five triangles. Assuming the even-odd fill rule, the pixels of the screen which are covered by an odd number of triangles will be deemed interior while those covered by an even number will be exterior. For example, the area bounded by v1, v3, and location 20 is covered by triangles [v1, v2, v3] and [v1, v3, v4]. Assuming that the stencil buffer is initialised to zero and an XOR operation is employed, drawing triangle [v1, v2, v3] will first set all the pixels' stencil values for region [v1, v3, “20”], but these will subsequently be cleared again by triangle [v1, v3, v4]. The region will thus be correctly deemed exterior to the polygon. The simplicity of this process is extremely appealing and, since it uses a triangle fan, it only requires the transmission of N vertices to the graphics hardware.
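The fan-plus-XOR process can be mimicked in software as a rough sketch. The code below assumes a one-bit stencil sampled at pixel centres and a half-open edge rule so that a pixel centre lying exactly on an edge shared by two fan triangles is counted consistently; real hardware uses its own rasterisation rules, and the names used here are hypothetical.

```python
def edge_crossed(p, a, b):
    """True if a ray cast from p in the +x direction crosses edge a-b (half-open in y)."""
    if a[1] > b[1]:
        a, b = b, a  # orient the edge upwards
    if not (a[1] <= p[1] < b[1]):
        return False  # the edge does not straddle the ray
    # p is strictly to the left of the upward edge when this cross product is positive
    return (b[0] - a[0]) * (p[1] - a[1]) - (p[0] - a[0]) * (b[1] - a[1]) > 0


def covers(p, contour):
    """Even-odd coverage of point p by a closed contour (a triangle or a polygon)."""
    n = len(contour)
    hits = sum(edge_crossed(p, contour[i], contour[(i + 1) % n]) for i in range(n))
    return hits % 2 == 1


def stencil_fan_xor(polygon, width, height):
    """Render the fan [v1, vi, vi+1] into a 1-bit stencil, toggling covered pixels."""
    stencil = [[0] * width for _ in range(height)]
    hub = polygon[0]
    for i in range(1, len(polygon) - 1):
        triangle = (hub, polygon[i], polygon[i + 1])
        for y in range(height):
            for x in range(width):
                if covers((x + 0.5, y + 0.5), triangle):
                    stencil[y][x] ^= 1  # XOR: a second covering clears the bit again
    return stencil
```

Pixels covered by two fan triangles are toggled twice and end at zero, so the final stencil matches an even-odd test made directly against the polygon outline.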
Once the stencil has been set to indicate which pixels are inside the polygon, those pixels must be filled with the appropriate colours or textures. Methods to do this include computing the 2D bounding rectangle of all the vertices of the polygon and then drawing just a single rectangle (with stencil test), or simply resending the generated triangles. The former, as applied to FIG. 1(a) and illustrated in FIG. 15(a), has the advantage of sending a near-minimal amount of geometric data to the hardware but requires pre-computation of the min and max bounds. It can also be expensive, in terms of wasted pixel processing, if the rectangle does not tightly bound the polygon to be filled, as shown by the region 415.
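The bounding-rectangle pre-computation, and a rough estimate of the pixel processing it wastes, might be sketched as follows. The wasted-fraction measure is an assumption introduced for this example; it relies on the shoelace formula and is therefore only meaningful for simple polygons.

```python
def bounding_rectangle(vertices):
    """Return (min_x, min_y, max_x, max_y) over all polygon vertices."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return min(xs), min(ys), max(xs), max(ys)


def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def wasted_fraction(vertices):
    """Fraction of the bounding rectangle that lies outside the polygon."""
    min_x, min_y, max_x, max_y = bounding_rectangle(vertices)
    rect_area = (max_x - min_x) * (max_y - min_y)
    return 1.0 - polygon_area(vertices) / rect_area
```

For a polygon with a deep notch, a sizeable share of the rectangle's pixels are processed only to be rejected by the stencil test, which is the cost referred to above.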
Another method, shown in FIG. 15(b), is to resend the generated triangles; in this example the set of triangles was generated using the invention's method (refer to FIG. 7). This sends more geometry than the bounding box method but generally results in less redundant pixel filling. In this example, much of the shape is filled with a single ‘layer/pass’ of pixels, 450, but there are regions where pixels are filled multiple times, 460. This typically becomes worse for polygons with more regions of concavity, or a greater total area of concavity.
The method also works unaltered for self-intersecting and multiple contour polygons; Shreiner et al also provide an example of the latter. In effect they just concatenate all the vertices of all the contours and treat the result as a larger triangle fan.
Despite the pleasing simplicity of this fan method, as described in the art, the inventor has appreciated that it has two fundamental problems. The first is related to the shape of the generated triangles. Producing a fan of triangles from the original polygon tends to lead to the production of long, thin triangles. Such a triangle is generally slower to render with graphics hardware than another that has an equal screen area but is ‘more equilateral’ in shape. One publication, “Silhouette clipping” (Sander et al, SIGGRAPH 2000, pages 327-334), discusses this problem and gives a partial solution. Sander et al also need to fill sets of contour edges. These are, in effect, polygons and are likely to have concavities. They state:
    “The basic algorithm described so far tends to draw many long, thin triangles. On many rasterizing chips (e.g. NVIDIA's TNT2), there is a large penalty for rendering such eccentric triangles. It is easy to show that the setStencil algorithm behaves best when the screen-space projection of q has a y coordinate at the median of the contour vertices. Choosing q as the 3D centroid of the contour vertices serves as a fast approximation”.
    . . . and . . .
    “Each edge contour is drawn as a fan of triangles about an arbitrary center point, which we choose to be the 3D centroid of the contour vertices.”
This typically does improve the shape of the triangles but unfortunately introduces an extra point, which thus requires the data to be read twice. It also creates an additional triangle in the fan. An example of the results of their process, as applied to FIG. 1 (a), is shown in FIG. 4. (Please note that the ‘centroid’ location, Vcentroid, is only an approximation in this illustration).
Sander et al suggest a further improvement:
    “To further reduce the eccentricity of the fan triangles, we break up each large contour into a set of smaller loops. More precisely, we pick two vertices on the contour, add to the data structure two opposing directed edges between these vertices, and proceed as before on the smaller loops thus formed.”
This is, unfortunately, quite vague. Firstly, they don't say how “we pick two vertices”. Secondly, in the context of the paper, “proceed as before on the smaller loops” would appear to imply the process of computing the ‘centroid’ of each loop and turning each into a fan. That does not seem correct as it would only produce two child loops.
A more likely interpretation is that they have a target, M, for the number of vertices per child ‘loop’ and divide the source polygon into sufficient equal pieces to meet that target number. An N-vertex source polygon would thus require P child polygons where P=⌊N/(M−1)⌋. With their scheme, if the source polygon is thus divided into P sections, then P additional vertices (each located at the centroid of its respective ‘loop’) are introduced. It should be noted that, since each child loop is drawn with a fan, there are practical reasons, described in the following paragraph, for not choosing too small a value for M.
Also of relevance to the invention are the methods by which contemporary rendering hardware reduces the triangle data and bus bandwidth when models are supplied to the rendering hardware. The simplest method is to supply each triangle as three, V-byte vertices so that, for a model with T triangles, 3*T*V bytes of data would be transmitted to the hardware, but more efficient options exist. We have already seen that 3D hardware typically supports the concept of triangle fans, whereby a T triangle fan only needs to supply (T+2)*V bytes of data. For 3D models, a related concept called triangle strips (again see Shreiner or http://en.wikipedia.org/wiki/Trianglestrip), is typically more useful. Like triangle fans, these also require only (T+2)*V bytes of data for a strip of T triangles. In both cases, the ratio of triangles to vertices climbs asymptotically towards 1:1 as the length of the strip or fan increases. Longer strips and fans are thus more efficient.
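The data volumes quoted above can be checked with a few lines of arithmetic. The function names, and the example values of T and V in the note that follows, are illustrative assumptions only.

```python
def bytes_independent(num_triangles, vertex_bytes):
    """Each triangle is sent as three full vertices: 3*T*V bytes."""
    return 3 * num_triangles * vertex_bytes


def bytes_fan_or_strip(num_triangles, vertex_bytes):
    """A fan or strip of T triangles needs only T+2 vertices: (T+2)*V bytes."""
    return (num_triangles + 2) * vertex_bytes


def triangles_per_vertex(num_triangles):
    """Ratio of triangles to vertices for a fan or strip; tends to 1 as T grows."""
    return num_triangles / (num_triangles + 2)
```

With T=5 and V=16, independent triangles need 240 bytes against 112 bytes for a fan or strip, and the ratio of triangles to vertices rises from 5/7 for this short fan towards 1:1 as the fan or strip is lengthened.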
Over the past decade, an indexed triangle format has been seeing increased popularity as a means of further decreasing the bandwidth and storage costs. Here each triangle is defined as three integer indices, each say of 16 or 32 bits, which select vertices from a vertex array. With 3D models, this format offers the opportunity to exceed the 1:1 barrier of strips and fans, though this is unlikely for 2D shapes. To efficiently support such a format, graphics hardware now frequently employs a vertex caching technique such as that described by Hoppe (“Optimization of mesh locality for transparent vertex caching”, Computer Graphics (SIGGRAPH '99 Proceedings) pp 269-276). In brief, the hardware maintains a cache of the last K vertices used in past triangles. A FIFO replacement policy is generally employed rather than, say, a Least Recently Used (LRU) scheme as the former is not only simpler but, more importantly, generally results in a higher cache hit rate for 3D models.
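A FIFO vertex cache of the kind outlined above can be modelled very simply. The sketch below is an assumption-laden illustration, not a description of any particular hardware: it counts how many vertices must be fetched (cache misses) for a list of indexed triangles, with a hypothetical cache size K passed as `cache_size`.

```python
from collections import deque


def count_vertex_fetches(indexed_triangles, cache_size):
    """Count cache misses for indexed triangles using a FIFO cache of K entries."""
    cache = deque(maxlen=cache_size)  # appending to a full deque drops the oldest entry
    misses = 0
    for triangle in indexed_triangles:
        for index in triangle:
            if index not in cache:
                misses += 1
                cache.append(index)
            # A hit leaves the queue order unchanged: this is FIFO, not LRU.
    return misses
```

For the five-triangle fan over seven vertices, a cache large enough to hold every vertex gives seven fetches, whereas a four-entry FIFO gives eight, because the hub vertex is eventually evicted and must be fetched again.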
We now return to the second, and probably far more significant, problem with the prior art fan algorithm, which is that it can require a disproportionate amount of “pixel filling”. For example, one can see from FIG. 3 that there is a relatively large area which is covered by multiple triangles compared to the ideal situation of FIG. 2 as generated by a ‘true triangulation’ method. We will refer to the areas covered by multiple triangles as ‘overdraw’. This overdraw is an undesirable burden in the rendering phase and it is advantageous to reduce it if possible.
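One rough way to put a number on this burden, offered here purely as an illustrative assumption, is to compare the total area rasterised by the fan triangles with the area of the polygon itself. The sketch assumes a simple polygon, for which the signed fan triangle areas sum to the polygon area; the name `fan_overdraw_factor` is hypothetical.

```python
def signed_area(a, b, c):
    """Signed area of triangle (a, b, c); positive if counter-clockwise."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]))


def fan_overdraw_factor(polygon):
    """Total area filled by the fan triangles divided by the polygon's own area."""
    hub = polygon[0]
    areas = [signed_area(hub, polygon[i], polygon[i + 1])
             for i in range(1, len(polygon) - 1)]
    filled = sum(abs(area) for area in areas)  # every triangle is rasterised in full
    covered = abs(sum(areas))                  # signed areas cancel to the polygon area
    return filled / covered
```

A factor of 1.0 corresponds to the ideal of a true triangulation, while a concave polygon whose fan contains oppositely wound triangles gives a value greater than 1.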
Simply using the less obvious triangle strip order, i.e. outputting the vertices in the order v1, v2, vN, v3, vN-1, v4 . . . and thus producing the triangles {[v1, v2, vN], [v2, vN, v3], [vN, v3, vN-1] . . . }, on average results in both better-shaped triangles and lower overdraw compared to fan order (although, ironically, not in the particular case of FIG. 1(a)), but the improvement is unfortunately not that great. From FIG. 4, Sander et al's method would also appear to reduce overdraw at the expense of introducing an additional vertex and triangle, but it certainly does not work in all cases. Applying their method to FIG. 1(e), where the centroid would be located in the centre of the “U”, would result in significant regions of overdraw, as shown in FIG. 5.
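The strip ordering described above can be generated as in the following sketch, which uses 0-based indices (so index 0 corresponds to v1 and index n−1 to vN). The function names are chosen for this example only.

```python
def strip_order(n):
    """Vertex indices in the order v1, v2, vN, v3, vN-1, v4, ... (0-based)."""
    order = [0]
    low, high = 1, n - 1
    take_low = True
    while low <= high:
        if take_low:
            order.append(low)
            low += 1
        else:
            order.append(high)
            high -= 1
        take_low = not take_low
    return order


def strip_triangles(order):
    """Each run of three consecutive strip vertices forms one triangle."""
    return [tuple(order[i:i + 3]) for i in range(len(order) - 2)]
```

For a seven-vertex polygon this yields the order [0, 1, 6, 2, 5, 3, 4] and five triangles, the same count of N−2 as the fan but zig-zagging across the polygon rather than radiating from a single vertex.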
The inventor has appreciated that there is a need for a method of producing a set of simpler polygons (usually, but not always, triangles) from an arbitrary polygon for rendering with a stencil buffer method which:
    Avoids pre-processing of the polygon data (if the pre-processing becomes expensive, one might as well use a true triangulation algorithm).
    Does not introduce additional vertices.
    On average, produces lower overdraw rates than the fan (or strip) methods.
    On average, produces ‘more equilateral’ shaped triangles than the fan/strip methods.
    Is simple to implement in both software and hardware and uses relatively few operations. It must, of course, be O(n).
It is an object of the present invention to provide a method and system that goes some way towards achieving the above goals.
In addition, for any method and system, the following features, though not necessarily essential, are desired:
    Geometry data transfer costs that are approximately equivalent to those of the fan/strip methods.
    The supplied primitives, e.g. triangles, should, preferably, be arranged in an order so that “chronologically close” primitives are also close in screen space so that caching (e.g. frame buffer caching) can be more effective.
    A method should not be substantially more complex than the fan method.