The field of computer graphics is a fast-changing one, in which trade-offs are made between the quality of the images being produced and the processing power of the computing platform being used. The trade-off becomes even more acute when the images need to be displayed in real time, for example in the video-gaming industry.
Rendering is the process of generating an image from a model of a scene, where the model is a description of three-dimensional objects. The rendering of an image on a computer screen entails vast amounts of processing. For example, it is necessary to have a so-called “global scene” (also known as the “world scene”) of the images to be displayed, which will hereinafter be referred to as the scene. Broadly speaking, a scene can be thought of as a snapshot or picture of the image to be displayed on the screen at any instant in time. As would be expected, a scene will itself comprise many different objects, each having its own geometry. For example, a global scene might be the interior of a particular room in a house. The room might have windows, furniture, a TV, etc. Each object (for example the TV, a table or a window) will have a different geometry and will need to be created with the correct dimensions and co-ordinates on the screen in relation to the other objects.
These objects are defined in three dimensions (3D), but will be rendered onto a two-dimensional (2D) computer screen. The technique for rendering a 3D object onto a 2D display involves firstly breaking down the 3D object into polygons defined by primitives. A popular primitive used is a triangle having three vertices. Other primitives can also be used, including points, lines or other polygons. Thus, a 3D image can be transformed into a plurality of, for example, triangles each being defined by a unique set of vertices where each vertex would typically contain information relating to co-ordinates (x, y, z), color, texture and lighting. The data defining the 3D image is continuous vector data. It should be understood that a fairly large storage area is needed to accommodate the vertex information.
The creation of an image on a display is performed by a “graphics pipeline”, which takes a geometrical representation of a 3D scene as an input and outputs a 2D image for display on a computer screen. Broadly speaking, the creation of an image on a computer graphics screen can be thought of as consisting of a geometry stage and a rendering stage. In existing systems, the geometry stage is responsible for transformation and lighting, in which the 3D object is converted into a number of polygons defined by a set of suitable primitives. Consider an interactive computer game in which the user controls the motion of a player. As the player moves forward or backward, the objects in the frame need to be transformed so that they appear closer to or further away from the user, respectively.
In the rendering stage, the transformed vertices are placed in a frame buffer in digital form. The frame buffer can in fact comprise a number of buffers, e.g. a color buffer, depth buffer, stencil buffer and accumulation buffer. The frame buffer needs to be continuously managed and updated as the frame (or scene) changes. The rendering stage comprises the process of rasterization. Rasterization describes the conversion from a vector representation to an x-y coordinate representation. This is the process of taking a two-dimensional image described in a vector format and converting it into pixels, where the pixels are the “dots” that make up the computer display (and correspond to the smallest discrete part of an image that can be displayed). The pixels are drawn and the frame buffer stores lighting, color and intensity information for each pixel that will be enabled. The digital frame data is then converted to an analogue signal to be used to provide the final image for the actual 2D computer display.
The problem of “aliasing” in two and three dimensional computer graphics is well known. When an image is rendered, aliasing is a result of the rendering process being a sampling procedure. Continuous vector data, such as the vertex positions of the primitives making up a scene in a three dimensional space, are effectively discretised as they are turned into screen pixels by the rendering process. Smooth polygon edges are drawn onto the display with what are known as “jaggies” because of insufficient pixel resolution. If the 3D images are animated, then the moving edges have “crawlies” as they jump from one pixel to the next, with instantaneous changes of color.
An example of this problem is shown in FIG. 1, which illustrates a grid 100 of 8×8 pixels representing a portion of a display. The display is rendering a representation of a black shape delimited by continuous line 102. In the example shown in FIG. 1, the color of each pixel is determined by taking a sample of the color at the center point of that pixel. As a result of the sampling, pixels that are fully within the black shape delimited by line 102 are colored black, such as the pixel labelled 104. Similarly, pixels outside the area of the black shape are colored white, such as the pixel labelled 106. The pixels at the border of the black shape (i.e. the pixels through which line 102 passes) are either black or white, depending on the color at the center point of the pixel. For example, the center of the pixel labelled 108 is inside the line 102, and this pixel is therefore colored black. Conversely, the center of the pixel labelled 110 is just outside the line 102, and this pixel is therefore colored white. The result is that the representation of the shape on the grid of pixels 100 has a jagged edge (hence “jaggies”).
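The center-point sampling described for FIG. 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the shape of FIG. 1 is not specified numerically in the text, so a disc of radius 3 centered on the grid stands in for it, and the names `point_sample` and `in_disc` are hypothetical.

```python
def point_sample(inside, width=8, height=8):
    """Color each pixel from a single sample taken at its center.

    `inside(x, y)` reports whether a continuous point lies within the shape.
    Returns rows of 0 (black, center inside the shape) or 1 (white).
    """
    grid = []
    for row in range(height):
        line = []
        for col in range(width):
            center_x, center_y = col + 0.5, row + 0.5  # pixel center
            line.append(0 if inside(center_x, center_y) else 1)
        grid.append(line)
    return grid


def in_disc(x, y, cx=4.0, cy=4.0, radius=3.0):
    """Assumed stand-in for the black shape of FIG. 1: a filled disc."""
    return (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2


image = point_sample(in_disc)
for line in image:
    # '#' marks a black pixel, '.' a white one
    print("".join("#" if value == 0 else "." for value in line))
```

Because every pixel ends up either fully black or fully white, the printed outline of the disc is a staircase rather than a smooth curve, which is the jagged edge described above.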
A way of minimising this problem is to use anti-aliasing techniques. Anti-aliasing techniques are processes applied as a part of the rendering stage which aim to improve the visual quality of the final displayed image. Anti-aliasing techniques can be divided into two distinct classes: edge anti-aliasing (also called per-primitive anti-aliasing) and full screen anti-aliasing.
Edge (or per-primitive) techniques use computations to blend the geometric edges of primitives (i.e. points, lines, triangles, polygons) in order to reduce aliasing effects. Although this method generally requires less computation than other methods, it can also produce lower quality results because it effectively ignores edges introduced by textures or by primitive intersections.
Full-screen anti-aliasing works on every fragment, regardless of its location with respect to the primitive (note: a fragment consists of the (X, Y) coordinates of a pixel on the final display surface, plus a collection of other necessary information such as color, relative depth (the Z coordinate), texture coordinates etc.). This can result in some wasted calculations in areas of continuous color but generally provides better overall results. The present application is primarily concerned with full-screen anti-aliasing.
In general, full-screen anti-aliasing works by generating more information than is necessary for the final displayed image (in a non-anti-aliased system) and then re-sampling this data. This is done by taking several samples of data from the continuous vector image per fragment (i.e. per pixel in the display), which are then combined to give the final result. The samples taken of a fragment may also be called sub-fragments. The formula for combining the samples can be expressed as follows:
$$p(x, y) = \sum_{i=1}^{n} w_i \, c(i, x, y) \qquad \text{(Equation 1)}$$
where p(x, y) is the final color of the pixel at (x, y), n is the number of samples per pixel, w_i is the weighting factor for sample i (in the range [0, 1]) and c(i, x, y) is the color of sample i taken for that pixel.
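As a minimal sketch of Equation 1, the weighted sum can be written directly in Python. The function name `resolve_pixel` and the representation of a color as a tuple of floating-point channels are assumptions made for illustration and do not come from the text.

```python
def resolve_pixel(samples, weights):
    """Combine n sample colors into one pixel color: p = sum_i w_i * c_i.

    `samples` is a list of colors (tuples of channel values), and
    `weights` is a list of n weighting factors, each in [0, 1].
    """
    if len(samples) != len(weights):
        raise ValueError("one weight is required per sample")
    channels = len(samples[0])
    return tuple(
        sum(w * color[k] for w, color in zip(weights, samples))
        for k in range(channels)
    )


# Four samples in a border pixel: three fall inside a black shape,
# one falls on the white background; all samples are weighted equally.
black, white = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
pixel = resolve_pixel([black, black, black, white], [0.25] * 4)
print(pixel)
```

In this example the pixel resolves to a dark grey, one quarter of the way from black to white, rather than being forced to either extreme.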
The precise location within the region defined by the pixel from where these n samples or sub-fragments are taken is determined by the sample pattern being used. Different sample patterns allow a trade-off between computational cost and visual quality. Using more samples per fragment increases visual quality, but leads to a much higher computational requirement, as each sample has to be processed by the rendering stage. Example known sample patterns are described hereinafter.
An illustrative example of full-screen anti-aliasing can be seen with reference to FIG. 2. This shows the same grid of pixels 100 as shown in FIG. 1, which is rendering the same representation of a black shape delimited by continuous line 102. However, the pixels through which the line 102 passes are no longer restricted to being either black or white, as was the case in FIG. 1. Rather, due to the anti-aliasing, these pixels take levels of grey that depend on the samples taken within the pixel and applied to Equation 1. As a result, the image does not have the jagged edges of FIG. 1, and the image as a whole is perceived as being of a higher quality. The pixel sizes of FIGS. 1 and 2 are, of course, greatly exaggerated compared to a real display. Furthermore, note that FIGS. 1 and 2 are shown as black and white images merely for illustrative purposes, and that the anti-aliasing techniques also apply to color images.
Another distinction that can be made between anti-aliasing techniques is between super-sampling and multisampling. In super-sampling, all fragment data is generated for all samples. This means all fragment data is re-sampled and contributes to the final pixel. In multisampling, sets of samples or sub-fragments may share particular parts of the fragment information. For example, a set of samples may all have the same texture coordinates but different color values. In fact, a multisampling scheme can share everything apart from color, although a penalty is paid in terms of some degradation in quality.
A set of six example sample patterns is illustrated in FIG. 3. The first sample pattern in FIG. 3 is point sampling 302. Point sampling corresponds to the non-anti-aliased case with one sample per display pixel, whereby the sample is taken from the center of the pixel. This is the type of pattern used to generate the non-anti-aliased image shown previously in FIG. 1.
The second and third sample patterns are denoted 1×2 sample (304) and 2×1 sample (306). These sample patterns both use two samples per pixel, wherein the 1×2 sample pattern has the two samples aligned vertically and the 2×1 sample pattern has the two samples aligned horizontally. Each of these two-sample patterns gives lower quality results on edges tending towards either the horizontal or the vertical (depending on which pattern is being used).
The fourth sample pattern shown in FIG. 3 is denoted 2×2 sample (308). The 2×2 sample pattern utilises four samples per pixel in a regular square. This pattern is generally accepted to give good results, although it may not provide particularly good resolution in the X and Y directions. The fifth sample pattern, called the Rotated Grid Super Sample (RGSS) pattern (310), is an attempt to address this problem with the 2×2 sample pattern. This pattern again uses four samples per pixel, but the samples are rotated with respect to the center of each pixel when compared to the 2×2 sample pattern.
The sixth sample pattern is a 4×4 checker pattern (312). This pattern uses eight samples per pixel. This type of pattern is typically only used where performance issues are far outweighed by the need for quality (such as for computer aided design (CAD) or design applications).
All of the sample patterns shown in FIG. 3 use a constant down-sampling weight for each sample (i.e. w_i = 1/n in Equation 1, above). For example, for the 2×2 sample pattern (308) a weighting value of 0.25 is used. This effectively means a box filter is being used for the down-sampling.
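The effect of the sample pattern under a box filter can be illustrated with a short Python sketch. The sample offsets below are assumptions: the text does not give coordinates for the patterns of FIG. 3, so commonly used positions within a unit pixel are substituted, and the names `PATTERNS`, `box_filter_coverage` and `distinct_levels` are hypothetical.

```python
# Assumed sample offsets within a unit pixel, (0, 0) top-left to (1, 1).
PATTERNS = {
    "point": [(0.5, 0.5)],
    "1x2": [(0.5, 0.25), (0.5, 0.75)],   # two samples aligned vertically
    "2x1": [(0.25, 0.5), (0.75, 0.5)],   # two samples aligned horizontally
    "2x2": [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)],
    "rgss": [(0.125, 0.625), (0.375, 0.125), (0.625, 0.875), (0.875, 0.375)],
}


def box_filter_coverage(pattern, inside):
    """Fraction of samples inside the shape, i.e. Equation 1 with w_i = 1/n
    applied to samples that are 1 inside the shape and 0 outside it."""
    hits = sum(1 for (x, y) in pattern if inside(x, y))
    return hits / len(pattern)


def distinct_levels(pattern, steps=64):
    """Sweep a vertical edge across the pixel and collect every coverage
    value the pattern can report for it."""
    levels = set()
    for k in range(steps + 1):
        edge_x = k / steps
        levels.add(box_filter_coverage(pattern, lambda x, y, e=edge_x: x < e))
    return sorted(levels)


for name, pattern in PATTERNS.items():
    print(name, distinct_levels(pattern))
```

With these assumed offsets, the 2×2 pattern can only report three coverage levels for a vertical edge because its samples share two X positions, whereas the rotated pattern places each of its four samples at a distinct X position and so reports five levels.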
The main disadvantage of all the patterns in FIG. 3 is that, in order to achieve a reasonable level of quality in the results, several samples per pixel are required. All of these samples need to be processed through the graphics pipeline from the rasterizer down. Having to process, for example, four samples per pixel results in a four-fold increase in memory bandwidth requirements, power, etc. Equally, this results in a reduction in performance by a similar factor.
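The scaling of the workload with the number of samples per pixel can be shown with simple arithmetic. In the sketch below, the 1920×1080 display resolution is an assumed figure chosen only for illustration, and the function name `samples_per_frame` is hypothetical; the per-pattern sample counts are those given above for FIG. 3.

```python
# Samples per pixel for the patterns of FIG. 3, as stated in the text.
SAMPLES_PER_PIXEL = {
    "point": 1,
    "1x2": 2,
    "2x1": 2,
    "2x2": 4,
    "RGSS": 4,
    "4x4 checker": 8,
}


def samples_per_frame(width, height, samples_per_pixel):
    """Number of samples the pipeline must process for one frame."""
    return width * height * samples_per_pixel


WIDTH, HEIGHT = 1920, 1080  # assumed display resolution
baseline = samples_per_frame(WIDTH, HEIGHT, 1)
for name, n in SAMPLES_PER_PIXEL.items():
    total = samples_per_frame(WIDTH, HEIGHT, n)
    print(f"{name:12s} {total:>10d} samples/frame ({total // baseline}x point sampling)")
```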
In order to address the problem of having to process large numbers of samples per pixel, sample patterns have been proposed that share samples between neighbouring pixels. This means that fewer samples must be processed by the fragment pipeline but a reasonable number of samples still contribute to each final display pixel.
Two examples of shared sample patterns are shown in FIG. 4. Both of these patterns use, on average, two samples per pixel. The “Quincunx” pattern 402 uses a slightly different paradigm to all of the other patterns discussed here in its use of weightings in the down-sampling. Instead of a constant value for each sample, the center sample is given a weighting of ½, and each of the four corner samples is given a weighting of ⅛. In sampling theory parlance this is a “tent filter”, which is an attempt to use a more accurate model of the ideal low-pass filter: the sinc filter.
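The Quincunx weighting and the sharing of corner samples can be sketched as follows. The data layout is an assumption made for illustration: center samples are held in a height × width grid, corner samples in a (height + 1) × (width + 1) grid so that each corner is shared by up to four neighbouring pixels, and colors are single grey values. The function names are hypothetical.

```python
def quincunx_resolve(centers, corners):
    """Resolve pixels from one center sample (weight 1/2) and four
    shared corner samples (weight 1/8 each)."""
    height, width = len(centers), len(centers[0])
    if len(corners) != height + 1 or any(len(r) != width + 1 for r in corners):
        raise ValueError("corners must be (height + 1) x (width + 1)")
    image = []
    for r in range(height):
        row = []
        for c in range(width):
            corner_sum = (corners[r][c] + corners[r][c + 1]
                          + corners[r + 1][c] + corners[r + 1][c + 1])
            row.append(0.5 * centers[r][c] + 0.125 * corner_sum)
        image.append(row)
    return image


def samples_per_pixel(width, height):
    """Average number of samples processed per display pixel."""
    total = width * height + (width + 1) * (height + 1)
    return total / (width * height)


# White center samples over black corner samples give mid-grey pixels.
print(quincunx_resolve([[1.0, 1.0], [1.0, 1.0]],
                       [[0.0] * 3 for _ in range(3)]))
print(samples_per_pixel(100, 100))
```

The weights sum to one (½ + 4 × ⅛), so a uniformly colored region keeps its color, and for a large grid the sample count approaches two per pixel even though five samples contribute to each pixel.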
The “Flipquad” pattern 404 uses a constant weighting, as with the patterns shown in FIG. 3. Effectively it is the RGSS pattern 310 except with the samples pushed out to the pixel edges to allow sharing of samples. The pattern alternates between adjacent pixels, which is intended to give better results for horizontal, vertical and 45° edges. It is reputed that the human visual system is more sensitive to quality issues on these kinds of edges.
Both of the patterns in FIG. 4 aim to provide quality on a par with that produced using the 2×2 or RGSS patterns (308, 310), whilst requiring a lower number of samples per pixel, and hence less computation.