3D graphic applications such as virtual reality, 3D games or in the more professional sphere modelling and animation nowadays already form part of the standard applications on PCs. A prerequisite for real time capability in that sector is the extreme increase in power of the processors in recent years and more recently the use of 3D graphic accelerators which take over special recurrent tasks in graphics generation. The processor is only still required to perform the task of generating the geometry description of the scene to be represented, everything else such as rasterization (generation of the pixels to be displayed) and shading (coloration of the pixels) being effected by the accelerator.
As such systems are nevertheless still limited in capacity, compromises have to be accepted between image quality and the real time requirement (at least 25 images per second for continuous motion). In general, more value is placed on a jerk-free representation, so that on the one hand objects are modelled only very roughly in order to keep down the number of polygons, and on the other hand display screen resolution is kept low in order to limit the number of pixels to be generated. With the nowadays conventional VESA resolution (640×480 pixels) and with animated image sequences, however, the aliasing effects which occur due to rasterization are still particularly conspicuous. Nonetheless they are tolerated, as conventional anti-aliasing approaches such as supersampling demand an excessively high level of memory and computing power.
Until a scene which is modelled in the computer can be represented on the display screen, a number of steps are required:
1. The data sets stored in the memory in respect of the objects to be represented must be transformed (scaled, rotated) and placed at the correct locations in the virtual scene (modelling transformation).
2. Starting from the position of the objects with respect to the angle of view of the viewer, objects which are guaranteed not to be visible are rejected and thus no longer taken into consideration. In that respect both entire objects which are outside the visible volume are eliminated (clipping) and also individual polygons of objects which face away from the viewer (backface removal).
3. The polygons modelled in world coordinates (mostly triangles) must now be converted into the image coordinate system, in which respect perspective distortion is effected in order to permit imaging which is as realistic as possible (viewing transformation).
4. The polygons which are now present in the image coordinates must be prepared in such a way that they can be processed by the renderer (for example calculation of gradients at edges etc./setup processing).
5. Calculation of the visible pixels on the display screen is then implemented in the rasterizer. For each pixel, not only is the position on the display screen (x, y) calculated, but the procedure also involves determining further parameters which are necessary for illumination and masking analysis (z-value, homogeneous parameter, texture coordinates, normals, etc./rasterization).
6. On the basis of the calculated parameters, the color values or chromaticity values of the pixels to be represented are now ascertained (lighting). That step is here implemented only when what is involved is a Phong renderer. If there is only a Gouraud renderer, that step is already implemented prior to transformation into the image coordinate system.
7. The calculated chromaticity values are then stored in the frame buffer if the z-value of the pixel specifies that the pixel lies in front of the pixel already stored at that position in the frame buffer (z-buffering). Prior to the storage operation the chromaticity values can be modified by means of blending with the value previously in the frame buffer, whereby for example modelling of semi-transparent objects becomes possible.
8. When all visible triangles have been rasterized the image to be represented on the screen is in the frame buffer. By way of the RAMDAC the image is linearly read out of the memory and converted into analog signals which are passed directly to the monitor (display process).
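Step 7 above can be illustrated by a minimal sketch. It assumes that a smaller z-value means closer to the viewer and that blending uses a single opacity value alpha; the function name write_fragment and the buffer layout are hypothetical and serve only for illustration.

```python
def write_fragment(frame, zbuf, x, y, z, color, alpha=1.0):
    """Store a fragment if it lies in front of what the buffers hold.

    Assumption: a smaller z-value means closer to the viewer.
    Returns True if the fragment was written.
    """
    if z >= zbuf[y][x]:
        return False  # hidden behind the pixel already stored
    old = frame[y][x]
    # Blend with the value previously in the frame buffer (alpha = 1.0 overwrites).
    frame[y][x] = tuple(alpha * c + (1.0 - alpha) * o for c, o in zip(color, old))
    zbuf[y][x] = z
    return True


WIDTH, HEIGHT = 4, 3
frame = [[(0.0, 0.0, 0.0)] * WIDTH for _ in range(HEIGHT)]
zbuf = [[float("inf")] * WIDTH for _ in range(HEIGHT)]

write_fragment(frame, zbuf, 1, 1, 0.8, (1.0, 0.0, 0.0))       # opaque red, far away
write_fragment(frame, zbuf, 1, 1, 0.9, (0.0, 1.0, 0.0))       # rejected: behind the red pixel
write_fragment(frame, zbuf, 1, 1, 0.5, (0.0, 0.0, 1.0), 0.5)  # semi-transparent blue in front
print(frame[1][1], zbuf[1][1])
```

In this sketch the green fragment is discarded by the z-test, while the blue fragment is mixed half and half with the red value already stored.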
The problem of aliasing arises out of the nowadays usual use of raster displays, as it is not possible with discrete dots or points to display for example an inclined edge precisely. In “normal” scan conversion a pixel (picture element) is set whenever the pixel center point is covered, whereby in the discrete case a continuous edge exhibits visible jumps at certain spacings, and such jumps are particularly apparent in the case of moving images as they move along the edge. Thus for example upon movement of an almost horizontal edge, the irritating effect occurs that, if the edge is slowly vertically displaced, the jumps move rapidly horizontally along the edge. It seems therefore that neither the direction of motion nor the speed of motion is right.
It is known from signal processing that a signal (in this case the image) can be correctly reproduced only when the sample rate is greater than double the maximum frequency that occurs (Shannon's sampling theorem). As however the sample rate is fixedly predetermined by the screen resolution, as a corollary only frequencies beneath half the sample rate are correctly reproduced; all frequencies thereabove contribute to aliasing.
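The consequence of the sampling theorem can be checked numerically. The following sketch assumes a one-dimensional cosine signal and a sample rate of 8 samples per unit length; the helper names alias_frequency and samples are introduced here purely for illustration.

```python
import math

def alias_frequency(f, fs):
    """Frequency that a sinusoid of frequency f appears to have at sample rate fs."""
    return abs(f - fs * round(f / fs))

def samples(f, fs, count):
    """Point samples of cos(2*pi*f*t) taken at t = k / fs."""
    return [math.cos(2 * math.pi * f * k / fs) for k in range(count)]

fs = 8.0
for f in (1.0, 3.0, 5.0, 7.0):
    print(f, "is reproduced as", alias_frequency(f, fs))

# At fs = 8 a frequency of 7 yields the same samples as a frequency of 1.
diff = max(abs(a - b) for a, b in zip(samples(7.0, fs, 16), samples(1.0, fs, 16)))
print("largest sample difference:", diff)
```

Frequencies below half the sample rate (1 and 3) keep their value, whereas 5 and 7 are folded back to 3 and 1 respectively.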
The problem of conventional sampling therefore involves the fact that a pixel is always considered as a point, whereby the name point sampling is also used. What is common to all anti-aliasing endeavours is that the pixel is now seen as something that is two-dimensional or areal; the color should therefore arise out of averaging of the coloration of the covered pixel area. Anti-aliasing now seeks to eliminate as far as possible or at least attenuate the problems arising out of representation.
In the case of aliasing at polygon edges, not all polygon edges are involved, but only those which are to be found at the edges of objects. Within objects, the polygons generally merge seamlessly together and if adjacent polygons are of similar coloring, it is of no importance at the edges between the polygons whether the pixels of the one polygon or the other are set.
Very small objects can disappear if the extent thereof is smaller than a pixel. This effect is particularly striking when due to a small displacement the object becomes visible as suddenly a pixel center point is affected. There is a kind of blinking effect which guides the attention of the viewer to it.
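The blinking effect can be reproduced with a one-dimensional sketch of point sampling. It assumes pixel centers at i + 0.5 and an object 0.4 pixel wide which is displaced in steps of 0.2 pixel; the function name point_sampled_row is hypothetical.

```python
def point_sampled_row(left, right, width):
    """Set a pixel only if its center (i + 0.5) lies inside [left, right)."""
    return [1 if left <= i + 0.5 < right else 0 for i in range(width)]

# An object 0.4 pixel wide sliding to the right in steps of 0.2 pixel.
for step in range(6):
    left = 1.0 + 0.2 * step
    print(round(left, 1), point_sampled_row(left, left + 0.4, 4))
```

The object is displayed only in those steps in which it happens to cover a pixel center and vanishes completely in the others.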
In the case of modern Phong renderers, additional aliasing effects occur at the spot limits. In the case of Gouraud renderers that problem does not occur as with such renderers no light effects like those shown herein are possible, whereby image quality is not comparable.
The most widespread approach to anti-aliasing is supersampling. Each pixel is subdivided into (n×n) subpixels which then again are normally “point-sampled”. The result is an intermediate image which in both dimensions has n-times the resolution of the image to be represented. Summing of the chromaticity values of the (n×n) subpixels and subsequent division by the number of subpixels (n²) then gives the final color of the pixel. From the point of view of signal processing the sample rate is increased by the factor n (also referred to as the oversampling factor), whereby smaller details can be reconstructed. Meaningful values for n are in the range of between 2 and 8 whereby between 4 and 64 chromaticity values per pixel are available.
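A minimal sketch of regular supersampling follows. It assumes that the scene is given as a function shade(x, y) returning a grey value for a continuous image position; that function and the test scene half_plane are hypothetical stand-ins for a real renderer.

```python
def supersample(shade, width, height, n):
    """Render width x height pixels with n x n point samples per pixel."""
    image = []
    for py in range(height):
        row = []
        for px in range(width):
            total = 0.0
            for sy in range(n):
                for sx in range(n):
                    # Subpixel centers on a regular n x n raster.
                    x = px + (sx + 0.5) / n
                    y = py + (sy + 0.5) / n
                    total += shade(x, y)
            row.append(total / (n * n))  # sum divided by the number of subpixels
        image.append(row)
    return image

def half_plane(x, y):
    """Hypothetical test scene: white where y > 0.5 * x, black elsewhere."""
    return 1.0 if y > 0.5 * x else 0.0

print(supersample(half_plane, 4, 2, 1))  # n = 1 is ordinary point sampling
print(supersample(half_plane, 4, 2, 4))  # 16 samples per pixel
```

With n = 1 every pixel is either black or white; with n = 4 the pixels crossed by the inclined edge receive intermediate grey values.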
In spite of the simplicity of this procedure there are various disadvantages, in consideration of which it was not possible to attain implementation in hardware:
1. Memory requirement: As the image is rendered at n-times the resolution in both dimensions, not only the frame buffer but also the z-buffer (24–32 bit/pixel) must be n² times as large.
For example, for a screen resolution of 1024×768 pixels and an oversampling factor of n=4, there is a memory requirement of 48 MB for the frame buffer and a further 36 MB for the z-buffer. In total therefore 84 MB of memory is required in comparison with 5.25 MB for normal sampling.
2. Computing time: Because of the larger number of pixels to be produced the computing time also increases by the factor n². If therefore the system were previously capable of displaying 16 images per second, with n=4 it now manages only one image per second; it has therefore lost its real time capability.
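The figures quoted in the two items above can be reproduced with a short calculation. The sketch assumes 32 bits per pixel for the frame buffer and 24 bits per pixel for the z-buffer, which is the combination that yields the stated values; the function name buffer_bytes is hypothetical.

```python
def buffer_bytes(width, height, n=1, color_bits=32, z_bits=24):
    """Frame buffer and z-buffer sizes in bytes for an n x n supersampled image."""
    samples = width * height * n * n
    return samples * color_bits // 8, samples * z_bits // 8

MB = 1024 * 1024

frame, z = buffer_bytes(1024, 768, n=4)
print(frame / MB, z / MB, (frame + z) / MB)   # supersampling with n = 4

frame1, z1 = buffer_bytes(1024, 768)
print((frame1 + z1) / MB)                     # normal sampling

rate = 16 / (4 * 4)                           # images per second left at n = 4
print(rate)
```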
In addition the procedure cannot guarantee that the image produced is free from artefacts, as for any sampling rate an image can easily be constructed which is guaranteed to be represented wrongly. If the image to be represented has a horizontal resolution of w, a vertical stripe pattern comprising (n×w) black and (n×w) white bars is represented either completely black or completely white.
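The stripe pattern example can be verified in one dimension. The sketch assumes bars of width 1/(2n) pixel, starting with a black bar at the left image border, and subpixel centers on a regular raster; exact fractions are used so that rounding plays no part. The names stripe and supersampled_pixel are hypothetical.

```python
from fractions import Fraction
import math

def stripe(x, n):
    """Pattern with 2*n bars per pixel: even bars black (0), odd bars white (1)."""
    return math.floor(x * 2 * n) % 2

def supersampled_pixel(px, n):
    """Average of n regularly spaced samples across pixel px."""
    values = [stripe(px + Fraction(2 * k + 1, 2 * n), n) for k in range(n)]
    return Fraction(sum(values), n)

for n in (2, 4, 8):
    print(n, [supersampled_pixel(px, n) for px in range(4)])
```

Every sample falls on a white bar, so each pixel comes out completely white although half of its area is black; shifting the pattern by one bar width would make the image completely black instead.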
In the case of stochastic supersampling the sample points are randomly distributed over the pixel whereby the artefacts which remain are superimposed with a noise which is more agreeable to the human eye.
An image which was rendered four times with a stochastic approach is of approximately the same image quality as a 16-times supersampling on a regular raster.
The process however is limited to a use in software terms as hardware renderers operate exclusively with incremental processes. With sample points which are positioned randomly there is no longer a fixed sequence of the points so that processing can no longer be implemented incrementally but the parameters per point would have to be completely newly calculated, which involves an extreme amount of time.
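A minimal software sketch of stochastic supersampling follows, assuming uniformly distributed sample positions within the pixel and a scene given as a function shade(x, y); both the function names and the fixed seed are assumptions made for illustration.

```python
import random

def stochastic_pixel(shade, px, py, samples, rng):
    """Average `samples` point samples at random positions inside pixel (px, py)."""
    total = 0.0
    for _ in range(samples):
        # Each sample position is drawn anew, so no incremental update is possible.
        total += shade(px + rng.random(), py + rng.random())
    return total / samples

def half_plane(x, y):
    """Hypothetical test scene: white where y > 0.5 * x, black elsewhere."""
    return 1.0 if y > 0.5 * x else 0.0

rng = random.Random(0)  # fixed seed so that the run is repeatable
estimate = stochastic_pixel(half_plane, 0, 0, 16, rng)
print(estimate)
```

The result is a noisy estimate of the covered area of the pixel, and the parameters of each sample have to be evaluated from scratch, which is the property that stands in the way of incremental hardware.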
Not all subpixels have to be treated in the same way in the post-processing pass; when summing the chromaticity values, it is also possible to introduce a factor which specifies how important the subpixel is for the pixel. The factors are ascertained in accordance with Gaussian, Poisson or another distribution in which generally subpixels which are closer to the pixel center point acquire higher weight.
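Weighted summing can be sketched as follows for a Gaussian distribution. The width sigma of 0.5 pixel and the function names are assumptions; the weights are normalized so that their sum is 1 and no further division is required.

```python
import math

def gaussian_weights(n, sigma=0.5):
    """n x n subpixel weights (row by row), normalized to sum to 1."""
    raw = []
    for sy in range(n):
        for sx in range(n):
            # Distance of the subpixel center from the pixel center.
            dx = (sx + 0.5) / n - 0.5
            dy = (sy + 0.5) / n - 0.5
            raw.append(math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma)))
    total = sum(raw)
    return [w / total for w in raw]

def weighted_pixel(subpixel_values, weights):
    """Final chromaticity value as the weighted sum of the subpixel values."""
    return sum(v * w for v, w in zip(subpixel_values, weights))

weights = gaussian_weights(4)
print(weights[0], weights[5])            # corner subpixel, subpixel near the center
print(weighted_pixel([1.0] * 16, weights))
```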
Supersampling is always effected in its basic form with the same sampling rate over the entire image. In uniform areas however a great deal of computing time is wasted, as there every subpixel yields the same chromaticity value. The idea in that respect is to implement supersampling only where it is truly necessary. In a first pass the image is rendered normally, and in a second pass each chromaticity value is then compared to those from its surroundings, and supersampling in respect of that pixel is effected only if the difference exceeds a predetermined threshold value. It will be appreciated that the disadvantage with that process is the double rendering of each polygon.
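The two-pass procedure can be sketched as follows. The sketch assumes a scene given as a function shade(x, y), a comparison with the up to eight direct neighbours, and a hypothetical test scene with a vertical edge at x = 2.25; all names are introduced for illustration only.

```python
def point_sample(shade, width, height):
    """First pass: one sample at each pixel center."""
    return [[shade(x + 0.5, y + 0.5) for x in range(width)] for y in range(height)]

def supersample_pixel(shade, px, py, n):
    total = sum(shade(px + (sx + 0.5) / n, py + (sy + 0.5) / n)
                for sy in range(n) for sx in range(n))
    return total / (n * n)

def adaptive_supersample(shade, width, height, n, threshold):
    first = point_sample(shade, width, height)
    result = [row[:] for row in first]
    refined = 0
    for y in range(height):
        for x in range(width):
            neighbours = [first[ny][nx]
                          for ny in range(max(0, y - 1), min(height, y + 2))
                          for nx in range(max(0, x - 1), min(width, x + 2))]
            # Second pass: supersample only where the surroundings differ enough.
            if max(abs(first[y][x] - v) for v in neighbours) > threshold:
                result[y][x] = supersample_pixel(shade, x, y, n)
                refined += 1
    return result, refined

def vertical_edge(x, y):
    """Hypothetical test scene: white to the left of x = 2.25."""
    return 1.0 if x < 2.25 else 0.0

image, refined = adaptive_supersample(vertical_edge, 6, 4, 4, 0.5)
print(refined, "of", 6 * 4, "pixels were supersampled")
print(image[0])
```

Only the two pixel columns adjoining the edge are supersampled; the uniform areas keep their point-sampled value.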
The accumulation buffer is a modification of the above-mentioned supersampling procedure, in which the extreme memory requirement is avoided. In addition to the frame and z-buffers which are present in any case, the procedure requires only one additional buffer of the size of the frame buffer, which however needs to be of a somewhat higher level of accuracy. Rendering of an image now requires n² rendering passes at the normal frame buffer resolution, with n again representing the oversampling factor. Between the calculations of the partial images, the coordinate system of the geometry to be represented is displaced in the subpixel range in such a way that the pixel center points each come to lie on another sample of the corresponding supersampling process. The frame buffer contents which are produced in each rendering pass are accumulated in the additional buffer (hence the name accumulation buffer), whereupon the frame and z-buffers are erased for the next partial image. As n² chromaticity values per pixel are summed in the accumulation buffer, 2·log₂(n) accuracy bits are required in addition to the frame buffer accuracy so that no overflow of the chromaticity values is possible. As soon as all partial images have been rendered, the chromaticity values from the accumulation buffer are divided by the number of samples (n²) and taken over into the frame buffer, which can then be displayed.
The use of the accumulation buffer instead of the frame buffer which is enormous in the supersampling procedure eliminates the disadvantage of the large memory requirement, but not the requirement in terms of computing time. In contrast the rendering time will be even further increased as now the geometry descriptions (mostly triangles) have to be transmitted a plurality of times to the renderer.
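The accumulation buffer procedure can be sketched as follows. For simplicity the sketch displaces the sample point within each pixel rather than the coordinate system of the geometry, which amounts to the same relative shift; the scene function and all names are hypothetical.

```python
import math

def render_pass(shade, width, height, dx, dy):
    """One pass at normal resolution with the sample point shifted to (dx, dy)."""
    return [[shade(x + dx, y + dy) for x in range(width)] for y in range(height)]

def accumulation_buffer(shade, width, height, n):
    accum = [[0.0] * width for _ in range(height)]  # the additional buffer
    for sy in range(n):
        for sx in range(n):
            # A fresh frame buffer is produced for every partial image.
            frame = render_pass(shade, width, height, (sx + 0.5) / n, (sy + 0.5) / n)
            for y in range(height):
                for x in range(width):
                    accum[y][x] += frame[y][x]
    # Divide by the number of samples and take over into the frame buffer.
    return [[v / (n * n) for v in row] for row in accum]

def half_plane(x, y):
    """Hypothetical test scene: white where y > 0.5 * x, black elsewhere."""
    return 1.0 if y > 0.5 * x else 0.0

n = 4
result = accumulation_buffer(half_plane, 4, 2, n)
extra_bits = 2 * math.log2(n)  # additional accuracy bits needed in the accumulation buffer
print(result[0], extra_bits)
```

Only buffers of normal resolution are held at any time, but the scene is evaluated n² times, which reflects the unchanged computing time requirement.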
Area sampling, developed by Edwin Catmull [Edwin Catmull: “A hidden-surface algorithm with anti-aliasing”, Aug. 78], is based on calculating for each pixel the area which is allotted to the individual polygons. That is effected by an analytical procedure so that each polygon, no matter how small, is considered, and furnishes a correspondingly high level of image quality.
The procedure of the algorithm can be described approximately in the following terms:
All polygons are arranged in accordance with their greatest y-coordinate.
A list of active polygons is managed, in which polygons are entered as soon as the scanline with the greatest y-value is reached, and from which they are erased again as soon as the value falls below the minimum y-value.
Per scanline, in which respect a scanline is considered in this case as something two-dimensional, for each pixel a bucket must be applied, in which all respective polygon portions which contribute to that pixel are entered. In that respect the polygon proportions are again represented by polygons which have been clipped in relation to the pixel edges. When constructing the buckets, care is taken to ensure that the respective polygons are sorted therein, in accordance with their z-values.
Per pixel, by means of the so-called “hidden-surface-algorithm” from Sutherland the visible area of the individual polygons is determined, the colors of which in weighted form then give the definitive pixel color.
Aspects which tell against a hardware implementation of that kind are the extreme computing expenditure for determining the visible area proportions or components in the case of lists of increasing length, and the scanline-based procedure in regard to rasterization.
The A-Buffer-Algorithm [Loren Carpenter: “The a-buffer, an anti-aliased hidden surface method”, Computer Graphics 12, 84] represents a discrete implementation of area sampling, in which it is not the exact area of the polygons that is stored but instead subpixel masks which represent an approximation of the areas. In addition traversing is effected in a polygon-wise manner so that once again a frame buffer is required, which however must accommodate a dynamic list per pixel, in which the subpixel masks are stored together with their chromaticity values. In that case, for each subpixel mask, the procedure involves storing either only one z-value (that of the center point of the pixel) or two z-values (minimum and maximum z-values occurring in the pixel), in order to limit the memory expenditure. As soon as all polygons have been processed, in a further pass through the image the subpixel masks are evaluated on the basis of the z-values so that the definitive chromaticity value is produced.
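The final pass over the per-pixel list can be sketched as follows. This is a simplified illustration and not Carpenter's algorithm in full: it assumes opaque fragments, one z-value per fragment (smaller meaning closer), grey values instead of colors and a 4×4 mask held in the bits of an integer; the name resolve_pixel is hypothetical.

```python
def resolve_pixel(fragments, background, subpixels=16):
    """Combine the fragment list of one pixel into a single grey value.

    fragments: list of (mask, z, value); mask has one bit per subpixel.
    """
    free = (1 << subpixels) - 1  # subpixels not yet covered by a nearer fragment
    value = 0.0
    for mask, z, frag_value in sorted(fragments, key=lambda f: f[1]):
        visible = mask & free
        # The number of visible subpixels approximates the visible area.
        value += bin(visible).count("1") / subpixels * frag_value
        free &= ~mask
    value += bin(free).count("1") / subpixels * background
    return value

fragments = [
    (0x00FF, 0.5, 1.0),  # covers the lower half of the mask, further away
    (0x0FF0, 0.2, 0.5),  # covers the middle half of the mask, closer
]
print(resolve_pixel(fragments, background=0.0))
```

The length of the list per pixel depends on the scene, which is the dynamic memory behavior referred to above.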
The constantly high memory requirement of supersampling was thus replaced by a memory requirement which dynamically adapts to the complexity of the scene. On account of that dynamic behavior and the resulting traversing of lists, this method is scarcely suitable for hardware implementation. Furthermore concealment analysis is found to cause problems because of the limited number of z-values. In [Andreas Schilling and Wolfgang Strasser: “EXACT: Algorithm and Hardware Architecture for an Improved A-Buffer”, Computer Graphics Proceedings, 93] a possible solution is set forth, by means of the additional storage of dz/dx and dz/dy on a subpixel level.
The approximation buffer from Lau [Wing Hung Lau: “The anti-aliased approximation buffer”, Aug. 94] also involves using the A-buffer approach, but in this case the number of fragments stored per pixel is limited to two. This therefore involves a constant memory expenditure which however is bought at the expense of losses in terms of image quality. Thus, only pixels which are covered by a maximum of two polygons are still correctly handled, as more proportions cannot be represented. The method is accordingly limited to scenes with a few large polygons, as then the case of more than two fragments occurs only very rarely in practice (less than 0.8% according to Lau) and thus the quality of the resulting images is sufficiently good.
The Exact Area Subpixel Algorithm (EASA by Andreas Schilling [Andreas Schilling: “Eine Prozessor-Pipeline zur Anwendung in der graphischen Datenverarbeitung”, Dissertation to the Eberhard-Karls University in Tübingen, June 94]) represents a modification of the A-buffer, in which a higher level of accuracy is achieved at edges of a polygon. Generation of the subpixel masks is effected in the A-buffer on the basis of concealment of the subpixel center points. Schilling in contrast calculates the exact area component, from which a number of subpixels to be set is derived, which then results in the generation of the actual mask on the basis of the gradient of the edge. This method can achieve a higher level of resolution in relation to almost horizontal (vertical) edges, as in the A-buffer a plurality of subpixels are always set all at once, so that it was not possible to utilise the maximum resolution of the subpixel mask.
The method of Patrick Baudisch [Patrick Baudisch: “Entwicklung und Implementierung eines effizienten, hardwarenahen Anti-aliasing-Algorithmus”, dissertation submitted for a diploma, the Institute of Science and Technology of Darmstadt, Sept. 94] is also based on subpixel masks which are generated at polygon edges. In this case however they do not serve to calculate the area of the individual polygons and thereby correspondingly to weight the color thereof as with the previous processes, but to add the adjacent colors from a normally point-sampled image. The basis adopted is spatial coherence of the pixels, that is to say the color which a polygon partially contributes to a pixel is guaranteed to be found in an adjacent pixel. The position of the subpixels in the mask specifies from which adjacent pixel admixing is to be effected if it is set. The subpixels on the diagonals refer in that case to two adjacent pixels.
In the four-vector method in each case 4 subpixels are combined together to form a meta-subpixel which in turn specifies admixing of the adjacent pixel color. Because of the combination step, the spatial information is lost but the accuracy of the area proportions is increased.
The previous anti-aliasing methods all have disadvantages in terms of their suitability for hardware implementation, and those disadvantages can be overcome only with difficulty. On the one hand an enormous memory expenditure is involved (supersampling, area sampling, A-buffer) and on the other hand the computing expenditure of some methods is so high (supersampling, accumulation buffer, area sampling) that hardware implementation in real time is scarcely possible. In addition, in methods which operate exclusively on polygon edges (area sampling, A-buffer, approximation buffer, EASA, subpixel methods, four-vector methods), billboard and spot artefacts are not taken into account.
The problems arising in the state of the art are to be once again briefly summarised as follows:
The problem of aliasing arises out of the use of raster displays as it is not possible with discrete points for example to exactly represent an inclined edge. Normal rasterization methods give rise to jumps at given spacings from a continuous edge in the discrete case, and such jumps severely adversely affect the visual impression.
A conventional approach for dealing with that effect is supersampling, in which the attempt is made to arrive at a better color by means of many sampling points within a pixel. The method however can scarcely be used in real time as a very high level of memory and computing time expenditure is necessary.
The other methods seek to determine a better chromaticity value by precisely calculating the contribution in terms of area of the polygons to each pixel. In this case also however a very high level of computing time is required (if not also memory space).
It is also known from U.S. Pat. No. 5,748,178 for the surroundings of a pixel to be stored in the passage in a kind of shift register, in which case mutually adjacent pixels also occupy adjacent memory places. A filtering action is achieved in that case by virtue of the fact that a respective pixel surrounding can then be subjected to a common filter weighting procedure. As the effectiveness of the method depends on which pixels happen to be adjacent to each other, effective anti-aliasing is not possible in this case.
It is also known from U.S. Pat. No. 5,264,838, for the purposes of anti-aliasing, to provide a respective environment of a pixel in the region of a pulse with a blurred or non-sharp environment (halo). That method however only produces an additional lack of sharpness as it acts in an undirected fashion.