The method most commonly used by commercial systems to generate textured and shaded real-time 3-D images uses a Z-buffer system. This is one of the simplest visible-surface algorithms to implement in either software or hardware and is used by such companies as Silicon Graphics, Evans & Sutherland, and Hewlett Packard.
It requires that we have available not only a frame buffer in which colour values are stored, but also a Z-buffer, with the same number of entries, in which a z-value is stored for each pixel. Polygons, normally triangles, are rendered into the frame buffer in arbitrary order. During scan-conversion, if the polygon point is not further from the viewer than the point already in the buffer, then the new point's textured and shaded colour is evaluated and the z-value replaces the old value. No pre-sorting is necessary and no object-object comparisons are required.
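The depth test described above can be sketched in a few lines. This is only an illustrative sketch: the function name `render_fragments`, the `(x, y, z, colour)` fragment tuple and the convention that a smaller z is nearer the viewer are assumptions made for the example, and the stored colour stands in for the full texturing and shading evaluation.

```python
import math

def render_fragments(width, height, fragments):
    """Depth-test fragments in arrival order, keeping the nearest per pixel.

    `fragments` is a hypothetical list of (x, y, z, colour) tuples such as
    a scan-converter might emit; smaller z is assumed to be nearer.
    """
    frame = [[None] * width for _ in range(height)]
    zbuf = [[math.inf] * width for _ in range(height)]
    shaded = 0  # colour evaluations performed, including ones overwritten later
    for x, y, z, colour in fragments:
        if z <= zbuf[y][x]:        # "not further from the viewer"
            zbuf[y][x] = z         # new z-value replaces the old one
            frame[y][x] = colour   # stands in for texturing and shading
            shaded += 1
    return frame, zbuf, shaded
```

With three fragments arriving at the same pixel at depths 5, 2 and 9, the middle one is kept, but two colour evaluations are performed, which illustrates why arbitrary ordering can waste shading work.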
The term "Texture Mapping" refers to the process of transforming a 2D array of pixels onto another 2D array of pixels. The transformation can be completely arbitrary, but in this document we will consider perspective mapping. Perspective mapping effectively takes a 2D array of pixels, rotates and translates them into 3D and then projects them onto a z=n plane for display.
Texture mapping is used in computer graphics to try to mimic the surface detail of real objects. Using perspective mapping, it is possible to place a picture (e.g. the surface of wood) onto a 3D object (e.g. a table). The result is that the table has the appearance of wood.
The equation for perspective mapping is given below. ##EQU1##
`u` and `v` are the 2D coordinates in texture space.
`x` and `y` are the 2D coordinates in screen space.
`a`, `b`, `c`, `d`, `e`, and `f` are the coefficients used in the mapping process.
Sampling a signal that has frequencies above half the sample rate causes aliasing. Texture mapping is a re-sampling process which can easily cause aliasing. To produce a texture mapped image that does not alias, one has to low pass filter the image in texture space to half the sample rate of the mapping. A complication to the situation is that the sample rate may vary depending on the position in screen space.
To correctly filter the image, one requires a large amount of processing. There are many ways to approximate the correct filtering with mathematically simpler operations. The simplest method is MIP Mapping, where MIP stands for multum in parvo (Latin for "much in little").
MIP Mapping requires a pre-processing stage where a texture map stored in memory is filtered and decimated to half the resolution. This is repeated until the resulting image is 1 pixel in size (this does assume that the texture is square and a power of 2). FIGS. 1(a) through 1(q) show an example of a brick texture at 128×128 resolution with the associated lower MIP Map levels.
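The pre-processing stage can be sketched as repeated filter-and-decimate steps. The sketch below assumes a 2×2 box filter (a simple average), which is chosen only for brevity; the text does not specify the filter. The function name `build_mip_pyramid` and the list-of-lists texture representation are likewise illustrative.

```python
def build_mip_pyramid(texture):
    """Return [level0, level1, ...] ending with a 1x1 image.

    `texture` is a square list of lists of numbers whose side is a power
    of 2. Each level averages 2x2 blocks of the previous level (a box
    filter, assumed here only for simplicity).
    """
    size = len(texture)
    if size == 0 or size & (size - 1) or any(len(row) != size for row in texture):
        raise ValueError("texture must be square with a power-of-2 side")
    levels = [texture]
    while len(levels[-1]) > 1:
        src = levels[-1]
        half = len(src) // 2
        dst = [[(src[2 * y][2 * x] + src[2 * y][2 * x + 1] +
                 src[2 * y + 1][2 * x] + src[2 * y + 1][2 * x + 1]) / 4.0
                for x in range(half)]
               for y in range(half)]
        levels.append(dst)
    return levels
```

For a 128×128 texture this yields eight levels, with sides 128, 64, 32, 16, 8, 4, 2 and 1.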
A MIP Map can be thought of as a pyramid of images. The MIP Map is accessed through 3 variables `u`, `v` and `D`. The variables `u` and `v` are the coordinates of the pixel in texture space that is required to be anti-aliased. The variable `D` is a measure of how heavily filtered that pixel is required to be, i.e. how high up the pyramid. The value D=1 means that the full resolution image is used. The value D=2 means the half resolution image is used, and so on. When values of `D` are not powers of 2, a blend of the two nearest MIP Map levels is calculated using linear interpolation in a well-known manner.
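One possible reading of the level selection and blend is sketched below. It assumes that the blend weight varies linearly in `D` between the two bracketing powers of 2, and that texels are fetched by nearest-neighbour lookup with integer `u`, `v` given at full resolution. Both are assumptions for illustration, as are the names `mip_blend` and `sample_mip`.

```python
def mip_blend(D, num_levels):
    """Return (lower_level, upper_level, weight_of_upper) for a given D."""
    D = max(1.0, float(D))
    lower = 0
    while lower + 1 < num_levels and 2 ** (lower + 1) <= D:
        lower += 1
    if lower + 1 >= num_levels:
        return lower, lower, 0.0       # clamped at the top of the pyramid
    base = 2 ** lower
    weight = (D - base) / base         # 0 at D = 2^lower, approaching 1 at 2^(lower+1)
    return lower, lower + 1, weight

def sample_mip(levels, u, v, D):
    """Blend the two nearest levels at integer texel (u, v) of level 0."""
    lo, hi, w = mip_blend(D, len(levels))
    a = levels[lo][v >> lo][u >> lo]   # coordinates halve at each level
    b = levels[hi][v >> hi][u >> hi]
    return (1.0 - w) * a + w * b
```

Under these assumptions, D=3 selects the half and quarter resolution levels with equal weight.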
Ideally, `D` is the square root of the area that the screen pixel covers when it is transformed into texture space. Unfortunately, this is extremely expensive to calculate on a pixel by pixel basis.
P. S. Heckbert (Comm ACM 18 (6) June 1975) suggested using the following approximation for `D`: ##EQU2##
Although this method is far simpler than the ideal, it still requires a square root, and two additional mappings per pixel.
Shading is a term used in computer graphics which refers to part of the process of evaluating the colour of a surface at a particular screen position. When a surface is shaded, the position, orientation, and characteristics of the surface are used in the modelling of how a light source interacts with that surface.
The realism of computer generated images depends considerably on the quality of this illumination model. Unfortunately an accurate lighting model is too expensive to use in a real time rendering system. It is therefore necessary to compromise and use approximations in the lighting calculations.
The most common form of approximation is diffuse or Lambert shading, which simulates the light reflected from matte components of materials. This is illustrated in FIG. 2 and the following equation: $I = L \, S \cos\theta$
FIG. 2 illustrates a ray emanating from the eye or camera, which passes through a particular screen pixel and strikes a surface at a given point, P. It also shows the normal, a vector perpendicular to the surface at P, and a ray from a light to P.
The equation states that the intensity (or colour), I, of the diffuse illumination at P is the product of the light's intensity, L, the diffuse reflectivity of the surface, S, and the cosine of the angle, $\theta$, between the light ray and the normal.
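As a minimal sketch, the diffuse term can be computed from the surface normal and the direction to the light, since the cosine of the angle between two unit vectors is their dot product. Clamping negative cosines to zero (surfaces facing away from the light) is an assumption added here, and the function names are illustrative.

```python
import math

def normalise(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def diffuse_intensity(light_intensity, reflectivity, normal, to_light):
    """I = L * S * cos(theta), with cos(theta) taken from a dot product."""
    n = normalise(normal)
    l = normalise(to_light)
    cos_theta = sum(a * b for a, b in zip(n, l))
    # Assumption: a surface facing away from the light receives nothing.
    return light_intensity * reflectivity * max(0.0, cos_theta)
```

For example, with L=1 and S=0.8, a light directly along the normal gives an intensity of 0.8, and a light at 60 degrees to the normal gives half of that.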
This calculation can either be assumed constant for an entire surface, which is known as flat shading, or can be interpolated in various ways across polygonal facets to simulate curved surfaces. Gouraud shading (Comm ACM 18(60) pp. 311-17, 1971), for example, is the linear interpolation of the diffuse colours across the polygonal facets.
An alternative manner of interpolation is to linearly interpolate the surface normals in the above equation. This method was proposed by Bui Tuong Phong (Comm ACM 18(6) June 1975). Unfortunately, this computation is relatively expensive. An effective approximation has been proposed by Bishop and Weimer (Computer Graphics 20(4) pp. 103-6, 1986) which uses 2 dimensional Taylor series approximations. These approximations can be evaluated efficiently using difference equations. Apart from a small precalculation overhead, cos $\theta$ can be approximated at the cost of only 2 additions per pixel.
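The "2 additions per pixel" cost follows from evaluating a quadratic by forward differences. The sketch below is not Bishop and Weimer's formulation; it only shows the general technique on a one-dimensional quadratic along a scanline, with illustrative coefficient names `a`, `b` and `c`.

```python
def quadratic_by_differences(a, b, c, count):
    """Evaluate a*x*x + b*x + c at x = 0 .. count-1 with two additions per step."""
    value = c           # f(0)
    delta = a + b       # first difference f(1) - f(0)
    delta2 = 2 * a      # second difference, constant for a quadratic
    out = []
    for _ in range(count):
        out.append(value)
        value += delta      # addition 1: step the function value
        delta += delta2     # addition 2: step the first difference
    return out
```

After the small setup cost of computing the initial value and differences, no multiplications are needed inside the loop.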
The next most common lighting feature that is simulated is glossy or specular highlights. These are the reflections of the lights in a scene. Specular highlights give surfaces a reflective appearance, and greatly improve the realism of images. They are most commonly approximated by a method again described by Bui Tuong Phong, which is illustrated in FIG. 3 and in the following equation: $I = L \, S \cos^{n}\theta$
FIG. 3 illustrates a `ray` emanating from the eye or camera, which passes through a particular pixel, and strikes a surface at a given point, P. It is then reflected in the shown direction using the usual mirror angles. Note that because surfaces are not perfectly smooth, the reflected ray is assumed to spread out to some degree, governed by the roughness of the surface. Finally, FIG. 3 also shows a ray coming from a light which will appear as a highlight. The closer the reflected direction is to the light, the brighter the highlight will appear.
The equation describes the intensity of the highlight, I, as a function of the intensity of the light, L, the reflectivity of the surface, S, the angle between the reflected direction and the light, $\theta$, and the smoothness of the surface, n. This function means that as the reflected direction approaches the light direction (i.e. $\theta$ gets small), the intensity of the highlight reaches a maximum, and falls off in other areas. The larger the value of n, the faster the fall off, the smaller the highlight and hence the smoother the surface appears. This calculation is referred to hereafter as the power function. Typical values of n range from 1 to 1000.
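The effect of n on the size of the highlight can be illustrated numerically. The sketch below assumes the highlight term has the form L·S·cos^n θ, as the description above suggests, and the helper `half_intensity_angle_deg` is a hypothetical measure introduced only for this example: the angle at which the highlight has fallen to half of its peak.

```python
import math

def specular_intensity(light_intensity, reflectivity, cos_theta, n):
    """Highlight intensity L * S * cos(theta)**n, with cos(theta) clamped to [0, 1]."""
    x = min(1.0, max(0.0, cos_theta))
    return light_intensity * reflectivity * x ** n

def half_intensity_angle_deg(n):
    """Angle theta, in degrees, at which cos(theta)**n has fallen to one half."""
    return math.degrees(math.acos(0.5 ** (1.0 / n)))
```

With n=1 the half-intensity angle is 60 degrees, while with n=1000 it is roughly 2 degrees, so a larger n gives a much tighter highlight.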
Bishop and Weimer again propose computing part of the above equation using 2 dimensional Taylor series approximations (see diffuse shading), and then using a look-up table to calculate the power function.
DRAM is an acronym for Dynamic Random Access Memory. The structure of `page mode` DRAM can be broken down into banks, pages and locations. A location is the atomic addressable memory element. A number of contiguous locations make up a page, and a number of pages make up a bank.
When a device requests the contents of a location, the page that contains that location is opened and the data is fetched. If the next location requested falls within the same page, the page does not have to be re-opened before fetching the data. As the time taken to open a page is significant, it is much faster to request random locations within a page than to request random locations outside a page. The term `page break` refers to fetching the contents of a location from within a page that is not open.
In a memory configuration that has multiple banks, it is possible to keep more than one page open at a time. The restriction is that the open pages have to be in different banks.
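The cost model described above can be sketched as a simple counter of page breaks over a sequence of addresses. The layout assumed here (consecutive pages interleaved across banks, one open page per bank, and the first access to a bank counted as a page break) is hypothetical and serves only to illustrate the behaviour.

```python
def count_page_breaks(addresses, page_size, num_banks=1):
    """Count accesses that land in a page which is not currently open.

    Hypothetical layout: consecutive pages are interleaved across banks,
    and each bank keeps exactly one page open at a time.
    """
    open_page = {}                  # bank index -> page currently open
    breaks = 0
    for addr in addresses:
        page = addr // page_size
        bank = page % num_banks
        if open_page.get(bank) != page:
            breaks += 1             # the page must be opened before the fetch
            open_page[bank] = page
    return breaks
```

Alternating between two pages causes a page break on every access with a single bank, but only two page breaks in total when the two pages fall in different banks.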
SRAM is an acronym for Static Random Access Memory. The structure of SRAM is simpler than that of DRAM. Any location within the device can be fetched within a single clock cycle. The disadvantage of this simplicity is that the density of memory elements on the silicon is much lower than that of DRAM. As a result, SRAM is considerably more expensive than DRAM.
The atmospheric effect of fogging adds a dimension of realism to a generated 3D scene. It is not just useful for scenes that look foggy; all outdoor scenes require fog. In an outdoor scene, objects in the distance have less vibrant colours. This effect is the same as fogging: the objects `grey out` with distance.
If one assumes that the fog does not vary in density, then light is attenuated as a percentage per meter. For example, if the light is attenuated by 1% per meter, in 1 meter the light will be 0.99 of what it was. In 2 meters, the light will be $0.99 \times 0.99 = 0.99^{2} \approx 0.98$ of the original. In `n` meters, the light will be $0.99^{n}$ of the original.
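The attenuation can be sketched directly as an exponential in distance. The blend of the surviving surface colour with a fog colour in `apply_fog` is an assumption made for this example (the text only states how much light survives), and the function and parameter names are illustrative.

```python
def fog_transmission(distance_m, loss_per_metre=0.01):
    """Fraction of light surviving after `distance_m` metres of uniform fog."""
    return (1.0 - loss_per_metre) ** distance_m

def apply_fog(colour, fog_colour, distance_m, loss_per_metre=0.01):
    """Blend a colour toward the fog colour (the blend itself is an assumption)."""
    t = fog_transmission(distance_m, loss_per_metre)
    return tuple(t * c + (1.0 - t) * f for c, f in zip(colour, fog_colour))
```

With a loss of 1% per meter, `fog_transmission(2)` gives 0.9801, matching the worked figures above, and an object at distance 0 keeps its original colour.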
A problem with texturing and shading in conjunction with a Z-buffering system is that each screen pixel may have its colour evaluated many times. This is because the surfaces are processed in arbitrary order and the closest surface to the eye may change many times during the processing of the scene.
A deferred texturing and shading architecture embodying one aspect of the present invention eliminates the wasted calculation by only evaluating a single colour for each screen pixel. Each surface within the scene has a unique `tag` associated with it. To process a pixel, the hidden surface removal is performed which results in the tag of the closest surface to that pixel being calculated. This tag value is used to fetch the instructions that will correctly texture and shade that pixel.
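The two-pass idea can be sketched as follows. This is a software illustration under stated assumptions, not the architecture of the embodiment: the `(x, y, z, tag)` fragment tuple, the `shade(tag, x, y)` callback and the convention that a smaller z is nearer are all hypothetical.

```python
import math

def render_deferred(width, height, fragments, shade):
    """Resolve visibility first, then shade each covered pixel exactly once.

    `fragments` is a hypothetical list of (x, y, z, tag) tuples and
    `shade(tag, x, y)` a hypothetical callback returning a colour.
    """
    tags = [[None] * width for _ in range(height)]
    zbuf = [[math.inf] * width for _ in range(height)]
    for x, y, z, tag in fragments:      # pass 1: hidden surface removal only
        if z <= zbuf[y][x]:
            zbuf[y][x] = z
            tags[y][x] = tag
    frame = [[None] * width for _ in range(height)]
    for y in range(height):             # pass 2: texture and shade visible tags
        for x in range(width):
            if tags[y][x] is not None:
                frame[y][x] = shade(tags[y][x], x, y)
    return frame
```

However many surfaces overlap a pixel, `shade` is called at most once for it, using the tag of the closest surface.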
The embodiment described in this document provides a highly efficient and optimised architecture which particularly lends itself to systems that require deferred texturing and shading. For a complex system, such as the one described here, it is very important to ensure that the best balance between complexity (hardware size/silicon area) and functionality is carefully achieved. This means that the overall system architecture, including interactions between major blocks, as well as the detail of low level methods used in each block, must be carefully designed and selected. Of particular importance are:
1. Data organisation and memory architectures that ensure minimisation of bottlenecks;
2. Partitioning of resources in terms of various functional blocks, memory cache requirements, and their interaction;
3. Complexity levels in terms of numerical precision, and the extent of approximations used in the method used to implement the functions required in each block;
4. Support for adequate levels of flexibility within hardware/Silicon budget.
The embodiment described here uses careful algorithmic and simulation analysis to strike a balance across the above issues.
One of the rate determining steps of a system that performs MIP-mapping is fetching the texture pixels from memory. To keep the bandwidth high, some designs use fast SRAM. Although this can deliver the desired performance, the cost of such a solution is prohibitive.
The proposed architecture in a preferred embodiment of the invention stores the MIP-Map data in page mode DRAM. To obtain the highest data throughput, the number of page breaks has to be reduced to a minimum. The simplest method of storing a MIP-Map would be to arrange the D levels contiguously and, within the maps, to store the pixels in scan order. Because of the nature of access to the MIP-Maps, this method of storage would break page on almost every access. A memory structure is described that minimises the number of page breaks.
The coefficients of the texturing equation described above are potentially unbounded. Even when sensible bounds are enforced, a large range of values is required. Implementing the equation using purely fixed point arithmetic would need high precision multipliers and dividers. Such hardware would be large and slow, thus impacting on the price and performance of the system. A mixed floating and fixed point method embodying another aspect of the invention is described, which has two advantages. Firstly, the size of the multipliers and dividers can be reduced, and secondly, the complexity of implementing full floating point arithmetic is avoided.
The texture mapping equation requires one division per pixel. Most hardware division architectures use some successive approximation method which requires iterating to the desired accuracy. This iteration has to be performed in time (multiple clock cycles), or in space (silicon area). A new method and architecture is described embodying a further aspect of the invention for evaluating a reciprocal to the desired accuracy without any successive approximation.
The variable `D` is a measure of how heavily a textured pixel needs to be filtered. Heckbert's approximation to `D` is quite computationally expensive, requiring a square root and two additional mappings per pixel. A new, simpler method of calculating `D` is described.
Fetching the parameters used in the texturing and shading process impacts on the rate at which the pixels can be processed. The order in which the parameters are required is not completely random, and a locality of reference can be exploited. The architecture described includes a parameter cache which significantly improves the performance of the overall design.
To control the intensity or size of a specular highlight, a power function is generally used. This can be expensive to implement: the two possibilities are an approximation through a large ROM lookup table, or an explicit implementation of an accurate power function. Since the highlight equation is just an approximation, the function does not have to be a perfectly accurate power function.
Assuming a method of calculating or approximating cos $\theta$ exists, the power function is merely the computation of $x^{n}$, where $0 \le x \le 1$.
It is also not necessary to have fine control over the granularity of n; a few integer values will suffice. Noting that
$$(1-y)^{2^{n}} \approx (1-2y)^{2^{n-1}}$$
when y is small, the following approximation will be used to raise x to a power n, when x is in the above range:
$$x^{n} \approx \left(1 - \min\left(1,\ (1-x)\cdot 2^{k}\right)\right)^{2}$$
where k is the integer part of $(\log_{2} n) - 1$. The value k effectively replaces the value n as part of the surface properties.
Computation of the two subtractions is relatively inexpensive. Multiplying a value by $2^{k}$ in binary notation is merely a shift left by k places, and the min computation, a clamp that keeps the term being squared from going negative, is also a trivial operation. The square operation is then a multiplication of a value by itself.
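The operation count can be seen in a short sketch of the approximation. The clamp is written here with `min`, on the assumption that its purpose is to stop the term 1 − (1−x)·2^k from going negative before it is squared. Floating point is used for readability, whereas a hardware implementation would use a fixed point shift. The function names are illustrative.

```python
import math

def shift_for_exponent(n):
    """k = integer part of log2(n) - 1, intended for n >= 2."""
    return int(math.log2(n)) - 1

def approx_pow(x, k):
    """Approximate x**n for n close to 2**(k+1), with 0 <= x <= 1."""
    y = (1.0 - x) * (2 ** k)    # subtraction 1, then a left shift by k in fixed point
    t = 1.0 - min(1.0, y)       # clamp (assumed to be min), then subtraction 2
    return t * t                # one multiplication for the square
```

For k=0 the result is exactly x squared. For larger k the curve is only a rough match to the true power function, but it keeps the required shape: it equals 1 at x=1 and falls to 0 over a range of x that narrows as k increases.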
The method is applicable to both hardware and firmware implementations.
Calculating the attenuation factor when fogging a scene requires the density of the fog to be raised to the power of the distance between the object and the viewer. An efficient method of approximating this function is described.