When attempting to increase the performance of Graphics Processing Units (GPUs), one important approach is to reduce memory bandwidth consumption, i.e. the bandwidth required between the memory and the GPU. Bandwidth reduction is also becoming increasingly important because processing power is growing much faster than the bandwidth and latency of Random Access Memories (RAMs) are improving.
Although it is sometimes possible to trade computations for memory accesses, for example by computing the value of functions rather than accessing pre-computed lookup tables, it is likely that at some point the computation needs are satisfied, leaving the GPU idly waiting for memory access requests. Additionally, a brute-force approach of simply duplicating memory banks and increasing the number of pins on memory chips may not be feasible in the long run. Finally, transferring data between the GPU and the RAM consumes large amounts of power, which is a problem especially in mobile applications. For these reasons, memory bandwidth reduction algorithms are an important area of future research.
One type of image used in graphics applications is referred to as a texture. A texture is just a regular image that is used to represent the surface of a graphics primitive such as a triangle or a quadrilateral (quad). Since a texture is a type of image, it consists of pixels. However, since the final rendered image also consists of pixels, it is common to use the name “texel” (texture element) for an image element in the texture. In order to draw a pixel in the rendered image, one must first work out which position in the texture it corresponds to. Often, however, the pixel will not correspond exactly to a texel in the texture, but will fall somewhere in between four texels. Bilinear filtering is then usually performed between these four texels to produce the pixel. This is referred to in the literature as texture mapping using bilinear filtering. Another complication is that the resolution of the rendered image may be different from the resolution of the texture. For instance, a portion of a rendered image may occupy 100×100 pixels whereas the matching texture may have a size of 512×512 texels. Rendering from such a big texture may produce aliasing artifacts. Therefore, a preprocessing step typically averages and subsamples the texture to a set of lower resolutions, for instance 256×256, 128×128, 64×64, 32×32, 16×16, 8×8, 4×4, 2×2 and 1×1 texels. These levels are often called mipmap levels, with the 512×512 version being the highest resolution mipmap level and the 1×1 version being the lowest resolution mipmap level. During rendering, the two mipmap levels closest in resolution to the rendered portion are then used. If it is again assumed that the portion in the rendered image occupies 100×100 pixels, the GPU will use the mipmap levels of 128×128 texels and 64×64 texels. In each of these, the GPU will determine the nearest four texels, combine them bilinearly, and then combine the two resulting values linearly. This is referred to in the literature as trilinear mipmapping.
As should be clear from the above, this means that up to eight texels may have to be processed in order to produce one rendered pixel.
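The trilinear lookup described above can be sketched as follows. This is a minimal illustration rather than a description of any particular GPU: it assumes a square, single-channel texture whose side is a power of two, texel centers at half-integer positions, clamping at the borders, and a hypothetical footprint parameter giving the number of rendered pixels that the texture covers along one axis. All function names are chosen for illustration only.

```python
import math

def build_mipmaps(texture):
    """Average 2x2 texel groups repeatedly until a 1x1 level remains."""
    levels = [texture]
    while len(levels[-1]) > 1:
        src = levels[-1]
        half = len(src) // 2
        levels.append([
            [(src[2 * y][2 * x] + src[2 * y][2 * x + 1] +
              src[2 * y + 1][2 * x] + src[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(half)]
            for y in range(half)
        ])
    return levels

def bilinear(level, u, v):
    """Blend the four texels nearest to normalized coordinates (u, v)."""
    size = len(level)
    # Assumed convention: texel centers at half-integers, clamped borders.
    x = min(max(u * size - 0.5, 0.0), size - 1.0)
    y = min(max(v * size - 0.5, 0.0), size - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, size - 1), min(y0 + 1, size - 1)
    fx, fy = x - x0, y - y0
    top = level[y0][x0] * (1 - fx) + level[y0][x1] * fx
    bottom = level[y1][x0] * (1 - fx) + level[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def level_of_detail(base_size, footprint):
    """Fractional mipmap level; 0 is the highest resolution level."""
    return max(math.log2(base_size / footprint), 0.0)

def trilinear(levels, u, v, footprint):
    """Bilinear lookups in the two closest levels, blended linearly."""
    lod = min(level_of_detail(len(levels[0]), footprint), len(levels) - 1.0)
    lo = int(lod)
    hi = min(lo + 1, len(levels) - 1)
    t = lod - lo
    return bilinear(levels[lo], u, v) * (1 - t) + bilinear(levels[hi], u, v) * t
```

For a 512×512 texture and a footprint of 100 pixels, level_of_detail returns a value between 2 and 3, so levels 2 (128×128) and 3 (64×64) are sampled, four texels each, which gives the eight texels per pixel mentioned above.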
Texture compression is one popular way of reducing bandwidth requirements. By storing textures in compressed form in memory and transferring blocks of this compressed data over the bus, texture bandwidth is substantially reduced.
A light map is a common type of texture. It is applied to an object to simulate the distance-related attenuation of a local light source. For instance, computer games use light maps to simulate the effects of local light sources, both stationary and moving.
Traditionally, light maps have been used to model slowly varying lighting behavior in an economical way. A typical example is a textured brick wall. If only one texture is used, the texture has to be of very high resolution in order to reproduce the details of individual bricks. To avoid big textures, an obvious trick is to repeat the brick pattern, which means that the brick texture can be small and still of high resolution with respect to the screen resolution. The drawback is that every repetition of the pattern then has to be exactly the same: a brick in the top left part of the wall must be represented by the same texels as a brick in the lower right corner. The lighting will thus be uniform across the entire wall, which often looks unrealistic.
Light maps were created to get around this problem. Two textures were used: one small, repeated texture of high resolution, and one small, non-repeated texture of lower resolution. The final texel with which to color the rendered pixel can then be calculated using a formula such as:
final_color(x,y) = brick_texture(x+i×N, y+j×M) × lightmap_texture(x/S, y/T)
where brick_texture is the repeated texture of N×M texels, lightmap_texture is the low-resolution texture, and S and T are the factors by which the light map is downscaled in x and y. Here i is selected so that x+i×N is never bigger than N−1 and never smaller than 0 (sometimes by using a negatively valued i). Likewise, j is selected so that y+j×M is between 0 and M−1. This way, both brick_texture and lightmap_texture can be small and thus require little bandwidth during rendering. This works because the lighting captured in a light map usually changes rather slowly across the surface, so lowering its resolution is acceptable.
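As a rough sketch of the formula above, the following assumes scalar-valued textures stored as lists of rows, integer pixel coordinates, and nearest-texel lookup instead of filtering. The clamping of the light map index at the borders is an added assumption, not something the formula specifies.

```python
def final_color(brick_texture, lightmap_texture, x, y, S, T):
    """Modulate a repeated detail texture by a low-resolution light map."""
    M = len(brick_texture)        # height of the repeated texture
    N = len(brick_texture[0])     # width of the repeated texture
    # x % N equals x + i*N for the integer i (possibly negative) that
    # brings the result into the range 0..N-1; likewise for y and M.
    bx = x % N
    by = y % M
    # The light map is S times smaller in x and T times smaller in y.
    # Clamping to the valid index range is an assumption of this sketch.
    lx = max(min(x // S, len(lightmap_texture[0]) - 1), 0)
    ly = max(min(y // T, len(lightmap_texture) - 1), 0)
    return brick_texture[by][bx] * lightmap_texture[ly][lx]

# Hypothetical data: a 2x2 repeated pattern and a 2x1 light map that
# covers an 8x4 pixel wall (S = T = 4).
brick = [[1.0, 0.5],
         [0.5, 1.0]]
lightmap = [[1.0, 0.25]]
lit = final_color(brick, lightmap, 2, 0, 4, 4)     # brightly lit half
dark = final_color(brick, lightmap, 5, 1, 4, 4)    # dimly lit half
```

Both lookups hit a brick texel with value 1.0, yet the results differ because the light map scales them differently, which is the non-uniform lighting that the single repeated texture could not provide.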
Early light maps were scalar valued, i.e. each texel contained only an intensity value that decreased the intensity of the other texture. Soon came colored light maps, in which each texel contained an RGB (Red-Green-Blue) tuple and which were therefore able to simulate colored light.
Recent developments have increased the photorealism of light maps by describing the incoming light in three different directions. Hence, instead of just storing a single RGB tuple in the texel, which describes the average (colored) lighting hitting a particular point on the texture, three RGB tuples are stored. Each RGB tuple now describes the light that shines on a particular point from a particular direction. Together with a normal map, which describes the surface normal at that point, the fragment shader can then determine which of these three light directions is most relevant and compute the fragment (pixel) color accordingly. With an additional trick, even texture self-shadowing is possible.
Whereas these recent developments increase photorealism, they also demand three times the storage and bandwidth, since three RGB tuples are stored instead of one. Moreover, many applications demand high dynamic range data (single-precision or half-precision floating-point values instead of integers), further increasing the burden on bandwidth and storage. Therefore, light maps are increasingly compressed using texture compression methods, such as DXT1 as disclosed in U.S. Pat. No. 5,956,431. DXT1 is a texture compression method which converts a 4×4 block of pixels to 64 bits, resulting in a compression ratio of 6:1 with 24-bit RGB input data. DXT1 is a lossy compression algorithm and therefore degrades image quality, an effect which is offset by the ability to increase texture resolution while maintaining the same memory requirements (or even lowering them).
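The storage figures in this paragraph can be checked with a few lines of arithmetic. The sketch below uses the block size and bit counts stated above; the 8 bits per channel for integer data follows from the 24-bit RGB figure, and the 16 bits per channel for half-precision data is the standard size of that format. The function names are illustrative only.

```python
BLOCK_TEXELS = 4 * 4          # one DXT1 block covers 4x4 texels
DXT1_BLOCK_BITS = 64          # compressed size of one block

def bits_per_texel(tuples_per_texel, bits_per_channel):
    """Uncompressed size of one texel holding RGB tuples."""
    return tuples_per_texel * 3 * bits_per_channel

def compression_ratio(uncompressed_bits_per_texel):
    """Ratio of the uncompressed block size to the 64-bit DXT1 block."""
    return BLOCK_TEXELS * uncompressed_bits_per_texel / DXT1_BLOCK_BITS

single_rgb = bits_per_texel(1, 8)         # 24 bits: one 8-bit RGB tuple
directional_rgb = bits_per_texel(3, 8)    # 72 bits: three RGB tuples
directional_half = bits_per_texel(3, 16)  # 144 bits: three half-float tuples

ratio_rgb = compression_ratio(single_rgb)  # 384 / 64 = 6.0
```

A 4×4 block of 24-bit texels occupies 384 bits, so 64 bits per block corresponds to the 6:1 ratio quoted above, or 4 bits per texel.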
While DXT1-compression usually gives quite good image quality for regular textures, there are cases where DXT1 does not do a very good job. The most important one is perhaps slow transitions between two colors. Since DXT1 cannot have more than four different colors per 4×4 block, it is impossible to create very smooth ramps between colors. The result will be “grainy” or “dirty”-looking transitions.
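The four-color limit can be illustrated with a simplified model of one DXT1 block in its opaque mode, where two endpoint colors are stored and two further colors are interpolated at one third and two thirds between them. The sketch below is not a conforming codec: it keeps the endpoints as 8-bit tuples instead of the RGB565 values DXT1 actually stores, it takes the endpoints as given instead of searching for good ones, and all function names are hypothetical.

```python
def dxt1_palette(c0, c1):
    """Four colors available to a block: two endpoints, two blends."""
    c2 = tuple((2 * a + b) // 3 for a, b in zip(c0, c1))
    c3 = tuple((a + 2 * b) // 3 for a, b in zip(c0, c1))
    return [c0, c1, c2, c3]

def encode_block(texels, c0, c1):
    """Pick, for each of the 16 texels, the index of the closest color."""
    palette = dxt1_palette(c0, c1)

    def distance(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    return [min(range(4), key=lambda i: distance(palette[i], t))
            for t in texels]

def decode_block(indices, c0, c1):
    """Rebuild the texels from the 2-bit indices and the endpoints."""
    palette = dxt1_palette(c0, c1)
    return [palette[i] for i in indices]

# A smooth grey ramp with 16 distinct values, one per texel of the block.
ramp = [(v, v, v) for v in range(0, 256, 17)]
indices = encode_block(ramp, ramp[0], ramp[-1])
decoded = decode_block(indices, ramp[0], ramp[-1])
distinct_before = len(set(ramp))
distinct_after = len(set(decoded))
```

The 16 input greys collapse onto the four palette entries, so a ramp that was smooth before compression turns into a few visible steps within the block.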
There is therefore still a need for texture compression and decompression systems, and in particular for systems that are adapted for handling light maps and other textures for which DXT1 is not advantageous.