In a graphic computer system, texture mapping (i.e., "texturing") can increase the realism of three-dimensional (3D) images generated by the system. Example textures can represent wood grains, bricks, carpets, stone walls, grass, and the like. Textures provide a more realistic rendering of the surfaces of image objects. Texturing is performed using an array of texture elements (texels) stored as a texture map in computer memories. The texture map can be synthesized, or obtained from a scanned image.
A low-resolution texture map may include 64×64 texels, while a high-resolution map may have 4096×4096 texels. Typically, texels are stored as words of data, and the address of each word indicates a particular coordinate of the texture map. The data can represent color (RGB) and, perhaps, transparency information.
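As an illustration of how a word address can indicate a texel coordinate, the following minimal sketch assumes a row-major layout with one word per texel. The layout, the function names, and the `base` parameter are assumptions made for this example; the text above does not specify a particular addressing scheme.

```python
def texel_address(u, v, width, base=0):
    """Word address of texel (u, v), assuming a row-major map `width` texels wide."""
    return base + v * width + u


def texel_coordinate(address, width, base=0):
    """Inverse mapping: recover (u, v) from a word address under the same assumption."""
    offset = address - base
    return offset % width, offset // width
```

For a 64×64 map, the texel at column 3, row 2 would then occupy word 2 × 64 + 3 = 131.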
To display a graphic image including objects having textured surfaces, graphic software and hardware convert the surfaces to an array of screen coordinates associated with displayable picture elements (pixels). The pixel coordinates are used to locate corresponding texels of the texture map. The color and transparency values of the corresponding texels are merged with pixel data to determine the final color and transparency values of the displayed pixels. Texel coordinates interior to surfaces can be obtained by interpolating texel coordinates supplied at the vertices of objects.
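The interpolation of texel coordinates from vertex values can be sketched as follows. This example assumes a triangular surface and simple affine (barycentric) weighting. It ignores perspective correction, and the function name and argument layout are hypothetical.

```python
def interpolate_texcoord(weights, vertex_texcoords):
    """Blend three vertex texel coordinates using barycentric weights.

    weights          -- (w0, w1, w2), assumed to sum to 1 for a point inside the triangle
    vertex_texcoords -- ((s0, t0), (s1, t1), (s2, t2)) supplied at the vertices
    """
    w0, w1, w2 = weights
    (s0, t0), (s1, t1), (s2, t2) = vertex_texcoords
    s = w0 * s0 + w1 * s1 + w2 * s2
    t = w0 * t0 + w1 * t1 + w2 * t2
    return s, t
```

A pixel located exactly at a vertex receives that vertex's coordinate, and a pixel midway along an edge receives the average of the two edge endpoints.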
In low-quality texture mapping, called "point-sampling," only one texel is used for each of the pixels of the image. As a result, point-sampled textured images tend to have annoying aliasing artifacts in the form of discernible discontinuities in their textured surfaces. This is particularly true if the surface to be textured is highly distorted, for example, a surface of a 3D image which extends into the distance.
For high-quality texturing, such as tri-linear texture mapping, multiple texture maps may be used, e.g., "multum in parvo" (much in a small space) maps, or Mipmaps. For example, a first high-resolution Mipmap of a texture has 1024×1024 texels, the next lesser resolution Mipmap has 512×512 texels, the next 256×256 texels, and so forth, all the way down to a 1×1 low-resolution single-texel Mipmap, for a total of 11 Mipmaps representing a specific texture. From these multiple maps, textures for distorted surfaces can be smoothly interpolated, even if zooming is used to increase or decrease the size of the object to give a sense of three dimensions.
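The count of 11 maps can be checked with a short sketch that halves the edge length until a single texel remains. The function name is hypothetical, and square power-of-two maps are assumed, as in the example above.

```python
def mip_chain(size):
    """List the edge lengths of every Mipmap level, from `size` down to 1."""
    levels = []
    while size >= 1:
        levels.append(size)
        size //= 2  # each level has half the edge length of the previous one
    return levels


levels = mip_chain(1024)
total_texels = sum(edge * edge for edge in levels)
```

Under these assumptions the whole chain holds about one third more texels than the 1024×1024 base map alone.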
High quality texture mapping may require the mapping of eight or sixteen texels to a single pixel. This means that for every pixel the system must access texel data at eight or sixteen memory addresses. Clearly, texturing can consume a large amount of bandwidth of memory systems.
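To make the eight-texel case concrete, the sketch below lists the texels that a tri-linear filter might read for one pixel: a 2×2 neighborhood on each of two adjacent Mipmap levels. The half-texel offset, the clamping at the map edges, and all names are assumptions of this example, not details taken from the text.

```python
import math


def trilinear_fetches(s, t, lod, base_size):
    """Return the (level, u, v) texels read for one pixel.

    s, t      -- normalized texture coordinates in [0, 1)
    lod       -- non-negative fractional level of detail
    base_size -- edge length of the finest map (assumed a power of two)
    """
    max_level = base_size.bit_length() - 1
    lo = min(int(math.floor(lod)), max_level)
    hi = min(lo + 1, max_level)
    fetches = []
    for level in (lo, hi):
        size = base_size >> level
        u0 = int(math.floor(s * size - 0.5))
        v0 = int(math.floor(t * size - 0.5))
        for dv in (0, 1):
            for du in (0, 1):
                # Clamp to the map edges (an assumed addressing mode).
                u = min(max(u0 + du, 0), size - 1)
                v = min(max(v0 + dv, 0), size - 1)
                fetches.append((level, u, v))
    return fetches
```

Each call yields eight entries, so a frame of N textured pixels would imply roughly 8N texel reads if nothing is reused between pixels.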
It would be desirable to reduce the memory bandwidth requirements of texture mapping. This might allow the use of fewer memory chips devoted to textures, the use of less expensive general-purpose low-speed dynamic random access memories (DRAM) to store textures, and the storage of textures in the same memory as used for other data during image generation.
Some prior art texture mapping devices store texture maps in dedicated high-speed static random access memories (SRAM). Each read request for texel data makes an access to the SRAM, even if successive data are read from the identical address. SRAMs specifically designed for texture mapping tend to be expensive, highly integrated into the graphic hardware, and of limited functionality.
In modern DRAM, the sense amplifiers can be used to "cache" data. Caching can take advantage of spatial and temporal localities of data. For example, if a sequence of texel addresses are all in the same DRAM page, then the data can be accessed directly from the sense amplifiers.
With DRAM, the memory bandwidth for fetches from the same memory page can approach that of SRAM. However, if there is a "miss" on an address of a current page, then another page needs to be accessed. Switching between DRAM pages may require several processor cycles while the data of the next page are fetched and latched into the sense amplifiers. This increases access latencies. Such latencies can be hidden by using long pipelines in the access path if the average bandwidth of the memory system is sufficient to handle page fetches.
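The effect of page locality can be illustrated with a toy cost model in which an access to the currently open page is cheap and a switch to another page is expensive. The cycle counts, the page size, and the function name are illustrative assumptions only and do not describe any particular DRAM part.

```python
def dram_access_cycles(addresses, page_words, hit_cycles=1, miss_cycles=8):
    """Return (total_cycles, page_misses) for a sequence of word addresses.

    Only one page is assumed to be held in the sense amplifiers at a time.
    """
    open_page = None
    cycles = 0
    misses = 0
    for addr in addresses:
        page = addr // page_words
        if page == open_page:
            cycles += hit_cycles      # data already latched in the sense amplifiers
        else:
            cycles += miss_cycles     # a new page must be fetched and latched
            misses += 1
            open_page = page
    return cycles, misses
```

With these assumed costs, four accesses within one page take 8 + 3 = 11 cycles, while four accesses that alternate between two pages take 32 cycles.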
Adding a true cache to the memory system may reduce bandwidth requirements when there are good spatial and temporal localities in the data. However, implementing a cache memory for graphic devices is difficult. If the cache is configured as a traditional blocking cache, then a miss will "stall" further accesses, since the miss must be completely serviced before the next access request can be accepted. This is because the fetched data must be latched somewhere before further requests can proceed. If there are a large number of misses, then stalls will cause the memory system to deliver less bandwidth for texel fetches than a pipelined non-cached memory system.
If miss-servicing bookkeeping logic is included, then the cache can be made non-blocking. However, a non-blocking cache can be a hindrance if it cannot track as many misses as there are stages in the pipeline, e.g., from the read request to the data becoming available.
The latency and cache size problems become even worse when the memories store not only texel and pixel data but also other information. If the memories are configured, as desired, from general-purpose low-cost DRAM, then many different types of graphic information can be stored there. However, in this case, access requests to the various buffers should be batched to avoid page "thrashing." Batched accesses may delay requests for texel data, further increasing latencies and the number of misses that need to be tracked.
A direct-mapped cache could be used. However, it should be quite large so that addresses are adequately distributed throughout the cache. If the cache is tens of lines, then some data may remain unused for extended periods of time wasting cache, while other data experience frequent conflicts degrading performance. Larger caches also increase cost.
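The conflict problem can be seen in a minimal direct-mapped model in which each address maps to exactly one line. The one-texel line size, the modulo index function, and the class name are assumptions chosen for brevity.

```python
class DirectMappedCache:
    """Toy direct-mapped cache: one word per line, index = address mod lines."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.tags = [None] * num_lines
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Return True on a hit; on a miss, replace the line's contents."""
        index = address % self.num_lines
        tag = address // self.num_lines
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag
        self.misses += 1
        return False
```

In a 16-line instance of this model, addresses 0 and 16 share line 0, so alternating between them misses every time while the other 15 lines sit idle.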
A non-blocking fully associative content addressable memory (CAM), which uses the full address of the data as a tag, may be more appropriate for texture mapping. But even a fully associative cache needs to be a reasonable size, and latencies on a miss can still cause large delays between the time the read request is issued and the time the data become available. The peculiarities of texture mapping increase the likelihood of cache misses. In the worst case, each texel is used exactly once, and therefore each access will cause a miss, so the cache does not provide any benefit at all.
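The worst case can be demonstrated with a small fully associative model that uses the full address as the tag. The least-recently-used replacement policy and the class name are assumptions of this sketch. The text does not name a replacement policy, and the model does not attempt to represent non-blocking behavior or latency.

```python
from collections import OrderedDict


class FullyAssociativeCache:
    """Toy fully associative cache keyed by full address, with LRU replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> placeholder, oldest entry first
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Return True on a hit; on a miss, evict the least recently used line."""
        if address in self.lines:
            self.lines.move_to_end(address)
            self.hits += 1
            return True
        self.misses += 1
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)
        self.lines[address] = True
        return False
```

The alternating pattern that defeats a direct-mapped cache now hits after the first two accesses, but a stream in which every address is distinct still misses on every access regardless of capacity.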
Therefore, it is desired to provide a cache for graphic systems which can decrease the bandwidth required by texel fetches without the disadvantages of traditional caches.