1. Field of the Invention
This invention generally relates to the field of computer graphics systems and more particularly to the generation and processing of textures in computerized graphical images.
2. Description of Related Art
To achieve realism in a three-dimensional pixel-based computer-generated graphic image, realistic surface shading must be added to the graphic image in order to simulate real-world surface features such as surface texture. A conventional manner for generating such surface characteristics is texture mapping.
Texture mapping is a relatively efficient technique for creating the appearance of a complex image without the tedium and excessive computational costs of having to directly render three-dimensional details on the surface of a three-dimensional graphic image. In particular, a source (texture) image, which is composed of individual elements (texels), is mapped through pixel composition onto the three-dimensional graphic image. For example, to generate a graphic image of an oak chair, first the structure of the chair is created. Second, a texture image of oak wood is mapped over the structural surface of the chair. To correctly integrate this oak texture image onto the chair, a predefined number of texels are interpolated to generate each pixel of the surface of the chair.
As is well known in the art, two of the more common texture mapping techniques to effectively interpolate pixels from a texture image are bi-linear filtering and tri-linear filtering. In bi-linear filtering, four texels are used to interpolate each pixel of the graphic image. In tri-linear filtering, eight texels are used to interpolate each pixel of the graphic image.
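By way of illustration only, the texel counts stated above can be sketched as follows. The function names, the list-of-rows texture layout, the clamp-to-edge addressing, and the halving of coordinates for the coarser map are all assumptions made for this sketch and are not taken from any particular system.

```python
import math

def bilinear_sample(texture, u, v):
    """Interpolate one value from the four texels surrounding (u, v).

    Assumed convention: texture is a list of rows of floats, and texel
    centers sit at integer coordinates in texel space.
    """
    height, width = len(texture), len(texture[0])
    # Clamp to the texture edge (an assumed addressing mode).
    u = min(max(u, 0.0), width - 1)
    v = min(max(v, 0.0), height - 1)
    x0, y0 = int(math.floor(u)), int(math.floor(v))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = u - x0, v - y0
    # Blend horizontally along the two rows, then vertically between them.
    top = texture[y0][x0] * (1 - fx) + texture[y0][x1] * fx
    bottom = texture[y1][x0] * (1 - fx) + texture[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def trilinear_sample(fine, coarse, u, v, frac):
    """Blend two bilinear samples (4 + 4 = 8 texels) from adjacent maps.

    Assumption: the coarse map has half the resolution of the fine map,
    so its coordinates are simply halved; frac in [0, 1] weights the
    coarse map.
    """
    a = bilinear_sample(fine, u, v)
    b = bilinear_sample(coarse, u / 2.0, v / 2.0)
    return a * (1 - frac) + b * frac
```

In this sketch each bi-linear sample reads exactly four texels, and the tri-linear sample reads eight because it performs two bi-linear samples, one in each of two texture maps of adjacent resolution.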
To avoid the potentially overwhelming bandwidth requirements attributable to these texture mapping techniques, mip (multum in parvo) mapping is often used to compress the texture image into texture maps of different `d` levels of resolution. During the scan conversion phase of rasterization, each of these mip maps of the texture map is precomputed with a specific `d` level of detail and is separately stored. Depending upon the level of detail of the pixel-based graphic image, a specific `d` level of the texture map is retrieved for texture filtering. For example, if the graphic image is displayed as a smaller image in the background of a computer-generated graphical scene, the level of detail is low, thereby allowing a lower resolution (e.g., d=4) texture map to be used. Because fewer texels are needed for the texture filtering, the lower resolution texture map results in lower system bandwidth requirements. Alternatively, if the graphic image is displayed as a full-size image in the foreground of a graphic scene, a higher resolution texture map (e.g., d=0) is used, thereby raising the overall system bandwidth requirements.
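As a minimal sketch of the mip mapping described above, the following assumes a square, power-of-two texture stored as a list of rows, a 2×2 box filter for building each coarser map, and a level selection based on the base-2 logarithm of the number of texels covered by one pixel along one axis. The names `build_mip_chain` and `select_level`, the filter, and the selection rule are illustrative assumptions only.

```python
import math

def build_mip_chain(texture):
    """Precompute maps d = 0, 1, 2, ... by averaging 2x2 texel blocks.

    Assumption: texture is square with a power-of-two side length.
    Level d = 0 is the full-resolution map.
    """
    levels = [texture]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        half = len(prev) // 2
        levels.append([
            [
                (prev[2 * y][2 * x] + prev[2 * y][2 * x + 1]
                 + prev[2 * y + 1][2 * x] + prev[2 * y + 1][2 * x + 1]) / 4.0
                for x in range(half)
            ]
            for y in range(half)
        ])
    return levels

def select_level(texels_per_pixel, num_levels):
    """Pick a level d from how many texels one pixel spans along one axis."""
    if texels_per_pixel <= 1.0:
        return 0  # foreground, full-size image: highest resolution map
    d = int(math.floor(math.log2(texels_per_pixel)))
    return min(d, num_levels - 1)  # clamp to the coarsest stored map
```

Under these assumptions, each level holds one quarter of the texels of the level before it, which is why a small background image that selects a high `d` level needs far fewer texel fetches than a foreground image at d=0.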
Unfortunately, as discussed in "Hardware Accelerated Rendering of Anti-aliasing Using a Modified A-Buffer Algorithm," Computer Graphics, 307-316, (Siggraph '97 Proceedings), which is incorporated by reference herein, high computational demands and tremendous bandwidths are unavoidable for a texture mapping system in a real-time environment, where several tens of millions of pixels must be generated every second by these complex texture mapping techniques. For example, if the primary goal of the texture mapping system is an average pixel composition rate of one pixel per clock cycle, then, without the use of caching, one memory block (e.g., four 32-bit texels in a 2×2 matrix configuration) must be retrieved every clock cycle. Based upon a clock frequency of approximately 100 MHz, the corresponding bandwidth requirement for achieving such an ideal data transfer rate is approximately 1600 Mbytes per second (MBps).
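The 1600 MBps figure follows directly from the stated block size and clock frequency. The short calculation below reproduces that arithmetic; the function name and its parameters are hypothetical and serve only to make the units explicit.

```python
def required_bandwidth_mbps(texels_per_block, bits_per_texel, clock_hz,
                            blocks_per_clock=1.0):
    """Return the uncached texel fetch bandwidth in Mbytes per second."""
    bytes_per_block = texels_per_block * bits_per_texel // 8
    return bytes_per_block * blocks_per_clock * clock_hz / 1e6

# Four 32-bit texels (16 bytes) fetched once per cycle at 100 MHz.
ideal = required_bandwidth_mbps(texels_per_block=4, bits_per_texel=32,
                                clock_hz=100e6)
```

Here one memory block is 4 × 32 bits = 16 bytes, and 16 bytes per cycle at 100 million cycles per second yields 1600 MBps.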
Due to ever-increasing user expectations of realism in computer-generated graphics, as well as the demand for economical texture mapping systems, achieving such ideal data transfer rates at an economical price is often elusive. For example, systems are commercially available that integrate a texture mapping system together with a parallel DRAM memory module sub-system design. Even though such systems offer high data transfer rates, they are economically impractical for the average consumer due to the high price of such designs.
A less expensive implementation of the texture mapping system design is integration of the texture mapping system into a graphic accelerator card, which interfaces through an accelerated graphics port/peripheral component interconnect (AGP/PCI) interface with a relatively inexpensive, general-purpose personal computer. This implementation, however, is confronted with at least two significant performance limitations.
First, due to the AGP/PCI interface only having a bandwidth of approximately 512 MBps, the ability of the texture mapping system to achieve data rates of 1600 MBps is severely curtailed. In particular, such a bandwidth bottleneck results in a relatively significant latency between the system requesting a plurality of texels (memory block) from the memory module and the system receiving back the requested memory block. Second, by issuing these memory block requests in the order that the pixel identifiers are generated by the scan module, the texture mapping system often must page-switch DRAM pages within the memory module to obtain the necessary memory blocks, thereby injecting additional latency into the overall texture mapping system.
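To give a rough sense of the size of this bottleneck, the following sketch estimates what fraction of memory block requests would have to be satisfied without crossing the interface for the ideal rate to be sustained. It assumes, purely for illustration, that every request not satisfied locally transfers one full memory block across the interface and that latency and page-switch costs are ignored; `min_local_fraction` is a hypothetical name.

```python
def min_local_fraction(required_mbps, interface_mbps):
    """Fraction of block requests that must not cross the interface.

    Simplifying assumptions: each non-local request moves one full block
    over the interface; latency and DRAM page-switch costs are ignored.
    """
    if interface_mbps >= required_mbps:
        return 0.0
    return 1.0 - interface_mbps / required_mbps

# Approximately 1600 MBps demanded against approximately 512 MBps available.
fraction = min_local_fraction(1600.0, 512.0)
```

Under these assumptions, the interface can carry only about 32% of the ideal traffic, so roughly 68% of the memory block requests would need to be served from somewhere other than the memory module.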
To address such performance issues, cost-sensitive texture mapping systems employ a cache module, which is located within the system, in an attempt to minimize the limitations of the AGP/PCI interface. By retrieving memory blocks prior to needing any of the texels contained within the memory block, the latency attributable to retrieving these texels through the AGP/PCI interface is reduced.
The performance advantages of this local cache design, however, are dependent upon the success (`hit`) rate of storing the needed memory blocks within the cache module. For example, when rendering certain shapes (such as long, narrow triangles) as texture primitives, poor memory block locality issues can emerge. Because cost constraints require the cache module to be of a limited storage size, the memory block associated with the beginning of a scan conversion span is no longer stored within the cache module by the time the end of that span is reached. This situation results not only in an increase in the failure (miss) rate to locate a specific memory block within the cache module, but also in an increase in the need to excessively swap memory blocks into and out of the cache module. To minimize this locality problem, texture mapping systems require accurate synchronization between the memory module and the cache module without dramatically increasing the cost of the system.
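The locality problem described above can be illustrated with a small simulation. The sketch assumes a least-recently-used replacement policy and assumes that successive scan conversion spans of a long, narrow primitive touch the same sequence of memory blocks; neither assumption is taken from any particular cache module, and the names `simulate_lru` and `span_requests` are hypothetical.

```python
from collections import OrderedDict

def simulate_lru(block_requests, capacity):
    """Count hits and misses for a cache holding `capacity` memory blocks.

    Assumption: least-recently-used replacement.
    """
    cache = OrderedDict()
    hits = misses = 0
    for block in block_requests:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            misses += 1
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return hits, misses

def span_requests(blocks_per_span, num_spans):
    """Assumed access pattern: every span touches blocks 0..N-1 in order."""
    return [b for _ in range(num_spans) for b in range(blocks_per_span)]

# A span that touches 8 blocks, repeated for 4 spans.
small_cache = simulate_lru(span_requests(8, 4), capacity=4)
large_cache = simulate_lru(span_requests(8, 4), capacity=8)
```

In this model the four-block cache never hits, because the block fetched at the beginning of a span has already been evicted by the time the next span needs it, whereas the eight-block cache misses only on the first span. The sketch is intended only to show why a limited cache size and long spans interact poorly.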
What is needed is a texture mapping system and method that can, in a cost-effective manner, transmit as many texel requests as early as possible, synchronize the processing of texel requests with the corresponding texels, and avoid bandwidth limitations.