1. Field of the Invention
Embodiments of the present invention relate to computer graphics processing systems in general and, in particular, to graphics processing units that predict memory fetches for texture maps for rendering three-dimensional objects.
2. Description of the Related Art
Three-dimensional (3D or 3-D) models in modern video games and computer aided design (CAD) applications use texture maps to approach a realistic appearance. A texture map, sometimes called a texture, is typically a table of colors, transparency values, material properties, surface orientations, or other features that can be digitally wrapped around or otherwise mapped to a 3D object. Despite the name, a texture map can define not only the textural appearance of an object but also its color, reflective properties, material properties, and other surface detail. In video games, the textures used for a 3D model often include a diffuse color texture, a specular (shiny) color texture, a normal map, a transparency map, and a material index, among others.
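By way of illustration only, a texture map can be thought of as a table of texels indexed by a pair of coordinates. The following is a minimal sketch under assumed conventions: the `Texture` class, its row-major layout, the nearest-texel lookup, and the normalized (u, v) coordinates in the range 0 to 1 are hypothetical choices made for this example and are not taken from any particular graphics API.

```python
from dataclasses import dataclass


@dataclass
class Texture:
    """A texture map stored as a row-major table of texels (assumed layout)."""
    width: int
    height: int
    texels: list  # width * height entries; each entry is an (r, g, b, a) tuple

    def sample_nearest(self, u: float, v: float):
        """Return the texel nearest to normalized coordinates (u, v)."""
        # Scale to texel indices and clamp so out-of-range coordinates
        # map to the nearest edge texel.
        x = min(max(int(u * self.width), 0), self.width - 1)
        y = min(max(int(v * self.height), 0), self.height - 1)
        return self.texels[y * self.width + x]


# A 2x2 diffuse color texture: red and green on the first row,
# blue and white on the second row.
diffuse = Texture(2, 2, [(255, 0, 0, 255), (0, 255, 0, 255),
                         (0, 0, 255, 255), (255, 255, 255, 255)])
```

In this sketch, a separate `Texture` instance would hold each of the diffuse, specular, normal, and transparency maps, so that texturing a single point may require one table read per map.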
In computer graphics rendering, some of the highest latency operations in a graphics processing unit (GPU) are related to memory accesses. Memory read operations can take orders of magnitude longer to conduct than arithmetic operations, such as adding, subtracting, multiplying, and dividing. For example, it is not uncommon for reading a value from memory to take 10 to 100 times as many clock cycles as adding two values together.
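The effect of this disparity can be estimated with a simple cost model. The sketch below is illustrative only: the cycle counts, the function names, and the example workload of 50 arithmetic operations and 4 texture reads per pixel are assumptions chosen for the example, not measurements of any real GPU.

```python
ALU_CYCLES = 1         # assumed cost of one add, subtract, multiply, or divide
MEM_READ_CYCLES = 100  # assumed cost of one memory read (upper end of 10 to 100x)


def pixel_cycles(alu_ops: int, mem_reads: int) -> int:
    """Total cycles for one pixel when no latency is hidden."""
    return alu_ops * ALU_CYCLES + mem_reads * MEM_READ_CYCLES


def memory_fraction(alu_ops: int, mem_reads: int) -> float:
    """Fraction of the pixel's cycles spent waiting on memory reads."""
    return mem_reads * MEM_READ_CYCLES / pixel_cycles(alu_ops, mem_reads)


# Hypothetical pixel: 50 arithmetic operations and 4 texture reads.
total = pixel_cycles(50, 4)        # 50 * 1 + 4 * 100 = 450 cycles
fraction = memory_fraction(50, 4)  # 400 / 450, roughly 0.89
```

Under these assumed numbers, nearly nine tenths of the time spent on the pixel is memory latency, even though the arithmetic operations outnumber the reads by more than ten to one.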
In the prior art, shaders, which despite the name are general rendering programs and not limited to shading effects, compensate for memory access latency by starting several rendering threads at once. Each thread is assigned a pixel to render. The thread uses the assigned pixel to look up what object (or background) the pixel corresponds to and accesses the appropriate texture maps for texturing. In particular, the memory locations of the appropriate texels within the texture maps are read for the pixel. When a thread is finished with a pixel, another thread is called for another pixel. This continues until all the pixels of the image are rendered. The number of threads that can be called at one time is limited by the number of registers or the amount of stack memory available for tracking the threads and their associated variables.
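The latency hiding described above can be sketched with a small simulation. This is a deliberately simplified model and not a description of any actual GPU scheduler: it assumes a single shared arithmetic unit, one texel fetch per pixel, fetches that overlap freely with one another, and hypothetical default costs of 100 cycles per fetch and 10 cycles of computation per pixel. The function name `render_cycles` and its parameters are invented for this example.

```python
import heapq


def render_cycles(num_pixels: int, max_threads: int,
                  fetch_latency: int = 100, compute_cycles: int = 10) -> int:
    """Cycles needed to render num_pixels with at most max_threads in flight."""
    next_pixel = 0
    ready = []  # min-heap of times at which a thread's texel fetch completes

    # Start the first batch of threads; each issues its texel fetch at time 0.
    while next_pixel < num_pixels and len(ready) < max_threads:
        heapq.heappush(ready, fetch_latency)
        next_pixel += 1

    alu_free = 0  # time at which the shared arithmetic unit is next available
    while ready:
        data_ready = heapq.heappop(ready)
        # The arithmetic unit idles if no thread has its texel data yet.
        start = max(alu_free, data_ready)
        alu_free = start + compute_cycles
        # The thread is finished; reuse its slot for another pixel.
        if next_pixel < num_pixels:
            heapq.heappush(ready, alu_free + fetch_latency)
            next_pixel += 1
    return alu_free


serial = render_cycles(4, max_threads=1)      # fetches cannot overlap
overlapped = render_cycles(4, max_threads=4)  # all four fetches overlap
```

With the assumed costs, four pixels take 440 cycles with one thread and 140 cycles with four threads, because the four fetches proceed concurrently. The `max_threads` parameter stands in for the limit imposed by the registers or stack memory available for tracking threads.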
Although graphics processors and rendering techniques have improved by leaps and bounds in the past few decades, notably in mass produced, consumer-grade video game hardware, there is an ever-present need in the art for faster and more efficient 3D rendering.