Recent advances in computer performance have enabled graphic systems to provide more realistic graphical images using personal computers, home video game computers, handheld devices, and the like. In such graphic systems, a number of procedures are executed to “render” or draw graphic primitives to the screen of the system. A “graphic primitive” is a basic component of a graphic picture, such as a point, line, polygon, or the like. Rendered images are formed with combinations of these graphic primitives. Many procedures may be utilized to perform 3-D graphics rendering.
Specialized graphics processing units (e.g., GPUs, etc.) have been developed to optimize the computations required in executing the graphics rendering procedures. The GPUs are configured for high-speed operation and typically incorporate one or more rendering pipelines. Each pipeline includes a number of hardware-based functional units that are optimized for high-speed execution of graphics instructions/data, where the instructions/data are fed into the front end of the pipeline and the computed results emerge at the back end of the pipeline. The hardware-based functional units, cache memories, firmware, and the like, of the GPU are optimized to operate on the low-level graphics primitives (e.g., comprising “points”, “lines”, “triangles”, etc.) and produce real-time rendered 3-D images.
The real-time rendered 3-D images are generated using raster display technology. Raster display technology is widely used in computer graphics systems, and generally refers to the mechanism by which the grid of multiple pixels comprising an image are influenced by the graphics primitives. For each primitive, a typical rasterization system generally steps from pixel to pixel and determines whether or not to “render,” or write a given pixel into a frame buffer or pixel map, as per the contribution of the primitive. This, in turn, determines how to write the data to the display buffer representing each pixel.
Various traversal algorithms and various rasterization methods have been developed for computing from a graphics primitive based description to a pixel based description (e.g., rasterizing pixel to pixel per primitive) in a way such that all pixels within the primitives comprising a given 3-D scene are covered. For example, some solutions involve generating the pixels in a unidirectional manner. Such traditional unidirectional solutions involve generating the pixels row-by-row in a constant direction. This requires that the sequence shift across the primitive to a starting location on a first side of the primitive upon finishing at a location on an opposite side of the primitive.
Other traditional methods involve utilizing per pixel evaluation techniques to closely evaluate each of the pixels comprising a display and determine which pixels are covered by which primitives. The per pixel evaluation involves scanning across the pixels of a display to determine which pixels are touched/covered by the edges of a graphics primitive.
Once the primitives are rasterized into their constituent pixels, these pixels are then processed in pipeline stages subsequent to the rasterization stage where the rendering operations are performed. Generally, these rendering operations assign a color to each of the pixels of a display in accordance with the degree of coverage of the primitives comprising a scene. The per pixel color is also determined in accordance with texture map information that is assigned to the primitives, lighting information, and the like.
Various traversal algorithms have been developed for moving from pixel to pixel in a way such that all pixels within the primitive are covered. For example, some solutions involve generating the pixels in a unidirectional manner. Such traditional unidirectional solutions involve generating the pixels row-by-row in a constant direction. This requires that the sequence shift across the primitive to a starting location on a first side of the primitive upon finishing at a location on an opposite side of the primitive. Each time this shift is executed, pixels or texture values are stored which were not positioned adjacent to pixels or texture values processed immediately beforehand. Therefore, such distant pixels or texture values have a greater chance of belonging to different memory access blocks, making such access inefficient.
Less efficient access imposes a number of performance penalties on the graphics rendering system. Operating on distant pixels or distant texture values leads to a large number of pixel data fetches or texture data fetches from the frame buffer memory. This causes a correspondingly large amount of frame buffer memory bandwidth traffic. The excess frame buffer memory traffic contends with other graphics function units that need to access the frame buffer memory. The performance penalty is even more severe in those cases where anti-aliasing is implemented. For example, many anti-aliasing techniques utilize a plurality of subpixel sample points in order to more accurately determine fragment coverage per pixel. What is particularly problematic is the fact that the multiple number of sample points per pixel can greatly increase the amount of excess frame buffer memory traffic.
Thus, a need exists for a rasterization process that can ensure needed graphics rendering data (e.g., texture values, normal maps, etc.) can be maintained in memory for an efficient access by the GPU.