The rendering of three-dimensional graphical images is of interest in a variety of electronic games and other applications. Rendering is the general term that describes the overall multi-step process of transitioning from a database representation of a three-dimensional object to a two-dimensional projection of the object onto a viewing surface.
The rendering process involves a number of steps, for example, setting up a polygon model that contains the information subsequently required by shading/texturing processes, applying linear transformations to the polygon mesh model, culling back-facing polygons, clipping the polygons against a view volume, scan converting/rasterizing the polygons to a pixel coordinate set, and shading/lighting the individual pixels using interpolated or incremental shading techniques.
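By way of illustration only, several of the steps named above (transformation, back-face culling, and rasterization to a pixel coordinate set) can be sketched for a single two-dimensional triangle. The function names, the counter-clockwise front-face convention, and the pixel-center sampling rule are assumptions made for this sketch and are not taken from any particular implementation. Clipping is approximated by limiting rasterization to the raster bounds, and shading is omitted.

```python
# Illustrative sketch only: one triangle passed through a few rendering steps.
# All names and conventions here are hypothetical.

def transform(vertex, scale, offset):
    """Apply a uniform scale followed by a translation to one vertex."""
    return tuple(scale * c + o for c, o in zip(vertex, offset))

def signed_area_2d(a, b, c):
    """Twice the signed area of triangle (a, b, c); positive if counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])

def is_back_facing(tri):
    """Assumed convention: counter-clockwise winding is front facing."""
    return signed_area_2d(*tri) <= 0

def rasterize(tri, width, height):
    """Return pixel coordinates whose centers lie inside a front-facing triangle.

    Restricting the loops to width x height stands in for clipping.
    """
    a, b, c = tri
    pixels = []
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            inside = (signed_area_2d(a, b, p) >= 0 and
                      signed_area_2d(b, c, p) >= 0 and
                      signed_area_2d(c, a, p) >= 0)
            if inside:
                pixels.append((x, y))
    return pixels

def render_triangle(tri, scale, offset, width, height):
    """Transform, cull, and rasterize one triangle; returns covered pixels."""
    moved = [transform(v, scale, offset) for v in tri]
    if is_back_facing(moved):
        return []  # culled before rasterization
    return rasterize(moved, width, height)
```

For instance, `render_triangle([(0, 0), (2, 0), (0, 2)], 2, (0, 0), 4, 4)` scales the triangle to span a 4 by 4 raster and returns the covered pixels, while the same triangle with two vertices swapped is culled and returns an empty list.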
Graphics Processing Units (GPUs) are specialized integrated circuit devices that are commonly used in graphics systems to accelerate the performance of a 3-D rendering application. GPUs are commonly used in conjunction with a central processing unit (CPU) to generate three-dimensional images for one or more applications executing on a computer system. Modern GPUs typically utilize a graphics pipeline for processing data.
Prior art FIG. 1 shows a diagram depicting the various stages of a traditional prior art pipeline 100. The pipeline 100 is a conventional “deep” pipeline having stages dedicated to performing specific functions. A transform stage 105 performs geometrical calculations on primitives and may perform a clipping operation. A setup/raster stage 110 rasterizes the primitives. A texture address stage 115 and a texture fetch stage 120 are utilized for texture mapping. A fog stage 130 implements a fog algorithm. An alpha test stage 135 performs an alpha test. A depth test stage 140 performs a depth test for culling occluded pixels. An alpha blend stage 145 performs an alpha blend color combination algorithm. A memory write stage 150 writes the output of the pipeline.
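As a rough illustration of the per-pixel stages that follow rasterization in the pipeline 100, the fog, alpha test, depth test, alpha blend, and memory write operations can be sketched as a chain of functions applied to one fragment. The sketch is a software analogy only. The fragment layout, the linear fog formula, the "greater than threshold" alpha test, and the "less than" depth comparison are assumptions chosen for the example, and the texture stages are omitted.

```python
# Illustrative sketch only: per-fragment stages in the order shown in FIG. 1.
# The data layout and comparison rules are hypothetical.

def fog(color, fog_color, fog_factor):
    """Blend toward the fog color; a fog_factor of 1.0 means no fog."""
    return tuple(fog_factor * c + (1.0 - fog_factor) * f
                 for c, f in zip(color, fog_color))

def alpha_test(alpha, threshold):
    """Assumed rule: keep fragments whose alpha exceeds the threshold."""
    return alpha > threshold

def depth_test(z, stored_z):
    """Assumed rule: keep fragments nearer than the stored depth."""
    return z < stored_z

def alpha_blend(src, alpha, dst):
    """Combine source and destination colors weighted by source alpha."""
    return tuple(alpha * s + (1.0 - alpha) * d for s, d in zip(src, dst))

def process_fragment(frag, framebuffer, depthbuffer,
                     fog_color=(0.5, 0.5, 0.5), alpha_threshold=0.0):
    """Run one fragment through the stages; return True if it was written."""
    x, y = frag["xy"]
    color = fog(frag["color"], fog_color, frag["fog_factor"])
    if not alpha_test(frag["alpha"], alpha_threshold):
        return False  # rejected by the alpha test
    if not depth_test(frag["z"], depthbuffer[y][x]):
        return False  # occluded pixel culled by the depth test
    # Memory write: store the blended color and the new depth.
    framebuffer[y][x] = alpha_blend(color, frag["alpha"], framebuffer[y][x])
    depthbuffer[y][x] = frag["z"]
    return True
```

Note that in this software analogy a rejected fragment skips the later functions entirely, whereas in a fixed hardware pipeline the corresponding stages remain present whether or not they have pixels to process.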
The stages of the traditional GPU pipeline architecture illustrated in FIG. 1 are typically optimized for high-speed rendering operations (e.g., texturing, lighting, shading, etc.) using a widely implemented graphics programming API (application programming interface), such as, for example, the OpenGL™ graphics language, Direct3D™, and the like. The architecture of the pipeline 100 is configured as a multi-stage deep pipeline architecture in order to maximize the overall rendering throughput of the pipeline. Generally, deep pipeline architectures have sufficient data throughput (e.g., pixel fill rate, etc.) to implement fast, high quality rendering of even complex scenes.
There is an increasing interest in utilizing three-dimensional graphics in portable handheld devices where cost and power consumption are important design requirements. Such devices include, for example, wireless phones, personal digital assistants (PDAs), and the like. However, the traditional deep pipeline architecture requires a significant chip area, resulting in greater cost than desired. Additionally, a deep pipeline consumes significant power, even if the stages are performing comparatively little processing. This is because many of the stages consume about the same amount of power regardless of whether they are processing pixels.
As a result of cost and power considerations, the conventional deep pipeline architecture illustrated in FIG. 1 is unsuitable for many graphics applications, such as implementing three-dimensional games on wireless phones and PDAs. Therefore, what is desired is a processor architecture suitable for graphics processing applications but with reduced power and size requirements.
Prior to the traditional start of the graphics architecture pipeline, e.g., the rasterizing module, the power draw and the number of gates utilized by the upcoming pipeline are not considered. With respect to handheld computing devices, failing to consider the power and gates used will have a deleterious effect on overall operation. Specifically, the power and gate usage factors will result in extremely poor operation of the handheld device and of any graphics shown thereon. One of the major contributors to power draw and gate utilization is color in an environment. In some cases, the graphics architecture may utilize 8-, 16-, or even 32-bit color throughout the pipeline to produce the desired graphics. In a device with limited power, e.g., a handheld computing device, a pipeline filled with that many bits of color will require many clock cycles to process and resolve the color value.
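To give a sense of scale, the amount of color data carried for one frame at each of the color depths mentioned above can be computed directly. The 320 by 240 display size is a hypothetical figure chosen for the example, and the color depths are treated as bits per pixel, which is an assumption. The calculation shows data volume only and does not model clock cycles, gate count, or power.

```python
# Illustrative arithmetic only: color data per frame at several color depths.
# The display size is a hypothetical example value.

def color_bytes_per_frame(width, height, bits_per_pixel):
    """Bytes of color data needed to hold one full frame."""
    return width * height * bits_per_pixel // 8

WIDTH, HEIGHT = 320, 240  # assumed handheld display size

for bits in (8, 16, 32):
    size = color_bytes_per_frame(WIDTH, HEIGHT, bits)
    print(f"{bits}-bit color: {size} bytes per frame")
```

Under these assumptions, moving from 8-bit to 32-bit color quadruples the color data that must be moved through the pipeline for each frame, from 76,800 bytes to 307,200 bytes.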