The rendering of three-dimensional graphical images is of interest in a variety of electronic games and other applications. Rendering is the general term that describes the overall multi-step process of transitioning from a database representation of a three-dimensional object to a two-dimensional projection of the object onto a viewing surface.
The rendering process involves a number of steps, such as, for example, setting up a polygon model that contains the information subsequently required by the shading/texturing processes, applying linear transformations to the polygon mesh model, culling back-facing polygons, clipping the polygons against a view volume, scan converting/rasterizing the polygons to a pixel coordinate set, and shading/lighting the individual pixels using interpolated or incremental shading techniques.
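Two of the steps listed above, projection onto a viewing surface and culling of back-facing polygons, can be illustrated with a minimal sketch. The function names, the pinhole-style projection, and the convention that counter-clockwise triangles are front-facing are assumptions made for illustration only; they are not taken from any particular rendering system.

```python
# Illustrative sketch only: names and conventions here are assumptions.

def project(point, focal_length=1.0):
    """Perspective-project a 3-D point (x, y, z) with z > 0 onto the plane z = focal_length."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


def signed_area(tri):
    """Signed area of a 2-D triangle; positive when the vertices run counter-clockwise."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    return 0.5 * ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))


def cull_back_faces(triangles_3d, focal_length=1.0):
    """Project each triangle and keep only those assumed to be front-facing."""
    visible = []
    for tri in triangles_3d:
        projected = [project(v, focal_length) for v in tri]
        # Assumed convention: counter-clockwise winding means front-facing.
        if signed_area(projected) > 0:
            visible.append(projected)
    return visible
```

Reversing the vertex order of a triangle flips the sign of its area, so the same geometry viewed from behind is discarded before rasterization.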
Graphics Processing Units (GPUs) are specialized integrated circuit devices that are commonly used in graphics systems to accelerate the performance of a 3-D rendering application. GPUs are commonly used in conjunction with a central processing unit (CPU) to generate three-dimensional images for one or more applications executing on a computer system. Modern GPUs typically utilize a graphics pipeline for processing data.
Prior art FIG. 1 shows a diagram depicting the various stages of a traditional prior art pipeline 100. The pipeline 100 is a conventional “deep” pipeline having stages dedicated to performing specific functions. A transform stage 105 performs geometrical calculations of primitives and may perform a clipping operation. A setup/raster stage 110 rasterizes the primitives. A texture address stage 115 and a texture fetch stage 120 are utilized for texture mapping. A fog stage 130 implements a fog algorithm. An alpha test stage 135 performs an alpha test. A depth test stage 140 performs a depth test for culling occluded pixels. An alpha blend stage 145 performs an alpha blend color combination algorithm. A memory write stage 150 writes the output of the pipeline.
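The per-pixel portion of such a deep pipeline can be modeled in software as a fixed sequence of single-purpose stages, each of which passes a pixel along or discards it. The following is a minimal sketch under stated assumptions: the dictionary-based pixel representation, the stage function names, the alpha threshold, and the fog color are all hypothetical and do not describe the circuitry of FIG. 1.

```python
# Illustrative software model only: data layout, names, and constants are assumptions.

def fog(px, fog_color=(0.5, 0.5, 0.5)):
    """Blend the pixel color toward the fog color; fog factor 1.0 means no fog."""
    f = px["fog"]
    color = tuple(f * c + (1.0 - f) * fc for c, fc in zip(px["color"], fog_color))
    return dict(px, color=color)


def alpha_test(px, threshold=0.1):
    """Discard pixels whose alpha does not exceed the threshold."""
    return px if px["alpha"] > threshold else None


def make_depth_test(depth_buffer):
    def depth_test(px):
        """Discard pixels that are not nearer than the stored depth."""
        key = (px["x"], px["y"])
        if px["z"] < depth_buffer.get(key, float("inf")):
            depth_buffer[key] = px["z"]
            return px
        return None
    return depth_test


def make_alpha_blend(frame_buffer, background=(0.0, 0.0, 0.0)):
    def alpha_blend(px):
        """Combine the pixel color with the color already in the frame buffer."""
        dst = frame_buffer.get((px["x"], px["y"]), background)
        a = px["alpha"]
        color = tuple(a * s + (1.0 - a) * d for s, d in zip(px["color"], dst))
        return dict(px, color=color)
    return alpha_blend


def make_memory_write(frame_buffer):
    def memory_write(px):
        """Write the final color to the frame buffer."""
        frame_buffer[(px["x"], px["y"])] = px["color"]
        return px
    return memory_write


def run_pipeline(pixels):
    """Push each pixel through every stage in order; stop early if a stage discards it."""
    depth_buffer, frame_buffer = {}, {}
    stages = [
        fog,
        alpha_test,
        make_depth_test(depth_buffer),
        make_alpha_blend(frame_buffer),
        make_memory_write(frame_buffer),
    ]
    for px in pixels:
        for stage in stages:
            px = stage(px)
            if px is None:
                break
    return frame_buffer
```

In this model every stage exists whether or not a given pixel needs it; a pixel with no fog and full opacity still traverses the fog and alpha blend stages, which mirrors the dedicated-stage character of a deep pipeline.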
The stages of the traditional GPU pipeline architecture illustrated in FIG. 1 are typically optimized for high-speed rendering operations (e.g., texturing, lighting, shading, etc.) using a widely implemented graphics programming API (application programming interface), such as, for example, the OpenGL™ graphics language, Direct3D™, and the like. The architecture of the pipeline 100 is configured as a multi-stage deep pipeline architecture in order to maximize the overall rendering throughput of the pipeline. Generally, deep pipeline architectures have sufficient data throughput (e.g., pixel fill rate, etc.) to implement fast, high quality rendering of even complex scenes.
There is an increasing interest in utilizing three-dimensional graphics in portable handheld devices where cost and power consumption are important design requirements. Such devices include, for example, wireless phones, personal digital assistants (PDAs), and the like. However, the traditional deep pipeline architecture requires a significant chip area, resulting in greater cost than desired. Additionally, a deep pipeline consumes significant power, even if the stages are performing comparatively little processing. This is because many of the stages consume about the same amount of power regardless of whether they are processing pixels.
As a result of cost and power considerations, the conventional deep pipeline architecture illustrated in FIG. 1 is unsuitable for many graphics applications, such as implementing three-dimensional games on wireless phones and PDAs. Therefore, what is desired is a processor architecture suitable for graphics processing applications but with reduced power and size requirements.
Prior to the traditional start of the graphics architecture pipeline, e.g., the rasterizing module, the power draw and the number of gates utilized by the upcoming pipeline are not considered. However, failing to consider the power and gates used in handheld computing devices will deleteriously affect their overall operation. Specifically, the power and gate usage factors will result in extremely poor operation of the handheld device and any graphics shown thereon. One of the major contributors to power draw and gate utilization is the pixel packet size, that is, the amount of data per pixel passing through the pipeline, which affects, for instance, the bus size and also the complexity of the circuitry of each pipeline stage. In conventional applications, power consumption is unimportant and, therefore, the pixel packet may be extremely large. Moreover, portions of the graphics pipeline may remain idle for extended periods while the large pixel packet is processed at different stages. Due to the fixed-function nature of the raster stage 110, significant power loss and gate usage may occur during the pixel data processing.
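The relationship between pixel packet size and bus usage can be made concrete with a short arithmetic sketch. The field names and bit widths below are hypothetical values chosen only for illustration; they are not taken from any actual pipeline.

```python
# Illustrative arithmetic only: field names and bit widths are assumed values.

def packet_bits(fields):
    """Total size of a pixel packet, given a mapping of field name to width in bits."""
    return sum(fields.values())


def bus_transfers(bits, bus_width_bits):
    """Number of bus transfers needed to move one packet (ceiling division)."""
    return -(-bits // bus_width_bits)


# Hypothetical large packet carrying full-precision data for every field.
full_packet = {"color_rgba": 32, "depth": 24, "tex_coords": 64, "fog": 8}

# Hypothetical reduced packet carrying fewer and narrower fields.
lean_packet = {"color_rgba": 16, "depth": 16, "tex_coords": 32}
```

Under these assumed widths, the large packet is 128 bits and needs four transfers on a 32-bit bus, while the reduced packet is 64 bits and needs two; alternatively, moving the large packet in a single transfer would require a bus four times as wide.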