The rendering of three-dimensional (3D) graphical images is of interest in a variety of electronic games and other applications. Rendering is the general term that describes the overall multi-step process of transitioning from a database representation of a 3D object to a two-dimensional projection of the object onto a viewing surface.
The rendering process involves a number of steps, including, for example, setting up a polygon model that contains the information subsequently required by the shading/texturing processes, applying linear transformations to the polygon mesh model, culling back-facing polygons, clipping the polygons against a view volume, scan converting/rasterizing the polygons to a pixel coordinate set, and shading/lighting the individual pixels using interpolated or incremental shading techniques.
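A few of the steps listed above (linear transformation, back-face culling, and mapping to pixel coordinates) can be sketched as follows. This is a minimal illustration, not a description of any particular renderer. It assumes an orthographic view along the negative z axis, counter-clockwise winding for front-facing triangles, and vertex coordinates normalized to the range [-1, 1]. Clipping and shading are omitted, and all function names are hypothetical.

```python
# Illustrative sketch only: the function names, the orthographic view along -z,
# and the counter-clockwise front-face convention are assumptions.

def transform(vertex, scale, offset):
    """Apply a simple linear transformation: uniform scale, then translate."""
    return tuple(scale * c + o for c, o in zip(vertex, offset))


def is_back_facing(tri):
    """Return True if the triangle faces away from a viewer looking down -z.

    With an orthographic view, only the z component of the face normal is
    needed; it equals twice the signed area of the triangle in the x-y plane.
    """
    (x0, y0, _), (x1, y1, _), (x2, y2, _) = tri
    normal_z = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    return normal_z <= 0


def to_pixel(vertex, width, height):
    """Map normalized x, y in [-1, 1] to integer pixel coordinates.

    Pixel row 0 is the top of the viewing surface, so y is flipped.
    """
    x, y, _ = vertex
    px = int((x + 1.0) * 0.5 * (width - 1))
    py = int((1.0 - y) * 0.5 * (height - 1))
    return px, py


def visible_pixel_triangles(triangles, scale, offset, width, height):
    """Transform each triangle, drop back-facing ones, and map to pixels."""
    result = []
    for tri in triangles:
        moved = [transform(v, scale, offset) for v in tri]
        if is_back_facing(moved):
            continue  # culled: never reaches rasterization
        result.append([to_pixel(v, width, height) for v in moved])
    return result
```

Culling before rasterization means that a back-facing polygon generates no pixel work in the later steps.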
Graphics Processing Units (GPUs) are specialized integrated circuit devices that are commonly used in graphics systems to accelerate the performance of a 3D rendering application. GPUs are commonly used in conjunction with a central processing unit (CPU) to generate 3D images for one or more applications executing on a computer system. Modern GPUs typically utilize a graphics pipeline for processing data.
Prior art FIG. 1 shows a diagram depicting the various stages of a traditional prior art pipeline 100. The pipeline 100 is a conventional “deep” pipeline having stages dedicated to performing specific functions. A transform stage 105 performs geometrical calculations of primitives and may also perform a clipping operation. A setup/raster stage 110 rasterizes the primitives. A texture address stage 115 and a texture fetch stage 120 are utilized for texture mapping. A fog stage 130 implements a fog algorithm. An alpha test stage 135 performs an alpha test. A depth test stage 140 performs a depth test for culling occluded pixels. An alpha blend stage 145 performs an alpha blend color combination algorithm. A memory write stage 150 writes the output of the pipeline.
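The back end of such a pipeline (fog, alpha test, depth test, alpha blend, memory write) can be sketched in software as a fixed sequence of per-pixel operations. The sketch below is illustrative only. The fragment fields, the fog blend formula, the "discard when alpha is at or below a reference" rule, and the "smaller z is nearer" depth convention are all assumptions chosen for the example, not details taken from FIG. 1. The texture stages are omitted, so the incoming color is assumed to be already textured.

```python
# Illustrative sketch only: field names, default values, and the specific
# fog, alpha-test, and depth-test conventions are assumptions.

def fog(color, fog_color, fog_factor):
    """Fog stage: blend toward the fog color (fog_factor 1.0 means no fog)."""
    return tuple(fog_factor * c + (1.0 - fog_factor) * f
                 for c, f in zip(color, fog_color))


def process_fragment(frag, framebuffer, depthbuffer,
                     fog_color=(0.5, 0.5, 0.5), alpha_ref=0.0):
    """Run one pixel fragment through the stages in fixed order.

    Returns True if the fragment was written, False if it was discarded.
    """
    x, y = frag["pos"]

    # Fog stage.
    color = fog(frag["color"], fog_color, frag["fog_factor"])

    # Alpha test stage: discard fragments at or below the reference alpha.
    if frag["alpha"] <= alpha_ref:
        return False

    # Depth test stage: discard fragments occluded by a nearer stored depth.
    if frag["z"] >= depthbuffer[y][x]:
        return False

    # Alpha blend stage: combine the source color with the stored color.
    a = frag["alpha"]
    dst = framebuffer[y][x]
    blended = tuple(a * s + (1.0 - a) * d for s, d in zip(color, dst))

    # Memory write stage.
    framebuffer[y][x] = blended
    depthbuffer[y][x] = frag["z"]
    return True
```

In this software sketch a discarded fragment simply returns early. In a hardware deep pipeline, by contrast, each stage is a dedicated circuit that exists whether or not a given fragment needs it.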
The stages of the traditional GPU pipeline architecture illustrated in FIG. 1 are typically optimized for high-speed rendering operations (e.g., texturing, lighting, shading, etc.) using a widely implemented graphics programming API (application programming interface), such as OpenGL™, Direct3D™, and the like. The architecture of the pipeline 100 is configured as a multi-stage deep pipeline architecture in order to maximize the overall rendering throughput of the pipeline. Generally, deep pipeline architectures have sufficient data throughput (e.g., pixel fill rate, etc.) to implement fast, high-quality rendering of even complex scenes.
There is an increasing interest in utilizing 3D graphics in portable handheld devices where cost and power consumption are important design requirements. Such devices include, for example, wireless phones, personal digital assistants (PDAs), and the like. However, the traditional deep pipeline architecture requires a significant chip area, resulting in greater cost than desired. Additionally, a deep pipeline consumes significant power, even if the stages are performing comparatively little processing. This is because many of the stages consume about the same amount of power regardless of whether they are processing pixels.
As a result of cost and power considerations, the conventional deep pipeline architecture illustrated in FIG. 1 is unsuitable for many graphics applications, such as implementing 3D games on wireless phones and PDAs. For example, such conventional deep pipelines are configured to compute the various parameters required to render the pixels of an object using multiple standardized, high precision functions. Typical per-pixel parameters include, for example, texture coordinates, colors, depth values (e.g., “z”), level of detail parameters, and the like. The functions are implemented such that they generate high precision results even in those circumstances where such precision is redundant or unnecessary.
The costs of such precision can include an expansion in the amount of data that must be pushed down the pipeline architecture, an increased number of transistors necessary to compute all parameter cases with the specified precision, an increased amount of circuit switching activity, and the like. Each of these costs runs counter to the objective of implementing efficient, high-performance 3D rendering on a portable handheld device. Therefore, what is desired is a processor architecture suitable for graphics processing applications but with reduced power and size requirements.