Graphics processing is an important feature of modern high performance computing systems. In graphics processing, mathematical procedures are implemented to render, or draw, graphics primitives, e.g., a triangle or a rectangle, on a display to produce desired visual images. Real time graphics processing is based on the high speed processing of graphics primitives to produce visually pleasing moving images. Early graphics systems were limited to displaying image objects composed of graphics primitives having smooth surfaces. That is, visual textures, bumps, scratches, or other surface features were not modeled in the graphics primitives. To enhance image quality, texture mapping of real world attributes was introduced. In general, texture mapping is the mapping of an image onto a graphics primitive surface to create the appearance of a complex image without the high computational costs associated with rendering actual three dimensional details of an object.
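By way of illustration only, texture mapping as described above can be sketched as a lookup of an image value at a normalized surface coordinate. The function name `sample_texture`, the nearest-neighbor lookup rule, and the `CHECKER` image are assumptions made for this sketch and are not taken from any particular graphics system.

```python
def sample_texture(texture, u, v):
    """Nearest-neighbor lookup of a texel at normalized coordinates (u, v).

    `texture` is a list of rows, each row a list of color tuples.
    u and v are expected in [0, 1]; values outside are clamped (an assumption).
    """
    height = len(texture)
    width = len(texture[0])
    # Scale the normalized coordinate to a texel index, clamping so that
    # u == 1.0 or v == 1.0 still maps to the last texel.
    x = max(0, min(int(u * width), width - 1))
    y = max(0, min(int(v * height), height - 1))
    return texture[y][x]


# Hypothetical 2x2 checkerboard image used as the texture.
CHECKER = [
    [(255, 255, 255), (0, 0, 0)],
    [(0, 0, 0), (255, 255, 255)],
]
```

A smooth primitive can then take its per-pixel color from `sample_texture` instead of modeling the surface detail geometrically, which is the cost saving the paragraph above refers to.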
Graphics processing is typically performed using graphics application program interfaces (APIs) that provide a standard software interface that can be run on multiple platforms, operating systems, and hardware. Examples of graphics APIs include the Open Graphics Library (OpenGL®) and D3D™. In general, such graphics APIs include a predetermined, standardized set of commands that are executed by associated hardware. For example, in a computer system that supports the OpenGL® standard, the operating system and application software programs can make calls according to that standard without knowing any of the specifics regarding the system hardware. Application writers can use graphics APIs to design the visual aspects of their applications without concern as to how their commands will be implemented.
Graphics APIs are particularly beneficial when they are supported by dedicated graphics hardware. In fact, high speed processing of graphical images is often performed using special graphics processing units (GPUs) that are fabricated on semiconductor substrates. Beneficially, a GPU can be designed and used to rapidly and accurately process graphics commands with little impact on other system resources.
FIG. 1 illustrates a simplified block diagram of a graphics system 100 that includes a graphics processing unit 102. As shown, that graphics processing unit 102 has a host interface/front end 104. The host interface/front end 104 receives raw graphics data from central processing hardware 103 that is running an application program stored in memory 105. The host interface/front end 104 buffers input information and supplies that information to a geometry engine 106. The geometry engine 106 produces, scales, rotates, and projects three dimensional vertices of graphics primitives in “model” coordinates into two dimensional frame buffer coordinates. Typically, triangles are used as graphics primitives for three dimensional objects, but rectangles are often used for two dimensional objects (such as text displays).
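The scale, rotate, and project steps attributed to the geometry engine 106 can be sketched as follows. This is a minimal illustration under stated assumptions, not the engine's actual implementation: the function name `transform_vertex`, the rotation about the vertical axis only, the simple perspective divide, and the viewport mapping are all choices made for the example.

```python
import math


def transform_vertex(vertex, scale, angle_y, camera_distance, width, height):
    """Map one model-space vertex (x, y, z) to frame buffer coordinates.

    Returns (pixel_x, pixel_y, depth). All parameters are hypothetical
    names chosen for this sketch.
    """
    # Scale the vertex uniformly in model space.
    x, y, z = (c * scale for c in vertex)

    # Rotate about the vertical (y) axis by angle_y radians.
    cos_a, sin_a = math.cos(angle_y), math.sin(angle_y)
    xr = x * cos_a + z * sin_a
    zr = -x * sin_a + z * cos_a

    # Place the object in front of the viewer; depth must stay positive.
    depth = zr + camera_distance
    if depth <= 0:
        raise ValueError("vertex is at or behind the viewer")

    # Perspective projection: farther vertices move toward the center.
    xp = xr / depth
    yp = y / depth

    # Viewport mapping from [-1, 1] to pixel coordinates (y grows downward).
    pixel_x = (xp + 1.0) * 0.5 * width
    pixel_y = (1.0 - yp) * 0.5 * height
    return (pixel_x, pixel_y, depth)
```

Applying this function to the three vertices of a triangle yields the two dimensional frame buffer coordinates that the next stage consumes; the depth value is kept so that later stages can interpolate and compare it.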
The two dimensional frame buffer coordinates of the vertices of the graphics primitives from the geometry engine 106 are applied to a rasterizer 108. The rasterizer 108 determines the positions of all of the pixels within the graphics primitives. This is typically performed along raster (horizontal) lines that extend between the lines that define the graphics primitives. The rasterizer 108 also generates interpolated colors, depths, and texture coordinates for each pixel. The output of the rasterizer 108 is referred to as rasterized pixel data.
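The following sketch illustrates how pixel positions inside a triangle might be found row by row and how a per-vertex value (a color channel, a depth, or a texture coordinate) might be interpolated at each pixel. It is offered as an assumption-laden example: rather than walking spans between the triangle's edges as described above, it substitutes a simpler bounding-box scan with edge-function (barycentric) tests, and the names `edge` and `rasterize_triangle` are hypothetical.

```python
import math


def edge(a, b, p):
    """Signed area term: positive on one side of line a->b, negative on the other."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize_triangle(v0, v1, v2, attrs):
    """Return a list of (x, y, value) for pixels whose centers lie in the triangle.

    v0, v1, v2 are (x, y) frame buffer coordinates; attrs holds one scalar
    per vertex, interpolated with barycentric weights.
    """
    area = edge(v0, v1, v2)
    if area == 0:
        return []  # Degenerate triangle covers no pixels.

    xs = (v0[0], v1[0], v2[0])
    ys = (v0[1], v1[1], v2[1])
    fragments = []
    # Visit each raster (horizontal) line of the bounding box in turn.
    for y in range(math.floor(min(ys)), math.ceil(max(ys))):
        for x in range(math.floor(min(xs)), math.ceil(max(xs))):
            p = (x + 0.5, y + 0.5)  # Sample at the pixel center.
            # Dividing by the signed area makes the weights independent
            # of the vertex winding order.
            w0 = edge(v1, v2, p) / area
            w1 = edge(v2, v0, p) / area
            w2 = edge(v0, v1, p) / area
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                value = w0 * attrs[0] + w1 * attrs[1] + w2 * attrs[2]
                fragments.append((x, y, value))
    return fragments
```

Each returned tuple corresponds to one item of rasterized pixel data carrying a single interpolated attribute; a fuller sketch would carry colors, depth, and texture coordinates together.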
The rasterized pixel data are applied to a shader 110 that adds texture and optical features related to fog and illumination to the rasterized pixel data to produce shaded pixel data. The shader 110 includes a texture engine 112 that modifies the rasterized pixel data to have desired texture and optical features. The texture engine 112 can be implemented using a hardware pipeline that can process large amounts of data at very high speed. The shaded pixel data are input to a Raster Operations Processor 114 (Raster op in FIG. 1) that performs additional per-pixel processing (raster operations) on the shaded pixel data. The result from the Raster Operations Processor 114 is frame pixel data that are stored in a frame buffer memory 120 by a frame buffer interface 116. The frame pixel data can be used for various processes such as being displayed on a display 122. Frame pixel data can be made available as required by way of the frame buffer interface 116.
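As a rough illustration of a shader adding texture and fog to rasterized pixel data, the sketch below modulates an interpolated color by a texel and then blends the result toward a fog color based on depth. The function name `shade_pixel`, the multiplicative texture combine, and the linear fog model are assumptions chosen for the example; an actual shader 110 may combine these features differently.

```python
def shade_pixel(color, texel, depth, fog_color, fog_start, fog_end):
    """Return a shaded RGB tuple for one pixel.

    color and texel are RGB tuples with channels in [0, 1]; depth is the
    interpolated distance of the pixel from the viewer.
    """
    # Apply texture by modulating the interpolated color with the texel.
    textured = tuple(c * t for c, t in zip(color, texel))

    # Linear fog factor: 1.0 at fog_start (no fog), 0.0 at fog_end (all fog).
    f = (fog_end - depth) / (fog_end - fog_start)
    f = max(0.0, min(1.0, f))

    # Blend the textured color toward the fog color.
    return tuple(f * c + (1.0 - f) * fc for c, fc in zip(textured, fog_color))
```

With this model a pixel at `fog_start` keeps its textured color, a pixel at or beyond `fog_end` takes the fog color, and pixels in between are mixed proportionally.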
Hardwired pipeline shaders 110 are known. For example, hardwired pixel pipelines have been used to perform standard API functions, including such functions as scissor, alpha test, zbuffer, stencil, blendfunction, logicop, dither, and writemask. Also known are programmable shaders 110 that enable an application writer to control shader operations. For example, see U.S. patent application Ser. No. 10/391,930, filed on Mar. 19, 2003, and entitled “System Method and Computer Program Product for Branching During Programmable Vertex Processing,” by Lindholm et al.; U.S. patent application Ser. No. 09/586,249, filed on May 31, 2000, and entitled “System Method and Article of Manufacture for Programmable Vertex Processing Model with Instruction Set,” by Lindholm et al.; and U.S. patent application Ser. No. 09/273,975, filed on Mar. 22, 1999, and entitled “Programmable Pixel Shading Architecture,” by Kirk et al., all of which are hereby incorporated by reference in their entirety.
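Several of the standard functions named above (scissor, alpha test, zbuffer, blending, and write mask) can be pictured as a fixed sequence of per-pixel tests followed by a buffer update. The sketch below is a simplified software model offered for illustration; the function name `apply_raster_ops`, the order of the tests, the "less than" depth comparison, and the source-alpha blend are assumptions, and stencil, logicop, and dither are omitted.

```python
def apply_raster_ops(fragment, color_buffer, depth_buffer, scissor,
                     alpha_ref=0.0, write_mask=(True, True, True)):
    """Run one fragment through a fixed sequence of tests.

    fragment is (x, y, depth, (r, g, b, a)); scissor is (x0, y0, x1, y1)
    with exclusive upper bounds. Returns True if the buffers were updated.
    """
    x, y, depth, (r, g, b, a) = fragment

    # Scissor: discard fragments outside the rectangle.
    x0, y0, x1, y1 = scissor
    if not (x0 <= x < x1 and y0 <= y < y1):
        return False

    # Alpha test: discard fragments that are not opaque enough.
    if a <= alpha_ref:
        return False

    # Depth (z-buffer) test: keep only fragments nearer than the stored one.
    if depth >= depth_buffer[y][x]:
        return False

    # Blend the incoming color over the stored color using source alpha.
    dst = color_buffer[y][x]
    blended = [a * s + (1.0 - a) * d for s, d in zip((r, g, b), dst)]

    # Write mask: channels with a False mask keep their stored value.
    color_buffer[y][x] = tuple(
        n if m else d for n, m, d in zip(blended, write_mask, dst)
    )
    depth_buffer[y][x] = depth
    return True
```

In a hardwired pipeline each of these steps is a fixed stage, whereas a programmable shader lets the application writer replace or reorder such logic.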
Programmable shaders enable flexibility in the achievable visual effects and can reduce the time between a graphics function being made available and that function becoming standardized as part of a graphics API. Programmable shaders can have a standard API mode in which standard graphics API commands are implemented and a non-standard mode in which new graphics features can be programmed.
While shaders have proven useful, demands for shader performance have exceeded the capabilities of existing shaders. Although improving existing shaders could address some of those demands, such improvements would be difficult to implement. Furthermore, future demands can be anticipated to exceed the capabilities achievable by improved existing shaders. Therefore, a new method of improving shader performance would be beneficial.