Graphics processing is an important feature of modern high-performance computing systems. In graphics processing, mathematical procedures are implemented to render, or draw, graphics primitives, e.g., a triangle or a rectangle, on a display to produce desired visual images. Real-time graphics processing requires high-speed processing of graphics primitives to produce visually pleasing moving images.
The rendering of three-dimensional graphical images is of interest in a variety of electronic games and other applications. Rendering is the general term that describes the overall multi-step process of transitioning from a database representation of a three-dimensional object to a two-dimensional projection of the object onto a viewing surface, e.g., a computer display.
The rendering process involves a number of steps, for example, setting up a polygon model that contains the information subsequently required by the shading/texturing processes, applying linear transformations to the polygon mesh model, culling back-facing polygons, clipping the polygons against a view volume, scan converting/rasterizing the polygons to a pixel coordinate set, and shading/lighting the individual pixels using interpolated or incremental shading techniques.
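By way of illustration only, the step of culling back-facing polygons can be sketched as follows. The sketch assumes that front-facing triangles list their vertices in counter-clockwise order as seen from the viewer and that the eye position is known. The function names are hypothetical and are not taken from any particular graphics system.

```python
def sub(a, b):
    """Component-wise difference of two 3-D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    """Cross product of two 3-D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def is_back_facing(v0, v1, v2, eye):
    # Assumption: counter-clockwise winding marks the front face, so the
    # face normal is the cross product of two edges taken in that order.
    normal = cross(sub(v1, v0), sub(v2, v0))
    to_eye = sub(eye, v0)
    # The triangle faces away when its normal does not point toward the eye.
    return dot(normal, to_eye) <= 0

def cull_back_faces(triangles, eye):
    """Keep only the triangles whose front face is visible from eye."""
    return [t for t in triangles if not is_back_facing(t[0], t[1], t[2], eye)]
```

A triangle whose vertices are supplied in the opposite order is treated as back-facing and is dropped before clipping and rasterization.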
Graphics Processing Units (GPUs) are specialized integrated circuit devices that are commonly used in graphics systems to accelerate the performance of a 3-D rendering application. GPUs are commonly used in conjunction with a central processing unit (CPU) to generate three-dimensional images for one or more applications executing on a computer system. Modern GPUs typically utilize a graphics pipeline for processing data.
Prior Art FIG. 1 illustrates a simplified block diagram of a graphics system 100 that includes a graphics processing unit 102. As shown, the graphics processing unit 102 has a host interface/front end 104. The host interface/front end 104 receives raw graphics data from central processing hardware 103 that is executing an application program stored in memory 105. The host interface/front end 104 buffers input information and supplies that information to a geometry engine 106. The geometry engine 106 produces, scales, rotates, and projects three-dimensional vertices of graphics primitives in “model” coordinates into two-dimensional frame buffer coordinates. Typically, triangles are used as graphics primitives for three-dimensional objects, while rectangles are often used for two-dimensional objects (such as text displays).
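For illustration, the scale, rotate, and project operations attributed to the geometry engine 106 can be sketched in Python as shown below. This is a minimal sketch under stated assumptions, not a description of the hardware: it assumes a uniform scale, a rotation about the y axis only, a camera on the positive z axis looking toward the origin, and a frame buffer whose origin is the top-left corner. All names and parameters are hypothetical.

```python
import math

def transform_vertex(vertex, scale, angle_y, camera_distance, width, height):
    """Map a model-space vertex to two-dimensional frame buffer coordinates."""
    # Scale uniformly in model space.
    x, y, z = (component * scale for component in vertex)

    # Rotate about the y axis by angle_y radians.
    c, s = math.cos(angle_y), math.sin(angle_y)
    x, z = c * x + s * z, -s * x + c * z

    # Perspective projection. Assumption: the camera sits at
    # (0, 0, camera_distance) and looks toward the origin.
    depth = camera_distance - z
    if depth <= 0:
        raise ValueError("vertex is at or behind the camera")
    ndc_x = x / depth
    ndc_y = y / depth

    # Viewport mapping from [-1, 1] to pixels, with y growing downward.
    pixel_x = (ndc_x + 1.0) * 0.5 * width
    pixel_y = (1.0 - ndc_y) * 0.5 * height
    return pixel_x, pixel_y
```

Under these assumptions, a vertex at the model origin lands at the center of the frame buffer, and vertices farther from the camera are drawn closer to that center.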
The two-dimensional coordinates of the vertices of the graphics primitives are supplied to a rasterizer 108. The rasterizer 108 determines the positions of all of the pixels within the graphics primitives. This is typically performed along raster (horizontal) lines that extend between the lines that define the graphics primitives. The rasterizer 108 also generates interpolated colors, depths, and other values, such as texture coordinates, for each pixel. The output of the rasterizer 108 is referred to as rasterized pixel data.
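The following Python sketch illustrates one way to determine the covered pixels and to interpolate a per-vertex attribute such as color. It is offered as an example only. Instead of walking spans between the primitive edges, it tests each pixel center in the bounding box, row by row, using edge functions (barycentric weights), which is a substitute technique chosen for brevity. The function names are hypothetical.

```python
import math

def edge(a, b, p):
    """Signed area term: positive when p lies to the left of the edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize_triangle(p0, p1, p2, attr0, attr1, attr2):
    """Return (x, y, attribute) for every pixel whose center is covered.

    p0..p2 are 2-D frame buffer coordinates; attr0..attr2 are equal-length
    tuples (for example RGB colors) attached to the vertices.
    """
    area = edge(p0, p1, p2)
    if area == 0:
        return []  # degenerate triangle covers no pixels

    xs = (p0[0], p1[0], p2[0])
    ys = (p0[1], p1[1], p2[1])
    min_x, max_x = int(math.floor(min(xs))), int(math.ceil(max(xs)))
    min_y, max_y = int(math.floor(min(ys))), int(math.ceil(max(ys)))

    fragments = []
    for y in range(min_y, max_y + 1):          # one raster line at a time
        for x in range(min_x, max_x + 1):
            center = (x + 0.5, y + 0.5)
            # Dividing by the signed area makes the test winding-independent.
            w0 = edge(p1, p2, center) / area
            w1 = edge(p2, p0, center) / area
            w2 = edge(p0, p1, center) / area
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                value = tuple(w0 * a + w1 * b + w2 * c
                              for a, b, c in zip(attr0, attr1, attr2))
                fragments.append((x, y, value))
    return fragments
```

Each returned tuple corresponds to one item of rasterized pixel data: a pixel position together with its interpolated attribute.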
The rasterized pixel data is applied to a shader 110 that adds texture, color, and optical features related to fog and illumination to the rasterized pixel data to produce shaded pixel data. The shader 110 includes a texture engine 112 that modifies the rasterized pixel data to have desired texture and optical features. The texture engine 112 can be implemented using a hardware pipeline that can process large amounts of data at very high speed. The shaded pixel data is input to a Raster Operations Processor 114 (Raster op in FIG. 1) that performs color blending on the shaded pixel data. The result from the Raster Operations Processor 114 is frame pixel data that is stored in a frame buffer memory 120 by a frame buffer interface 116. The frame pixel data can be used for various processes, such as being displayed on a display 122. Frame pixel data can be made available as required by way of the frame buffer interface 116.
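As an example of the color blending performed at the raster operations stage, the Python sketch below applies a conventional "source over" alpha blend before writing to a frame buffer. The choice of blend equation, the RGBA layout, and the function names are assumptions made for illustration; the text above does not specify which blend the Raster Operations Processor 114 uses.

```python
def blend_over(src_rgba, dst_rgb):
    """Blend a shaded pixel (r, g, b, alpha) over an existing frame buffer pixel."""
    r, g, b, alpha = src_rgba
    # Assumed blend equation: result = alpha * source + (1 - alpha) * destination.
    return tuple(alpha * s + (1.0 - alpha) * d
                 for s, d in zip((r, g, b), dst_rgb))

def write_fragments(frame_buffer, fragments):
    """Blend each (x, y, rgba) fragment into frame_buffer[y][x] in place."""
    for x, y, rgba in fragments:
        frame_buffer[y][x] = blend_over(rgba, frame_buffer[y][x])

# Usage: a 2x2 blue frame buffer receives one half-transparent red pixel.
frame_buffer = [[(0.0, 0.0, 1.0) for _ in range(2)] for _ in range(2)]
write_fragments(frame_buffer, [(1, 0, (1.0, 0.0, 0.0, 0.5))])
```

In this sketch the nested list stands in for the frame buffer memory 120, and the in-place write stands in for the store performed through the frame buffer interface 116.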
The stages of the traditional GPU pipeline architecture illustrated in FIG. 1 are typically optimized for high-speed rendering operations (e.g., texturing, lighting, shading, etc.) using a widely implemented graphics programming API (application programming interface), such as the OpenGL™ graphics language, Direct3D™, and the like. The architecture of the graphics processing unit 102 is configured as a multi-stage deep pipeline architecture in order to maximize the overall rendering throughput of the pipeline. Generally, deep pipeline architectures have sufficient data throughput (e.g., pixel fill rate, etc.) to implement fast, high-quality rendering of even complex scenes.
A particular issue in the graphics processing unit 102 of FIG. 1 is that the scheduling of high-level instructions (operational codes) in the shader 110 is inflexibly hard coded into the chip platform that contains the graphics processing unit 102. The scheduling process translates a stream of high-level instructions (e.g., operational codes) into a very long instruction word (VLIW) that is executed in the shader 110. However, because the scheduling process is hard coded, errors within it are difficult to repair. Moreover, modifications to the scheduling process generally cannot be made. As a result, performance within the shader 110 suffers because of the limitations introduced by errors in the scheduling process. Therefore, what is desired is a scheduling process that is suitable for repair and modification.
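To make the translation step concrete, the Python sketch below packs a stream of operational codes into VLIW words using a simple in-order greedy rule. It is purely illustrative and does not represent the scheduling process of the shader 110: the opcode names, the slot names in SLOT_FOR_OPCODE, and the hazard rule are all hypothetical, and the sketch ignores latencies and other constraints that a real scheduler must honor.

```python
# Hypothetical mapping from opcode to the execution slot it occupies
# within one VLIW word. Not taken from any real shader architecture.
SLOT_FOR_OPCODE = {"TEX": "tex", "MUL": "alu0", "ADD": "alu1", "MOV": "alu1"}

def schedule(instructions):
    """Pack (opcode, dest, sources) tuples into a list of VLIW words.

    Each word is a dict mapping slot name to the instruction placed there.
    A new word is started when the needed slot is already occupied or when
    the instruction reads or overwrites a register written earlier in the
    current word.
    """
    words = []
    current = {}       # slots filled in the word under construction
    written = set()    # registers written by the word under construction
    for opcode, dest, sources in instructions:
        slot = SLOT_FOR_OPCODE[opcode]
        conflict = (slot in current
                    or dest in written
                    or any(src in written for src in sources))
        if conflict:
            words.append(current)
            current, written = {}, set()
        current[slot] = (opcode, dest, sources)
        written.add(dest)
    if current:
        words.append(current)
    return words

# Usage: four instructions become three VLIW words.
stream = [
    ("TEX", "r0", ("t0",)),
    ("MUL", "r1", ("r2", "r3")),
    ("ADD", "r4", ("r0", "r1")),
    ("MOV", "r5", ("r4",)),
]
vliw_words = schedule(stream)
```

In this sketch the slot table and the packing rule are ordinary data and code that can be edited. When the equivalent logic is fixed in the chip platform, as described above, an error in either one cannot readily be corrected.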