The present invention pertains to the field of computer systems. More particularly, this invention pertains to the field of processing 2D graphics operations in graphics systems that utilize a tiled architecture.
Computer graphics systems are commonly used for displaying graphical representations of objects on a two-dimensional video display screen. Current computer graphics systems provide highly detailed representations and are used in a variety of applications.
In typical computer graphics systems, a three-dimensional (3D) object to be represented on the display screen is broken down into graphics primitives. Typically, the primitives of a 3D object to be rendered are defined by a host computer in terms of primitive data. For example, when the primitive is a triangle, the host computer may define the primitive in terms of the X, Y and Z coordinates of its vertices, as well as the red, green and blue (R, G and B) color values of each vertex. Additional primitive data may be used in specific applications. Rendering hardware interpolates the primitive data to compute the display screen pixels that represent each primitive, and the R, G and B color values for each pixel.
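The interpolation step can be illustrated with a minimal sketch. It assumes barycentric weighting of per-vertex colors, which is one common approach and is not necessarily the method used by any particular rendering hardware. The function names `edge` and `interpolate_color` and the sample coordinates are hypothetical.

```python
def edge(a, b, p):
    """Signed area term: twice the signed area of triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def interpolate_color(p, verts, colors):
    """Interpolate per-vertex (R, G, B) values at pixel position p.

    verts  -- three (x, y) screen-space vertex positions
    colors -- three (r, g, b) tuples, one per vertex
    Returns the interpolated (r, g, b), or None if p lies outside the triangle.
    """
    v0, v1, v2 = verts
    area = edge(v0, v1, v2)
    if area == 0:
        raise ValueError("degenerate triangle")
    # Barycentric weights: each is the sub-triangle opposite a vertex,
    # divided by the full triangle, so the three weights sum to 1.
    w0 = edge(v1, v2, p) / area
    w1 = edge(v2, v0, p) / area
    w2 = edge(v0, v1, p) / area
    if w0 < 0 or w1 < 0 or w2 < 0:
        return None  # pixel is not covered by this primitive
    return tuple(
        w0 * c0 + w1 * c1 + w2 * c2
        for c0, c1, c2 in zip(colors[0], colors[1], colors[2])
    )


# Hypothetical triangle with pure red, green and blue vertices.
triangle = [(0, 0), (4, 0), (0, 4)]
vertex_colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
print(interpolate_color((1, 1), triangle, vertex_colors))
```

A pixel at a vertex receives that vertex's color unchanged, and a pixel outside the triangle is reported as not covered.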
Typical computer graphics systems further include a graphics cache memory. In order to make more efficient use of the graphics cache memory, 3D primitives are sorted into bins. This well-known technique is often referred to as “tiling”.
FIG. 1 and FIG. 2 show an example of sorting 3D primitives into bins, or “tiling”. For this example, a graphics controller receives data for primitives 110, 120, and 130. The primitives 110, 120, and 130 are to be rendered and then displayed on a display screen 100. When rendering a 3D primitive, the graphics controller reads an appropriate portion of display data from the graphics memory into the graphics cache memory. The graphics controller then renders the primitive and combines the rendered primitive with the display data stored in the graphics cache memory. The graphics memory may be located within main system memory.
In a non-tiled graphics architecture, if the graphics controller were to render primitive 110, then primitive 120, and then primitive 130, a new portion of display data would need to be retrieved from the graphics memory every time the graphics controller moved from one primitive to the next, resulting in many graphics cache misses and a greater utilization of graphics memory bandwidth.
In order to improve graphics memory bandwidth utilization, a tiling function is performed on the primitives 110, 120, and 130. The primitives 110, 120, and 130 of this example are sorted into bins 210, 220, 230, and 240, as shown in FIG. 2. The sorting technique generally involves a microprocessor analyzing which bins the various primitives intersect and then writing copies of the primitive data to the storage areas within main memory for the bins which the primitives intersect. The graphics controller then reads the primitive data out of the bin storage area and then divides the primitives to create the smaller primitives that fit into the various tiles. For example, primitive 110 is divided to create primitive 211 located in bin 210 and primitive 221 located within bin 220. Primitive 120 is divided to create primitive 222 located in bin 220 and primitive 242 located in bin 240. Primitive 130 is divided to create primitive 212 located in bin 210, primitive 231 located in bin 230, and primitive 241 located within bin 240.
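The sorting step described above can be sketched as follows. This is a minimal illustration, not the method of any particular system. It assumes a rectangular grid of equally sized bins and uses a bounding-box overlap test, which is conservative: it may list a bin that the primitive's bounding box touches but the primitive itself does not. The function names, the screen and tile sizes, and the triangle coordinates are all hypothetical and do not correspond to the geometry of FIG. 1 or FIG. 2.

```python
from collections import defaultdict


def bins_touched(vertices, tile_w, tile_h, cols, rows):
    """Return (row, col) bins overlapped by the primitive's bounding box."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    # Clamp the bounding box to the bin grid.
    c0 = max(0, int(min(xs) // tile_w))
    c1 = min(cols - 1, int(max(xs) // tile_w))
    r0 = max(0, int(min(ys) // tile_h))
    r1 = min(rows - 1, int(max(ys) // tile_h))
    return [(r, c) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]


def sort_into_bins(primitives, tile_w, tile_h, cols, rows):
    """Write a copy of each primitive into every bin it may intersect.

    primitives -- dict mapping a primitive name to its (x, y) vertices
    Returns a dict mapping (row, col) to the primitive names stored there.
    """
    bins = defaultdict(list)
    for name, vertices in primitives.items():
        for b in bins_touched(vertices, tile_w, tile_h, cols, rows):
            bins[b].append(name)
    return dict(bins)


# Hypothetical 200x200 screen split into a 2x2 grid of 100x100 bins.
scene = {
    "tri_a": [(10, 10), (150, 20), (60, 80)],     # spans the top two bins
    "tri_b": [(120, 30), (180, 60), (150, 170)],  # spans the right two bins
}
for bin_id, names in sorted(sort_into_bins(scene, 100, 100, 2, 2).items()):
    print(bin_id, names)
```

Once every primitive has been copied to its bins, each bin's list can be processed on its own, which is what allows the display data for one screen region to stay resident in the graphics cache while that bin is rendered.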
Once the primitives are divided into the smaller primitives for a given bin, the bin can be rendered. Typically, the graphics controller processes the bins one at a time. Because the appropriate display data for each of the primitives located within a particular bin is stored in the same area of the graphics memory, fewer cache misses result when rendering the primitives, which improves graphics memory bandwidth utilization.
However, it is also common in a typical graphics system for two-dimensional (2D) operations to be mixed in with 3D operations. For example, a microprocessor may receive primitive data for several 3D objects, then receive a command to perform a 2D blit operation, then receive more 3D primitive data.
FIG. 3 is a flow diagram describing how typical prior graphics systems handle 2D operations in a tiled architecture. At step 310, a processor receives 3D primitive data and sorts the primitives into bins. If a 2D blit operation is received at step 320, all of the bins that contain primitive data are flushed (sent to a graphics controller to be rendered). Then, at step 340, the 2D blit operation is performed. Following the 2D blit operation, the processor can then begin to sort additional 3D primitives into bins.
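The flow described above can be modeled with a short sketch. It is a simplified illustration of the flush-on-blit behavior only. The command format, the function name `process_commands`, and the log entries are hypothetical, and the sketch assumes each 3D command already lists the bins its primitive intersects.

```python
def process_commands(commands):
    """Model the prior approach: flush every bin before each 2D blit.

    commands -- list of dicts; a 3D command looks like
                {"kind": "3d", "name": ..., "bins": [...]} and a 2D blit
                looks like {"kind": "blit", "name": ...}
    Returns (log, pending_bins), where log records the order in which
    bins are rendered and blits are performed.
    """
    bins = {}  # bin id -> names of primitives waiting to be rendered
    log = []
    for cmd in commands:
        if cmd["kind"] == "3d":
            # Sort the primitive into every bin it intersects.
            for b in cmd["bins"]:
                bins.setdefault(b, []).append(cmd["name"])
        elif cmd["kind"] == "blit":
            # A 2D operation arrived: flush all non-empty bins first.
            for b in sorted(bins):
                log.append(("render_bin", b, tuple(bins[b])))
            bins.clear()
            log.append(("blit", cmd["name"]))
        else:
            raise ValueError("unknown command kind: %r" % cmd["kind"])
    return log, bins


stream = [
    {"kind": "3d", "name": "tri0", "bins": [0, 1]},
    {"kind": "3d", "name": "tri1", "bins": [1]},
    {"kind": "blit", "name": "copy"},
    {"kind": "3d", "name": "tri2", "bins": [0]},
]
log, pending = process_commands(stream)
for entry in log:
    print(entry)
print("still binned:", pending)
```

In this model, bin 0 is rendered once before the blit and must be rendered again later for `tri2`, so the display data for that region is fetched twice. A stream with frequent blits forces many such partial flushes.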
The flushing and rendering of the bins whenever a 2D operation is received may destroy, in large part, the benefits of tiling the 3D primitives due to an increase in graphics cache misses. The result is a greater utilization of graphics memory bandwidth. This resulting increase in graphics memory bandwidth utilization may be especially problematic in computer systems where a portion of system main memory is used as a graphics memory and many system agents desire access to the system main memory. An increase in main memory bandwidth utilization by the graphics controller may have a negative impact on overall system performance.