In a typical computer graphics system, an object to be represented on a display screen is broken down into graphics primitives. Graphics primitives are basic geometric elements such as points, lines, vectors, triangles and quadrilaterals. Computer graphics systems use graphics primitives in combination to represent more complex shapes. A typical system for generating and displaying graphics primitives might include a host processor, application and system/driver software running on the host processor, and a specialized subsystem of graphics processing hardware that is controlled by the software running on the host processor.
Many mathematical operations are necessary to process and display graphics primitives. In lower-end computer systems, most of those operations are performed by the host processor. In such lower-end systems, only a simple set of operations need be performed by the graphics subsystem in order to display the graphics information produced by the host processor. In higher-end computer systems, however, better performance is achieved by providing a graphics subsystem that has the capacity to perform many of the mathematical operations that, in lower-end systems, must be performed by the host processor. In such higher-end systems, the host processor may generate graphics information at a fairly abstract level. The host processor then relies on "graphics accelerator" hardware in the graphics subsystem to reduce the abstract information to simpler forms more suitable for downstream operations such as rasterization and storage in a frame buffer memory. In this manner, tasks are offloaded from the host processor, thereby saving host processor bandwidth for higher-level operations.
Various techniques have been employed to improve the performance of graphics accelerators. One such technique has been to include more than one graphics processor in the graphics accelerator architecture. Because graphics primitives vary, however, as to the number and type of computations necessary to process them, it is a challenge in multi-processor architectures to utilize processing power as effectively as possible for different kinds of primitives. For example, primitives may be generated by the host processor for display in a non-positional lighting mode, so that the graphics accelerator need only do cursory lighting operations along with the usual clipping, plane equation and transformation operations necessary for each primitive. In such a case, an effective allocation of accelerator processing power might be to have complete parallelism or some degree of pipelining, in which two graphics processors may work on two different primitives simultaneously. On the other hand, primitives may be generated by the host processor for display in a positional lighting mode, so that the graphics accelerator must perform numerous additional and more complex lighting calculations along with the usual clipping, plane equation and transformation operations necessary for each primitive. In the latter case, the same parallelism or pipelining scheme used for non-positionally lighted primitives may no longer utilize both graphics processors effectively, particularly if not all of the graphics processors in the accelerator have the same capabilities. Thus, using such a scheme for positionally lighted primitives would result in degraded accelerator performance.
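The load imbalance described above can be illustrated with a toy cost model. The following is only a sketch under stated assumptions: the cycle counts, the function names `primitive_cost`, `dealt_finish_times` and `idle_fraction`, and the per-processor lighting slowdown factors are all hypothetical and do not come from any particular accelerator. It assumes two processors that each receive whole primitives in alternation, with one processor taking longer on positional lighting calculations.

```python
# Hypothetical per-primitive costs in arbitrary cycle units (assumed for illustration).
BASE_COST = 100            # clipping, plane equation and transformation operations
CURSORY_LIGHTING = 10      # lighting cost in a non-positional lighting mode
POSITIONAL_LIGHTING = 200  # lighting cost in a positional lighting mode

def primitive_cost(positional, lighting_slowdown=1.0):
    """Cost of one primitive on a processor; the slowdown applies only to positional lighting."""
    if positional:
        return BASE_COST + POSITIONAL_LIGHTING * lighting_slowdown
    return BASE_COST + CURSORY_LIGHTING

def dealt_finish_times(num_primitives, positional, slowdowns):
    """Busy time of each processor when whole primitives are dealt out in alternation."""
    count = len(slowdowns)
    times = []
    for index, slowdown in enumerate(slowdowns):
        # Each processor gets an equal share; early processors absorb any remainder.
        share = num_primitives // count + (1 if index < num_primitives % count else 0)
        times.append(share * primitive_cost(positional, slowdown))
    return times

def idle_fraction(times):
    """Fraction of the total run during which each processor waits for the slowest one."""
    makespan = max(times)
    return [1 - t / makespan for t in times]

if __name__ == "__main__":
    slowdowns = (1.0, 4.0)  # assumed: second processor is 4x slower at positional lighting
    for positional in (False, True):
        times = dealt_finish_times(100, positional, slowdowns)
        print("positional" if positional else "non-positional", times, idle_fraction(times))
```

Under these assumed numbers, both processors stay fully busy in the non-positional case, while in the positional case the more capable processor waits for roughly two thirds of the run. Different assumed costs would change the magnitude but not the direction of the effect.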
It is therefore an object of the present invention to provide a graphics accelerator method and architecture that utilize the bandwidth of multiple graphics processors effectively not only for a single kind of graphics primitive and lighting mode, but also for a variety of different kinds of graphics primitives and lighting modes.