1. Field of the Invention
The present invention relates generally to computer graphics systems and, more particularly, to a computer graphics system utilizing parallel processing of vertex data to achieve enhanced performance.
2. Related Art
Computer graphics systems are commonly used for displaying graphical representations of objects on a two-dimensional video display screen. Current computer graphics systems provide highly detailed representations and are used in a variety of applications.
In typical computer graphics systems, an object to be represented on the display screen is broken down into graphics primitives. Primitives are basic components of a graphics display and may include points, lines, vectors, and polygons such as triangles and quadrilaterals. Typically, a hardware/software scheme is implemented to render, or draw, the graphics primitives that represent a view of one or more objects being represented on the display screen. Generally, the primitives of the three-dimensional object to be rendered are defined by a host computer in terms of primitive data. For example, when the primitive is a triangle, the host computer may define the primitive in terms of the X, Y and Z coordinates of its vertices, as well as the red, green and blue (R, G and B) color values of each vertex. Additional primitive data may be used in specific applications. Rendering hardware interpolates the primitive data to compute the display screen pixels that represent each primitive, and the R, G and B color values for each pixel.
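By way of illustration only, the interpolation of per-vertex primitive data described above can be sketched as follows. The sketch assumes barycentric (edge-function) interpolation over a triangle in screen space; the names `Vertex`, `edge` and `interpolate_color` are hypothetical and are not taken from any particular rendering hardware.

```python
from dataclasses import dataclass


@dataclass
class Vertex:
    # Primitive data for one vertex: position plus R, G and B color values.
    x: float
    y: float
    z: float
    r: float
    g: float
    b: float


def edge(ax, ay, bx, by, px, py):
    """Twice the signed area of the triangle (a, b, p)."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def interpolate_color(v0, v1, v2, px, py):
    """Return the interpolated (r, g, b) at pixel (px, py), or None if the
    pixel lies outside the triangle or the triangle is degenerate."""
    area = edge(v0.x, v0.y, v1.x, v1.y, v2.x, v2.y)
    if area == 0:
        return None
    # Barycentric weights; dividing by the signed area handles either winding.
    w0 = edge(v1.x, v1.y, v2.x, v2.y, px, py) / area
    w1 = edge(v2.x, v2.y, v0.x, v0.y, px, py) / area
    w2 = edge(v0.x, v0.y, v1.x, v1.y, px, py) / area
    if w0 < 0 or w1 < 0 or w2 < 0:
        return None
    return (
        w0 * v0.r + w1 * v1.r + w2 * v2.r,
        w0 * v0.g + w1 * v1.g + w2 * v2.g,
        w0 * v0.b + w1 * v1.b + w2 * v2.b,
    )


# A triangle with one pure red, one pure green and one pure blue vertex.
red = Vertex(0, 0, 0, 1, 0, 0)
green = Vertex(4, 0, 0, 0, 1, 0)
blue = Vertex(0, 4, 0, 0, 0, 1)
print(interpolate_color(red, green, blue, 1, 1))  # (0.5, 0.25, 0.25)
```

In this sketch, a pixel nearer the red vertex receives a proportionally larger red contribution, and pixels outside the triangle are rejected.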
The basic components of a computer graphics system typically include a geometry accelerator, a rasterizer and a frame buffer. The system may also include other hardware such as texture mapping hardware. The geometry accelerator (GA) receives vertex data from the host computer that defines the primitives that make up the view to be displayed. The geometry accelerator performs transformations on the vertex data, decomposes quadrilaterals into triangles and performs lighting, clipping and plane equation calculations for each primitive. The output of the geometry accelerator, referred to as rendering data, is used by the rasterizer and the texture mapping hardware to generate final screen coordinate and color data for each pixel in each primitive. The pixel data from the rasterizer and the pixel data from the texture mapping hardware, if available, are combined and stored in the frame buffer for display on the video display screen.
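As one example of the geometry accelerator operations listed above, the decomposition of a quadrilateral into triangles may be sketched as follows. The sketch assumes a convex quadrilateral whose vertices are supplied in order around its perimeter; the function name `split_quad` is hypothetical, and an actual geometry accelerator may choose the diagonal differently.

```python
def split_quad(quad):
    """Split a quadrilateral (v0, v1, v2, v3) into two triangles that share
    the v0-v2 diagonal. Assumes a convex quad with vertices in perimeter order."""
    if len(quad) != 4:
        raise ValueError("a quadrilateral has exactly four vertices")
    v0, v1, v2, v3 = quad
    return [(v0, v1, v2), (v0, v2, v3)]


unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
for triangle in split_quad(unit_square):
    print(triangle)
```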
The operations of the geometry accelerator are highly computation-intensive. One frame of a three-dimensional (3-D) graphics display may include on the order of hundreds of thousands of primitives. To achieve state-of-the-art performance, the geometry accelerator may be required to perform several hundred million floating point calculations per second. Furthermore, the volume of data transferred between the host computer and the graphics hardware is very large. The data for a single quadrilateral may be on the order of 64 words of 32 bits each. Additional data transmitted from the host computer to the geometry accelerator includes lighting parameters, clipping parameters and any other parameters needed to generate the graphics display.
Various techniques have been employed to improve the performance of geometry accelerators, including pipelining and parallel processing. However, conventional graphics systems distribute the vertex data to the geometry accelerators in a manner that results in a non-uniform loading of the geometry accelerators. This variability in geometry accelerator utilization results in periods of time when one or more geometry accelerators are not processing vertex data even though they are capable of doing so. Since the throughput of the graphics system is dependent upon the efficiency of the geometry accelerators, this inefficient use of the geometry accelerators' processing capabilities decreases the throughput of the graphics system.
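The data volume implied by these figures can be estimated with simple arithmetic. In the sketch below, the 64-word, 32-bit quadrilateral size comes from the text above, while the count of 200,000 quadrilaterals per frame and the rate of 30 frames per second are assumed values chosen only to illustrate "hundreds of thousands of primitives"; the function names are hypothetical.

```python
WORDS_PER_QUAD = 64   # from the text: about 64 words per quadrilateral
BITS_PER_WORD = 32    # from the text: 32 bits per word

BYTES_PER_QUAD = WORDS_PER_QUAD * BITS_PER_WORD // 8  # 256 bytes


def frame_bytes(num_quads):
    """Vertex data, in bytes, for a frame containing num_quads quadrilaterals."""
    return num_quads * BYTES_PER_QUAD


def bytes_per_second(num_quads, frames_per_second):
    """Host-to-graphics-hardware transfer rate needed for the given frame rate."""
    return frame_bytes(num_quads) * frames_per_second


# Assumed workload: 200,000 quadrilaterals per frame at 30 frames per second.
print(BYTES_PER_QUAD)                   # 256
print(frame_bytes(200_000))             # 51200000
print(bytes_per_second(200_000, 30))    # 1536000000
```

Under these assumed values, the vertex data alone amounts to roughly 51 megabytes per frame, before any lighting or clipping parameters are counted.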
One conventional approach for distributing "chunks" of data to a parallel arrangement of geometry accelerators is described in U.S. patent application Ser. No. 08/634,458 entitled "Computer Graphics System Utilizing Parallel Processing For Enhanced Performance" to Shah et al., filed on Apr. 18, 1996, and owned by the assignee of the present application. With this technique, referred to as round-robin, hardware upstream of the geometry accelerators sends chunks of data to each of the geometry accelerators in a predetermined sequential order. This distribution sequence is repeated indefinitely until all of the vertex data has been processed. The hardware downstream of the geometry accelerators receives and combines the separate chunks of rendering data using the same predetermined sequence. That is, the geometry accelerators produce chunks of rendering data that are later combined to form a stream of rendering data representing a sequence of graphics primitives that is in the same order as the stream of graphics primitives represented by the stream of vertex data.
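The round-robin distribution and in-order recombination described above may be sketched as follows. This is a minimal software model only, not the hardware of the referenced application; the function names and the stand-in `process` callback are hypothetical.

```python
from collections import deque


def distribute_round_robin(chunks, num_gas):
    """Send chunk k to geometry accelerator k % num_gas, in a fixed sequence."""
    queues = [deque() for _ in range(num_gas)]
    for k, chunk in enumerate(chunks):
        queues[k % num_gas].append(chunk)
    return queues


def merge_round_robin(output_queues):
    """Collect results from the accelerators in the same fixed sequence, so the
    merged stream matches the order of the original vertex data stream."""
    total = sum(len(q) for q in output_queues)
    merged = []
    for k in range(total):
        merged.append(output_queues[k % len(output_queues)].popleft())
    return merged


def render_pipeline(chunks, num_gas, process):
    """Distribute chunks, let each accelerator process its own queue, then merge."""
    input_queues = distribute_round_robin(chunks, num_gas)
    output_queues = [deque(process(c) for c in q) for q in input_queues]
    return merge_round_robin(output_queues)


# Seven chunks of vertex data spread over three geometry accelerators.
vertex_chunks = list(range(7))
print(render_pipeline(vertex_chunks, 3, lambda c: f"rendered-{c}"))
```

Because both the upstream and downstream sides follow the same sequence, no per-chunk tag is needed to restore the original primitive order.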
A drawback to this approach is that it does not take into account the fact that the geometry accelerators do not process all vertex data at the same rate. That is, certain types of vertex data take longer to process than others. If a geometry accelerator receives a chunk of vertex data that takes a substantial amount of time to process, the other geometry accelerators may become idle for a considerable period of time while the graphics system waits for that particular geometry accelerator to complete processing. Conversely, if a geometry accelerator receives a chunk of vertex data that takes a very short period of time to process, that geometry accelerator may complete its processing and become idle while it waits for the other geometry accelerators to complete the processing of their respective chunks of data. This variable processing time, coupled with the requirement that the chunks of rendering data be combined in the same sequence as the corresponding chunks of vertex data, results in periods of time during which the geometry accelerators are not operating on vertex data. This inefficient utilization of the processing capabilities of the geometry accelerators adversely affects the throughput of the graphics system.
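The idle time described above can be illustrated with a simplified timing model. The model assumes, for illustration only, that chunk k is assigned to accelerator k modulo the number of accelerators, that results are collected strictly in chunk order, and that an accelerator cannot begin its next chunk until its previous result has been collected (that is, no output buffering). The function name and the cost values are hypothetical.

```python
def simulate_round_robin(costs, num_gas):
    """Return (makespan, utilization) for round-robin distribution with
    in-order collection. costs[k] is the processing time of chunk k."""
    collect = []  # collect[k]: time at which chunk k's rendering data is merged
    for k, cost in enumerate(costs):
        # The accelerator is free once its previous chunk (k - num_gas) is collected.
        start = collect[k - num_gas] if k >= num_gas else 0.0
        finish = start + cost
        # In-order merge: chunk k cannot be collected before chunk k - 1.
        previous = collect[k - 1] if k > 0 else 0.0
        collect.append(max(finish, previous))
    makespan = collect[-1] if collect else 0.0
    busy = sum(costs)
    utilization = busy / (num_gas * makespan) if makespan > 0 else 0.0
    return makespan, utilization


# Uniform chunks keep both accelerators fully busy.
print(simulate_round_robin([1, 1, 1, 1], 2))  # (2.0, 1.0)

# One slow chunk stalls the other accelerator; utilization drops to about 0.61.
print(simulate_round_robin([8, 1, 1, 1], 2))
```

In the second case, the accelerator holding the short chunk finishes at time 1 but must wait until time 8 for the slow chunk to be collected before it can proceed.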
This problem is exacerbated as the number of geometry accelerators in the graphics system increases. Under the sequential, round-robin approach described above, the period of each distribution cycle, over which every geometry accelerator receives one chunk of vertex data, lengthens with each additional geometry accelerator. Thus, as the graphics system architecture is expanded to achieve greater processing capabilities, the efficiency of the resulting system is reduced.
It is therefore a principal objective of the present invention to provide a computer graphics system capable of efficiently distributing data to a plurality of geometry accelerators. Such a graphics system may then achieve improved throughput performance by reducing the inefficiencies associated with the parallel processing of vertex data.