1. Field of the Invention
The present invention relates to computer graphics systems and, more particularly, to a computer graphics system utilizing a graphics accelerator having an enhanced logic and register structure to achieve enhanced performance.
2. Discussion of the Related Art
Computer graphics systems are commonly used for displaying graphical representations of objects on a two-dimensional video display screen. Current computer graphics display systems provide highly detailed representations and are used in a variety of applications. A computer graphics display system generally comprises a central processing unit (CPU), system memory, a graphics machine and a video display screen.
In typical computer graphics display systems, an object to be presented on the display screen is broken down into graphics primitives. Primitives are basic components of a graphics display and may include points, lines, vectors and polygons (e.g., triangles and quadrilaterals). Typically, a hardware/software scheme is implemented to render, or draw, the graphics primitives that represent a view of one or more objects being represented on the display screen.
Generally, the primitives of the three-dimensional object to be rendered are defined by the host CPU in terms of primitive data. For example, when the primitive is a triangle, the host computer may define the primitive in terms of the X, Y and Z coordinates of its vertices, as well as in terms of the red, green, blue and alpha (R, G, B and α) color values of each vertex. Alpha is a transparency value. Additional primitive data may be used in specific applications. Rendering hardware interpolates the primitive data to compute the display screen pixels that represent each primitive, and the R, G, B and α values for each pixel.
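By way of illustration only, the interpolation of per-vertex primitive data to a pixel may be sketched as follows. The sketch assumes barycentric weighting over a triangle in screen space, which is one common interpolation scheme and is not stated to be the scheme used by any particular rendering hardware. The names Vertex, barycentric_weights and interpolate_pixel are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Vertex:
    """Primitive data for one vertex: position plus R, G, B and alpha."""
    x: float
    y: float
    z: float
    r: float
    g: float
    b: float
    a: float


def barycentric_weights(v0, v1, v2, px, py):
    """Return the weights (w0, w1, w2) of pixel (px, py) in the triangle."""
    # Twice the signed area of the triangle; zero means it is degenerate.
    area = (v1.x - v0.x) * (v2.y - v0.y) - (v2.x - v0.x) * (v1.y - v0.y)
    if area == 0:
        raise ValueError("degenerate triangle")
    w0 = ((v1.x - px) * (v2.y - py) - (v2.x - px) * (v1.y - py)) / area
    w1 = ((v2.x - px) * (v0.y - py) - (v0.x - px) * (v2.y - py)) / area
    return w0, w1, 1.0 - w0 - w1


def interpolate_pixel(v0, v1, v2, px, py):
    """Interpolate Z, R, G, B and alpha at pixel (px, py)."""
    w0, w1, w2 = barycentric_weights(v0, v1, v2, px, py)
    return tuple(
        w0 * getattr(v0, name) + w1 * getattr(v1, name) + w2 * getattr(v2, name)
        for name in ("z", "r", "g", "b", "a")
    )
```

In this sketch, a pixel lying on a vertex takes that vertex's values exactly, and a pixel at the midpoint of an edge takes the average of the two vertices on that edge.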
The graphics machine generally includes a geometry accelerator, a rasterizer, a frame buffer controller and a frame buffer. The graphics machine may also include texture mapping hardware. The geometry accelerator receives vertex data from the host CPU that defines the primitives that make up the view to be displayed. The geometry accelerator typically comprises a transform component which receives vertex data from the CPU, a clipping component, an illumination component, and a plane equations component. The transform component performs transformations on the vertex data received from the CPU, such as rotation and translation of the image space defined by vertex data. The clipping component clips the vertex data so that only vertex data relating to primitives that make up the portion of the view that will be seen by the user is kept for further processing. The illumination or lighting component calculates the final colors of the vertices of the primitives based on the vertex data and based on lighting conditions. The plane equations component generates floating point equations which define the image space within the vertices. The floating point equations are later converted into fixed point equations and the rasterizer and texture mapping hardware generate the final screen coordinate and color data for each pixel in each primitive.
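As a hedged illustration of the plane equations component described above, the following sketch fits a plane equation of the form value = a*x + b*y + c through the three vertices of a triangle for a single attribute (for example, one color channel), and then converts the floating point coefficients to fixed point. The 16 fractional bits and the function names plane_equation and to_fixed are assumptions chosen for the example and do not describe any particular hardware.

```python
def plane_equation(p0, p1, p2):
    """Fit value = a*x + b*y + c through three (x, y, value) points."""
    (x0, y0, v0), (x1, y1, v1), (x2, y2, v2) = p0, p1, p2
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    if det == 0:
        raise ValueError("degenerate triangle")
    a = ((v1 - v0) * (y2 - y0) - (v2 - v0) * (y1 - y0)) / det
    b = ((x1 - x0) * (v2 - v0) - (x2 - x0) * (v1 - v0)) / det
    c = v0 - a * x0 - b * y0
    return a, b, c


def to_fixed(value, frac_bits=16):
    """Convert a floating point value to a fixed point integer.

    frac_bits=16 is an assumed format for illustration only.
    """
    return int(round(value * (1 << frac_bits)))


# Example: an attribute that is 1 at (0, 0), 5 at (4, 0) and 9 at (0, 4).
a, b, c = plane_equation((0.0, 0.0, 1.0), (4.0, 0.0, 5.0), (0.0, 4.0, 9.0))
fixed_coefficients = [to_fixed(k) for k in (a, b, c)]
```

Once the coefficients are known, the attribute at any pixel inside the triangle follows from one multiply-add per axis, which is the kind of work a rasterizer can perform in fixed point.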
The operations of the geometry accelerator are computationally very intense. One frame of a three-dimensional (3-D) graphics display may include on the order of hundreds of thousands of primitives. To achieve state-of-the-art performance, the geometry accelerator may be required to perform several hundred million floating point calculations per second. Furthermore, the volume of data transferred between the host computer and the graphics hardware is very large. The data for a single quadrilateral may be on the order of, for example, 64 words of 32 bits each. Additional data transmitted from the host computer to the geometry accelerator includes illumination parameters, clipping parameters and any other parameters needed to generate the graphics display.
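The data volume implied by these figures can be worked through in a short calculation. The 64 words of 32 bits per quadrilateral come from the example above; the primitive count of 100,000 per frame and the frame rate of 30 frames per second are assumed values chosen only to make the arithmetic concrete.

```python
WORDS_PER_QUAD = 64           # from the example above
BITS_PER_WORD = 32            # from the example above
ASSUMED_QUADS_PER_FRAME = 100_000    # assumption for illustration
ASSUMED_FRAMES_PER_SECOND = 30       # assumption for illustration


def bytes_per_quad():
    """Size of the primitive data for one quadrilateral, in bytes."""
    return WORDS_PER_QUAD * BITS_PER_WORD // 8


def bytes_per_frame(quads=ASSUMED_QUADS_PER_FRAME):
    """Primitive data transferred for one frame, in bytes."""
    return bytes_per_quad() * quads


def bytes_per_second(quads=ASSUMED_QUADS_PER_FRAME,
                     fps=ASSUMED_FRAMES_PER_SECOND):
    """Sustained transfer rate from host to graphics hardware, in bytes."""
    return bytes_per_frame(quads) * fps
```

Under these assumptions, one quadrilateral occupies 256 bytes, one frame requires 25.6 million bytes of primitive data, and the host must sustain 768 million bytes per second, before any illumination or clipping parameters are counted.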
Various techniques have been employed to improve the performance of geometry accelerators. These include pipelining, parallel processing, reducing redundancy and minimizing computations. For example, conventional graphics systems are known to distribute the vertex data to the geometry accelerators in a manner that results in a non-uniform loading of the geometry accelerators. This variability in geometry accelerator utilization results in periods of time when one or more geometry accelerators are not processing vertex data when they are capable of doing so. Since the throughput of the graphics system is dependent upon the efficiency of the geometry accelerators, this inefficient use of the processing capabilities decreases the efficiency of the graphics system. In response to this shortcoming in the prior art, a solution was developed for distributing "chunks" of data to a parallel arrangement of geometry accelerators.
Another known way of improving the throughput of a geometry accelerator is to minimize the overall amount of data that must be processed by it. One way that this has been done is to minimize redundancy in the data being sent to the geometry accelerator. While these and other techniques are known for improving the performance of geometry accelerators, further improvements are desired.
For example, it has been found that during the execution of various state machines of a geometry accelerator, there are often periods where the execution of one or more states is delayed until the execution of another state machine has completed. As will be discussed below, a geometry accelerator is generally laid out in pipelined fashion. As listed above, the principal components of a geometry accelerator include a transform block or routine, a clipping routine, a lighting routine, a plane equation routine, etc. These components are often implemented as state machines, which execute in pipelined fashion.
It has been found that execution time is often lost during the period that the primitive data is being passed from the transform state machine to the next machine in the pipeline (e.g., the clipping state machine). Alternatively, primitive data may be passed directly from the transform state machine to the lighting state machine or the plane equation state machine. Although the clipping state machine may be functionally adjacent to the transform state machine in the pipeline, sometimes data need not be operated upon by the clipping state machine and may instead be routed around it, for example, if a graphic primitive is entirely off screen. If the primitive is entirely off the screen, then the next primitive in the pipeline can begin processing in the transform machine. Alternatively, if the primitive is entirely on the screen and unlit, then the plane equation machine may immediately be started. Regardless of which state machine the primitive data is passed to, it has been found that there is generally some loss in time, or states, and thus the geometry accelerator sacrifices efficiency.
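The routing choices described above may be sketched as a simple decision function. Two of the cases follow directly from the description: a primitive entirely off the screen is dropped so that the transform machine can begin the next primitive, and a primitive entirely on the screen and unlit goes straight to the plane equation machine. The remaining two cases (a partially visible primitive going to clipping, and a fully visible lit primitive going to lighting) are assumptions inferred from the pipeline order. The names Stage and stage_after_transform are hypothetical.

```python
from enum import Enum


class Stage(Enum):
    NEXT_PRIMITIVE = "transform (next primitive)"
    CLIPPING = "clipping"
    LIGHTING = "lighting"
    PLANE_EQUATIONS = "plane equations"


def stage_after_transform(entirely_off_screen, entirely_on_screen, lit):
    """Choose the state machine that follows the transform machine."""
    if entirely_off_screen:
        # Nothing to draw: the transform machine starts the next primitive.
        return Stage.NEXT_PRIMITIVE
    if not entirely_on_screen:
        # Assumption: a partially visible primitive must be clipped.
        return Stage.CLIPPING
    if lit:
        # Assumption: a fully visible, lit primitive skips clipping.
        return Stage.LIGHTING
    # Entirely on screen and unlit: start the plane equation machine.
    return Stage.PLANE_EQUATIONS
```

The sketch models only the choice of destination. It does not model the states lost while the primitive data is handed from one state machine to the next, which is the inefficiency at issue.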
Accordingly, there is a desire to streamline the processing within a geometry accelerator to improve its efficiency. More specifically, there is a desire to structure a geometry accelerator to minimize the lost time, or states, between the pipelined operation of state machines.