1. Field of the Invention
The present invention relates to a system for accelerating polygon scan conversion using a polygon span interpolator with main memory Z buffering, and more particularly, to a scan converter having a polygon span interpolator which uses main memory Z buffering in order to reduce costs while maintaining processing efficiency. Processing efficiency is maintained by implementing the polygon span interpolator as special free-running hardware responsive to special instructions from the CPU of the system.
2. Description of the Prior Art
Over the last several years, increasingly fast and less expensive computer graphics workstations have become available. Not surprisingly, this trend has contributed to a fast-growing market for workstations and related equipment. Designers of such workstations have focused mostly on improving the price-performance ratios of the workstation's central processing units (CPUs) for general-purpose applications. However, other research efforts have been made to increase the graphics capabilities of such workstations by adding special-purpose graphics hardware. For example, Swanson et al. describe a VLSI device for accelerating the rendering of polygons in a graphics subsystem in an article entitled "A Fast Shaded-Polygon Renderer", Computer Graphics, Vol. 20, No. 4, August 1986, pp. 95-101. Torborg has explained in an article entitled "A Parallel Processor Architecture for Graphics Arithmetic Operations", Computer Graphics, Vol. 21, No. 4, July 1987, pp. 197-204, how parallel graphics arithmetic processors could be added to a workstation to enhance performance of the geometry portion of the graphics pipeline. Akeley et al. described in an article entitled "High-Performance Polygon Rendering", Computer Graphics, Vol. 22, No. 4, August 1988, pp. 239-246, a special graphics accelerator which can draw 100,000 quadrilaterals per second. Deering et al. described in an article entitled "The Triangle Processor and Normal Vector Shader: A VLSI System For High Performance Graphics", Computer Graphics, Vol. 22, No. 4, August 1988, pp. 21-30, a design for accelerating polygon rasterization with Phong shading. In addition, A. C. Barkans described in an article entitled "High Speed High Quality Antialiased Vector Generation", Computer Graphics, Vol. 24, No. 4, August 1990, pp. 319-326, a VLSI device which speeds the drawing of anti-aliased lines.
Typically, the above-described devices are attached to the host CPU to provide a level of performance which could not be achieved by using the CPU alone to do the graphics calculations. However, while such hardware add-ons have the advantage that almost any host CPU can support them, they significantly increase the price of the graphics system. It is desirable that such performance enhancements be made possible without the associated cost increases.
A more general-purpose approach has been demonstrated by Apgar et al. in an article entitled "A Display System for the Stellar™ Graphic Supercomputer Model GS1000™", Computer Graphics, Vol. 22, No. 4, August 1988, pp. 255-262, and with respect to the HP/Apollo DN100000VS computer, in which graphics needs were considered in the overall design. These systems, sometimes called "super workstations", were crafted with an understanding of both the computational and the graphics needs of high-end users. For example, multiple central processing unit capabilities were included in both workstations. In the case of the Stellar™ system, data movement and internal rendering capability are key to its performance. The HP/Apollo system, on the other hand, added both floating-point capability (based on an analysis of graphics geometry code) and advanced rendering capabilities such as texture mapping and quadratic interpolation. Both super workstations deliver graphics performance exceeding 100,000 polygons/second, but such systems are only available in the price range of high-end users. Less expensive alternatives are desired.
While the above-described systems solve specific application problems, they do not address the broad range of problems faced by most workstation users. A more general method is to approach the CPU design based on scalable graphics performance. One way to scale performance is to partition the graphics pipeline between what is implemented in the host CPU and what is implemented in specialized graphics hardware. Such partitioning is not the same for all applications. Specialized hardware can, in general, perform a function more efficiently than a general-purpose CPU and thereby off-load some of the processing from the CPU, although at added expense. A tradeoff is therefore needed for each specific application.
FIG. 1 illustrates a simplified graphics pipeline 100 and the implementation of key components. As shown, user data is stored in the workstation's virtual memory space as application database 102, typically in the form of a display list such as a hierarchical database consisting of basic graphics primitives such as lines and polygons. The first step in the rendering (drawing) process is to take user-specified viewing coordinates and apply those to the graphics primitives in geometry processor 104, where the image can be moved, zoomed, rotated, and clipped as desired. Scan converter 106 then takes the transformed and clipped vertices and converts them into screen-coordinate pixels for display. For example, in the case of lines, a series of pixels is drawn between the two end points. The graphics image is then stored in a special purpose memory such as an image buffer or frame buffer 108. Finally, the image stored in the image buffer 108 is displayed on display device 110, which is refreshed with a video stream from the image buffer 108. Typically, a color look-up table or color map is also placed between the image buffer 108 and the display device 110 for converting the frame buffer values into the colors to be displayed.
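The scan conversion of a line mentioned above can be illustrated by a minimal sketch. It assumes a simple digital differential analyzer (DDA) and integer end points; the function name `scan_convert_line` is hypothetical and is not taken from any system described herein.

```python
import math


def scan_convert_line(x0, y0, x1, y1):
    """Return the pixels approximating the segment from (x0, y0) to (x1, y1).

    Illustrative DDA only: one pixel is produced per step along the
    longer axis, and each coordinate is rounded to the nearest integer.
    """
    steps = max(abs(x1 - x0), abs(y1 - y0))
    if steps == 0:
        return [(x0, y0)]  # degenerate segment: a single pixel
    dx = (x1 - x0) / steps  # per-step increment in x
    dy = (y1 - y0) / steps  # per-step increment in y
    pixels = []
    for i in range(steps + 1):
        px = int(math.floor(x0 + i * dx + 0.5))
        py = int(math.floor(y0 + i * dy + 0.5))
        pixels.append((px, py))
    return pixels
```

In this sketch the returned list always begins and ends at the two end points, with one pixel per unit step along the longer axis.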
As will be apparent from the following description, the present invention is particularly directed to improvements in the scan converter 106 of the graphics pipeline 100 which allow for an optimized price-performance ratio.
Generally, existing scan conversion systems are of two types: hardware and software. For example, FIG. 2 illustrates a prior art scan conversion system 200 comprised of dedicated hardware. Such a scan converter takes the processed data from geometry processor 104 and separately interpolates the Z coordinate (depth) values and the color values for each point to be displayed on the display screen. In particular, the Z coordinate data is interpolated by Z interpolator 202 and cached in Z cache 204 for storage in dedicated Z buffer 206, which is typically a DRAM. On the other hand, the color data is interpolated in color interpolator 208 and cached in pixel cache 210 for storage in frame or image buffer 212 for subsequent display. In accordance with conventional techniques, the Z coordinate value corresponding to an input pixel is compared to the corresponding Z buffer value in order to determine whether the current pixel is closer to the viewer and is to be shown. If the current pixel is closer to the viewer, the new Z coordinate value is stored in the Z buffer 206 while the corresponding color data is passed along to the frame buffer 212 for display.
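The conventional Z compare described above may be sketched as follows. This is an illustration only, assuming that a smaller Z value means closer to the viewer and that both buffers are plain two-dimensional lists; the name `z_buffer_write` is hypothetical.

```python
def z_buffer_write(z_buffer, frame_buffer, x, y, z, color):
    """Write one pixel if it is closer to the viewer than the stored one.

    Assumption: smaller z means closer. Returns True when the pixel
    was written and False when it was hidden.
    """
    if z < z_buffer[y][x]:
        z_buffer[y][x] = z          # record the new nearest depth
        frame_buffer[y][x] = color  # pass the color on for display
        return True
    return False


# Example setup: a 2 x 2 screen whose Z buffer starts at "infinitely far".
width, height = 2, 2
example_z = [[float("inf")] * width for _ in range(height)]
example_frame = [[0] * width for _ in range(height)]
```

A pixel that fails the compare changes neither buffer, so hidden surfaces never reach the frame buffer.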
Although such a hardware scan conversion system operates quite efficiently, it does so at added expense, since the components of the scan converter 200 significantly increase the cost of the computer graphics system. In addition, although such a dedicated Z buffer 206 typically eliminates the need to conduct the expensive, slow and complicated process of virtual direct memory access (VDMA) for Z buffering, the scan converter 200 must instead anticipate where data is coming from since it has no predictable virtual direct memory access path. The resulting system is thus expensive, somewhat complicated and without the flexibility of virtual direct memory access systems.
FIG. 3 illustrates a prior art software scan conversion system 300. As shown, a special purpose processing chip may be programmed to perform the scan conversion function. For example, the illustrated Intel i860 processor 302 is a graphics unit which may be programmed to process 3D graphics drawing algorithms as well as operations such as pixel shading and hidden surface elimination using a Z buffer. During operation, i860 processor 302 processes the data from the application database (main memory 304) to perform geometric transformations, color interpolation and Z buffer adds and interpolation. The i860 processor 302 also uses scan conversion software responsive to the CPU clock states to perform the interpolations and the like. In particular, interpolation is performed under software control during respective clock states of i860 processor 302, thereby preventing the CPU from performing other calculations. As shown, Z buffer 306 is incorporated into main memory 304. The scan converted data is then passed to frame buffer 308 for display.
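The per-pixel work that such software performs along one polygon span may be sketched as follows. This is an illustration only and is not i860 code: it assumes linear interpolation of a single Z value and a single color value across the span, a smaller Z meaning closer to the viewer, and buffers held as ordinary lists in main memory. The name `render_span` and its parameters are hypothetical.

```python
def render_span(z_buffer, frame_buffer, y, x_start, x_end,
                z_start, z_end, c_start, c_end):
    """Interpolate Z and color across one span and Z-test each pixel.

    Every add, compare and store below is executed by the processor
    itself, which is why it cannot do other work in the meantime.
    Returns the number of pixels actually written.
    """
    n = x_end - x_start
    dz = (z_end - z_start) / n if n else 0.0  # Z increment per pixel
    dc = (c_end - c_start) / n if n else 0.0  # color increment per pixel
    z, c = float(z_start), float(c_start)
    written = 0
    for x in range(x_start, x_end + 1):
        if z < z_buffer[y][x]:       # Z compare against main memory
            z_buffer[y][x] = z
            frame_buffer[y][x] = c
            written += 1
        z += dz                      # incremental interpolation
        c += dc
    return written
```

The loop makes the cost visible: each pixel requires two adds, one memory read, one compare and up to two memory writes, all performed under software control.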
Although the system of FIG. 3 minimizes the use of dedicated hardware and is thus relatively inexpensive, the performance of such a software scan conversion system is limited because the CPU must perform all processing. It is desired to provide similar functionality without tying up the graphics processing unit to perform interpolation, VDMA and Z buffer compares.
Accordingly, an improved scan conversion technique is desired in which the processing speed and accuracy of hardware scan converters can be achieved while retaining the cost effectiveness of a software scan conversion system. The present invention has been designed for this purpose.