In the implementation of graphics display systems for digital computers, it is sometimes desirable to have dedicated hardware support for geometry calculations in addition to the more common support for triangle setup and rasterization. Because graphics display systems often involve the display of objects based on three-dimensional data describing the objects, the geometry calculations involve, among other things, transforming locations of objects expressed in three-dimensional world coordinates into locations expressed in two-dimensional coordinates as the objects appear on the display. For some applications and configurations of graphics systems, the processing capability of the geometry accelerator becomes critically important. In the simplest case, geometry computations are accomplished one coordinate at a time, one vertex at a time, one triangle at a time, one triangle strip at a time.
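By way of illustration only, the simplest case described above can be sketched in Python as follows. The function names, the row-major 4x4 matrix layout, the perspective divide, and the viewport mapping are assumptions made for this sketch and are not taken from any particular system; the sketch merely shows a world-to-display transformation performed one coordinate at a time and one vertex at a time.

```python
def transform_vertex(matrix, vertex):
    """Apply a 4x4 row-major matrix (assumed layout) to the point (x, y, z, 1)."""
    x, y, z = vertex
    clip = []
    for row in matrix:  # one output coordinate at a time
        clip.append(row[0] * x + row[1] * y + row[2] * z + row[3])
    return clip


def project_to_screen(clip, width, height):
    """Perspective-divide a clip-space point and map it to pixel coordinates."""
    cx, cy, _cz, cw = clip
    ndc_x = cx / cw  # normalized device coordinates in [-1, 1]
    ndc_y = cy / cw
    screen_x = (ndc_x + 1.0) * 0.5 * width
    screen_y = (1.0 - ndc_y) * 0.5 * height  # y grows downward on the display
    return (screen_x, screen_y)


def transform_strip(matrix, strip, width, height):
    """Transform every vertex of one triangle strip, one vertex at a time."""
    return [project_to_screen(transform_vertex(matrix, v), width, height)
            for v in strip]


# Hypothetical perspective matrix for this sketch: w_clip = -z_world.
PERSPECTIVE = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
]

if __name__ == "__main__":
    strip = [(1.0, 0.0, -2.0), (0.0, 0.0, -1.0), (0.0, 1.0, -1.0)]
    print(transform_strip(PERSPECTIVE, strip, 640, 480))
```

Every output coordinate in this sketch is produced by a separate pass through the inner loop, which is the serial pattern that a geometry accelerator is intended to speed up.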
Data presented to a computer graphics subsystem are often expressed as strips of polygons (often triangles) in accordance with a graphics processing standard, such as the well-known OpenGL graphics library. Rendering a scene involves transforming the coordinates of all of the polygons in all of the strips and determining the pixel values in the display that are associated with each portion of each of the polygons that appears in the display. The large amount of data involved in these calculations, together with the conflicting goals of rendering both quickly and in detail, places heavy demands on computational resources.
Substantial opportunities exist for parallel computation by breaking up the triangle strips and presenting the resulting sub-strips to different computation engines in parallel. The REALITY ENGINE, distributed by Silicon Graphics, Inc. of Mountain View, Calif., and the GLZ family of graphics accelerators, distributed by INTENSE 3D of Huntsville, Ala., are examples of systems that employ this technique extensively. In these systems, once the strips are broken up, the sub-strips are passed to standard processor elements, where the rest of the computation takes place essentially one coordinate at a time, one vertex at a time. In the REALITY ENGINE, these computations are done with an i860 processor from Intel. In the GLZ family of graphics accelerators, these computations are done with DSP chips from Analog Devices of Norwood, Mass. In systems like these, some limited parallelism takes place in the coordinate transformations because the computation engines employed are pipelined math units with separate engines for integer and floating point calculations.
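The breaking up of a triangle strip into sub-strips may be sketched, again in Python and purely as an illustration, as follows. The names `split_strip` and `triangles` are hypothetical, and the sketch does not purport to describe how any of the systems named above actually partition their strips. It assumes the usual strip convention in which triangle i is formed from vertices i, i+1, and i+2, so that adjacent sub-strips must share two vertices. It further assumes an even number of triangles per sub-strip so that each sub-strip begins on a triangle of the same winding parity.

```python
def triangles(strip):
    """List the triangles of a strip: triangle i uses vertices i, i+1, i+2."""
    return [tuple(strip[i:i + 3]) for i in range(len(strip) - 2)]


def split_strip(strip, max_triangles):
    """Split one triangle strip into sub-strips of at most max_triangles each.

    Assumption of this sketch: max_triangles is a positive even number, so
    every sub-strip starts on an even-numbered triangle of the original strip
    and keeps that triangle's winding order.
    """
    if max_triangles < 2 or max_triangles % 2:
        raise ValueError("max_triangles must be a positive even number")
    n_triangles = len(strip) - 2
    sub_strips = []
    for first in range(0, n_triangles, max_triangles):
        last = min(first + max_triangles, n_triangles)
        # Triangles first..last-1 need vertices first..last+1, so adjacent
        # sub-strips overlap by two vertices.
        sub_strips.append(strip[first:last + 2])
    return sub_strips


if __name__ == "__main__":
    strip = list(range(8))  # vertex indices 0..7 form 6 triangles
    for sub in split_strip(strip, 2):
        print(sub, triangles(sub))
```

Each resulting sub-strip is self-contained and could be handed to a separate computation engine, at the cost of transforming the two shared vertices at each boundary twice.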
In U.S. Pat. No. 5,745,125, assigned to Sun Microsystems, separate specialized computation engines are arranged in series to form a deeper pipeline than would normally occur.
It is a known goal in computer design to employ very long instruction words (VLIW) for achieving increased parallelism in computation. To make it practical to program such computers, high-level programming languages are devised that employ instructions utilizing a register-to-register type of instruction set. The effect of a successful VLIW machine is to launch and complete a great many instructions on each clock cycle, so the register-to-register instruction set requires a register file with many read ports and many write ports. For example, U.S. Pat. No. 5,644,780, assigned to International Business Machines, describes a register file for VLIW with 8 write ports and 12 read ports. The result is a VLIW computation engine capable of high levels of parallelism, but one that requires many registers and can be built only at great cost.