Referring to FIGS. 3 and 4, in a SIMD (Single Instruction Multiple Data) machine, several processors (e.g., P0-P3) execute the same instruction while operating on different data. The instruction and data are fetched from a memory 1 and input to the processors via a memory controller 2, a shared bus or crossbar switch arrangement 3, and an instruction data dispatcher/gatherer 4. The processors then operate in lockstep and output their results to target registers or to the memory 1, depending on the instruction. A processor may be removed from a computation by setting a mask bit for that processor. Typically, each processor contains its own set of registers. The processors may be realized as separate devices, or may be incorporated into a single chip.
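By way of illustration only, the lockstep execution and mask bit behavior described above can be modeled in a few lines of Python. This is a minimal software sketch, not the hardware of FIGS. 3 and 4: the name `simd_step`, the use of one list element per processor, and the convention that a masked processor simply retains its previous register value are all assumptions made for the example.

```python
from typing import Callable, List


def simd_step(op: Callable[[float], float],
              regs: List[float],
              masked: List[bool]) -> List[float]:
    """Apply one instruction to every lane in lockstep.

    regs holds one register value per processor (P0-P3 here).
    masked[i] is True when the mask bit for processor i is set, i.e. the
    processor is removed from the computation and (by assumption) keeps
    its old register value.
    """
    return [r if is_masked else op(r) for r, is_masked in zip(regs, masked)]


# Same instruction (multiply by two), different data per processor.
regs = [1.0, 2.0, 3.0, 4.0]
masked = [False, False, True, False]  # mask bit set for P2 only

result = simd_step(lambda v: v * 2.0, regs, masked)
print(result)  # -> [2.0, 4.0, 3.0, 8.0]
```

In this model P2 holds its value of 3.0 while the other three processors execute the instruction on their own data.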
A SIMD machine may implement the geometry pipeline of a graphics system. In this implementation, the vertices (on which the geometric processing is to be performed) are partitioned into groups, each of which corresponds to a processor of the SIMD machine. The memory controller 2 loads vertex data (x, y, z, w coordinates and Nx, Ny, Nz normal coordinates) from the memory 1 to the assigned processor (P0-P3), which performs the geometric processing (e.g., coordinate transformation, lighting, clipping, perspective projection) on the vertex data. Typically, a reciprocal 1/x operation is used for perspective projection of a vertex, for texture mapping calculations, and for color and other parameter slope calculations such as texture coordinates, alpha (transparency), and depth. A reciprocal square root 1/sqrt(x) operation is used for normalization of the vertex normal, light vectors, etc.
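The roles of the two operations in vertex processing can be sketched as follows. The function names `perspective_divide` and `normalize` are hypothetical, and the sketch uses ordinary floating-point division and `math.sqrt` in place of any dedicated hardware instruction; it is intended only to show where a single 1/x or 1/sqrt(x) result is reused across the components of a vertex.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def perspective_divide(x: float, y: float, z: float, w: float) -> Vec3:
    """Perspective projection: one reciprocal 1/w, then three multiplies."""
    inv_w = 1.0 / w
    return (x * inv_w, y * inv_w, z * inv_w)


def normalize(nx: float, ny: float, nz: float) -> Vec3:
    """Normalization: one reciprocal square root, then three multiplies."""
    inv_len = 1.0 / math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx * inv_len, ny * inv_len, nz * inv_len)


projected = perspective_divide(2.0, 4.0, 6.0, 2.0)  # divides each coordinate by w = 2
unit_normal = normalize(3.0, 0.0, 4.0)              # vector of length 5 scaled to length 1
```

Each function performs the expensive operation once per vertex and replaces the remaining divisions with multiplications, which is why the cost of 1/x and 1/sqrt(x) themselves dominates.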
The 1/x and 1/sqrt(x) operations are each broken into two phases, the seed (i.e., initial estimate) phase and the refinement phase, which are referred to as the seed instruction and the refinement instruction, respectively. The refinement phase may have several refinement instructions, depending on the desired accuracy of the result. The processing sequence for 1/x may be as follows:
    recip_seed    x, target
    refine_recip  seed, target
    refine_recip  seed, target
    refine_recip  seed, target
The above sequence may be preceded and followed by other conventional instructions.
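A seed instruction followed by three refinement instructions can be modeled in software as shown below. The text does not specify how the seed is formed or which refinement formula is used, so both are assumptions here: the seed is a linear estimate computed from the mantissa of x, and each refinement is one Newton-Raphson step y' = y(2 - xy), which roughly squares the relative error of the estimate. The Python function names merely mirror the instruction mnemonics, and the sketch handles positive x only.

```python
import math


def recip_seed(x: float) -> float:
    """Initial estimate of 1/x for x > 0 (assumed linear approximation).

    x is split as m * 2**e with 0.5 <= m < 1; 1/m is approximated by
    48/17 - (32/17) * m, which is within about 6% of the true value.
    """
    m, e = math.frexp(x)
    return math.ldexp(48.0 / 17.0 - (32.0 / 17.0) * m, -e)


def refine_recip(x: float, y: float) -> float:
    """One Newton-Raphson refinement step for 1/x (an assumed formula)."""
    return y * (2.0 - x * y)


def reciprocal(x: float, refinements: int = 3) -> float:
    """Seed once, then refine the estimate the requested number of times."""
    y = recip_seed(x)
    for _ in range(refinements):
        y = refine_recip(x, y)
    return y


approx = reciprocal(3.0)
```

Under these assumptions, using fewer refinement steps trades accuracy for speed, which corresponds to the statement that the number of refinement instructions depends on the desired accuracy of the result.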
The operations 1/x and 1/sqrt(x) occur very frequently in the geometry processing part of the graphics pipeline, and their execution consumes a considerable amount of processing time and resources.
As indicated above, the 1/x operation can be used for perspective projection of geometric objects, and the 1/sqrt(x) operation can be used to normalize vectors that are used in performing lighting calculations. These operations are executed at least once per vertex and, if no special techniques are used to accelerate them, collectively consume about 20% of the overall vertex processing time. If perspective-correct texture mapping is also performed, this figure may rise to 30% of the overall processing time. Thus, it is important to execute the 1/x and 1/sqrt(x) operations efficiently in order to increase the vertex processing rate in the geometry pipeline.
It is known in the art to employ ROM-based lookup tables to speed up the 1/x and 1/sqrt(x) calculations. However, this technique is not efficient for SIMD architectures, due to the large size of each table.
In U.S. Pat. No. 5,457,779, Harvell discloses a conventional SIMD machine used for a computer display system, in particular a four processor or geometry engine embodiment. Col. 3 of this patent makes reference to the above-described 1/x computation.
It can be appreciated that it would be desirable to provide an efficient and fast technique for executing the 1/x and 1/sqrt(x) operations so as to decrease the total processing time required to render complex geometric objects in a computer display system. This invention addresses this long-felt need.