I. Field of the Invention
The present invention relates generally to data processing and more specifically to an improved apparatus for performing ternary operations of the type A×B+C in a floating point unit.
II. Description of the Prior Art
The processing of floating point computations is important to modern computer operation. Experience shows that general purpose processors are not well suited to performing floating point computations, and as a result, specialized floating point units (FPUs) or processors have been developed to handle numerically intensive computations.
The potential users of floating point hardware span the range from desktop microcomputers, through signal processing and parallel processing systems to the largest mainframes.
A floating point unit may be required to perform various mathematical operations on floating point numbers, such as addition, subtraction, multiplication and division. Some floating point hardware also provides built-in features to support other mathematical operations, such as the computation of transcendental functions.
Since it is always useful to maximize the speed with which a floating point processor performs its functions, one known technique used to obtain performance gains is to provide specialized hardware to implement specific floating point functions. For example, certain combinations of arithmetic functions occur regularly in computations. The present invention is directed to an apparatus for use in a floating point processor optimized for the computation of expressions of the form A×B+C. Various important mathematical computations are of this type, for example dot products, which sum the pairwise products of corresponding vector elements, and Horner's Rule evaluations, where: Ax³ + Bx² + Cx + D = D + x(C + x(B + Ax)).
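As an illustration only, the following Python sketch shows how both a dot product and a Horner's Rule evaluation reduce to a chain of operations of the form A×B+C. The function names `multiply_add`, `dot` and `horner` are hypothetical and are not part of the disclosure; `multiply_add` simply stands in for the hardware operation and uses ordinary software arithmetic.

```python
def multiply_add(a, b, c):
    """One step of the form A*B+C (stand-in for the hardware operation)."""
    return a * b + c


def dot(xs, ys):
    """Dot product as a chain of multiply-adds: acc = x*y + acc."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc = multiply_add(x, y, acc)
    return acc


def horner(coeffs, x):
    """Evaluate a polynomial by Horner's Rule.

    coeffs are ordered from the highest power down, e.g. [A, B, C, D]
    for A*x**3 + B*x**2 + C*x + D.
    """
    acc = 0.0
    for coef in coeffs:
        acc = multiply_add(acc, x, coef)  # acc = acc*x + coef
    return acc


print(dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # 32.0
print(horner([2.0, 3.0, 4.0, 5.0], 2.0))      # 41.0
```

Each loop iteration in either function is exactly one A×B+C step, which is why a unit optimized for that expression benefits both kinds of computation.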
Many floating point hardware units are implemented using VLSI (Very Large Scale Integration), where the designer must often weigh the amount of chip space taken by specific functions against the speed of the FPU. Traditional FPU designs have used separate multiply and add hardware units, together with a method for connecting the two units when the frequent multiply-add (A×B+C) operation is required. Fast multiplication requires a fast adder in its final stage, as shown in "A suggestion for a fast multiplier", by C. S. Wallace, IEEE Transactions on Computers, EC-13, Feb. 1964, pp. 14-17.
For a high performance design, hardware to perform A×B+C requires:
2 fast adders (1 for the multiplication operation and 1 for addition)
2 round devices (1 for the multiplication operation and 1 for addition)
4 input ports (2 for the multiplication operation and 2 for addition)
2 output ports (1 for the multiplication operation and 1 for addition)
2 instructions (1 for the multiplication operation and 1 for addition)
This invention reduces the number of required elements by merging the multiplication operation and the addition operation.
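The arithmetic consequence of using two round devices rather than one can be sketched in Python. This is an illustration under stated assumptions, not a description of the claimed apparatus: it assumes that Python floats are IEEE 754 double precision, and that a merged unit would round only once, after forming the exact value of A×B+C. The names `separate_multiply_add` and `merged_multiply_add` are hypothetical, and the single rounding is emulated in software with exact rational arithmetic.

```python
from fractions import Fraction


def separate_multiply_add(a, b, c):
    """Separate units: the product is rounded, then the sum is rounded."""
    product = a * b      # first rounding
    return product + c   # second rounding


def merged_multiply_add(a, b, c):
    """Assumed merged behavior: exact A*B+C, then a single rounding."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # no rounding here
    return float(exact)                              # the only rounding


a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# The exact product is 1 - 2**-60, which rounds to 1.0 in double precision,
# so the separate path loses the small term before the addition.
print(separate_multiply_add(a, b, c))  # 0.0
print(merged_multiply_add(a, b, c) == -(2.0**-60))  # True
```

The inputs are chosen so that the intermediate rounding of the product discards exactly the information that the final sum depends on.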