Multipliers are used in many digital signal processing operations, such as correlations, convolution, filtering and frequency analysis, to perform multiplications. The most basic form of multiplication consists of forming the product of two positive binary numbers. This can be accomplished through the traditional technique of successive additions and shifts in which each addition is conditional on one of the multiplier bits. Thus, the multiplication process may be viewed as consisting of the following two steps:
1. Evaluation of partial products.
2. Accumulation of the shifted partial products.
It should be noted that the multiplication of two single bits is equivalent to a logical AND operation. Thus, evaluation of partial products consists of the logical ANDing of the multiplicand with each relevant multiplier bit. Each column of partial products must then be added and, if necessary, any carry values passed to the next column. There are a number of techniques that may be used to perform multiplication.
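The two steps above can be illustrated with a minimal software sketch of shift-and-add multiplication for non-negative operands. The function name `shift_add_multiply` and the `width` parameter are assumptions made for illustration only; they do not correspond to any element of the figures.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, width: int = 8) -> int:
    """Multiply two non-negative integers that each fit in `width` bits."""
    all_ones = (1 << width) - 1
    product = 0
    for i in range(width):
        bit = (multiplier >> i) & 1
        # Step 1: the partial product is the multiplicand ANDed with the
        # multiplier bit (replicated across every bit position).
        partial = multiplicand & (all_ones if bit else 0)
        # Step 2: accumulate the partial product, shifted to its column.
        product += partial << i
    return product
```

For example, `shift_add_multiply(13, 11)` forms eight partial products, of which three are non-zero, and sums them to 143.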
One such technique is the Radix-n Multiplication system. A Radix-2 system computes the partial products by observing one bit of the multiplier at a time. Higher radix multipliers can be designed to reduce the number of adders and hence the delay required to compute the partial sums. The best known method is Booth's algorithm, which is a Radix-4 multiplication scheme. Booth's algorithm, a signed binary multiplication technique, as modified by MacSorley, is widely used to design fast multipliers in computer hardware. Reference is made to FIG. 2 which describes the operations required by the modified-Booth algorithm as known in the art.
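As a rough illustration of Radix-4 recoding, the sketch below uses the textbook form of the modified-Booth algorithm, in which each overlapping group of multiplier bits X(j-1,j,j+1) is mapped to a digit in {-2, -1, 0, 1, 2}. FIG. 2 is not reproduced here, so this is the commonly published formulation rather than a transcription of the figure; the function names and the `width` parameter are hypothetical.

```python
def booth_radix4_digits(multiplier: int, width: int) -> list[int]:
    """Recode a two's-complement multiplier of even `width` into radix-4 digits."""
    if width % 2 != 0:
        raise ValueError("width must be even for radix-4 recoding")
    bits = multiplier & ((1 << width) - 1)
    digits = []
    x_prev = 0  # the bit to the right of the least significant bit is taken as 0
    for j in range(0, width, 2):
        x_j = (bits >> j) & 1
        x_next = (bits >> (j + 1)) & 1
        # Textbook modified-Booth digit: -2*X(j+1) + X(j) + X(j-1)
        digits.append(-2 * x_next + x_j + x_prev)
        x_prev = x_next
    return digits


def booth_multiply(multiplicand: int, multiplier: int, width: int) -> int:
    """Sum the width/2 partial products selected by the recoded digits."""
    digits = booth_radix4_digits(multiplier, width)
    return sum((d * multiplicand) << (2 * k) for k, d in enumerate(digits))
```

Only `width / 2` partial products are generated, which is the reduction in adder count that motivates the higher radix.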
A multiplier implemented in VLSI typically contains a linear array of modified-Booth encoders, and a quadratic array of partial product generators. Reference is now made to FIG. 1 which depicts a typical implementation of an array generally indicated as 1 in a modified-Booth multiplier. Modified-Booth array 1 includes a modified-Booth encoder 10 which receives and encodes the multiplier input information 23 and includes an array of partial product generators generally referred to at 17. Multiplier input information 23 entering modified-Booth encoder 10 includes multiplier bits 21. Multiplicand inputs Y0-Yn are applied to partial product generator array 17. After encoding multiplier input information 23, modified-Booth encoder 10 sends information to the linear array of partial product generators generally indicated as 17.
In a typical implementation of modified-Booth encoder array 1, multiplier bits 21, represented in FIG. 1 as X(j-1,j,j+1), get converted by modified-Booth encoder 10 to form "n" control signals 24, where "n" represents a number between 3 and 5. In a typical prior art design, there are n=3 control signals 24. Reference is now made to FIG. 4 which depicts the truth table for modified-Booth encoder 10 as known in the art. Control signals 24 are represented in FIG. 4 by column headings POS (Positive), TWO and ONE.
Partial product generators 17 decode the information received from modified-Booth encoder 10 and multiplicand inputs 170 to generate the appropriate partial product bits. Partial product generators 17 represent only a part of the entire partial product generator array of the multiplier. Partial product generators 17 as known in the prior art and as depicted in FIG. 1 include a first partial product generator 11, a second partial product generator 12, a third partial product generator 13, a fourth partial product generator 14, a fifth partial product generator 15 and a sixth partial product generator 16. First partial product generator 11, second partial product generator 12 and third partial product generator 13 represent the first three partial product generators to the left of the modified-Booth encoder in the jth row of modified-Booth encoder. Fourth partial product generator 14, fifth partial product generator 15 and sixth partial product generator 16 represent the fourth, fifth and sixth partial product generators to the left of the modified-Booth encoder in the (j-2)nd row of modified-Booth encoder. Control signals 24, converted from multiplier bits 21, X(j-1,j,j+1), get applied to the corresponding jth row of partial product generators 17. Additionally, input Y0 is applied to first partial product generator 11, input Y1 is applied to second partial product generator 12, input Y2 is applied to third partial product generator 13 and fourth partial product generator 14, input Y3 is applied to fifth partial product generator 15 and input Y4 is applied to sixth partial product generator 16.
Subsequently, the output of the partial product generator in the first column of the jth row (represented as PP(0,j) 28 in FIG. 1) gets added to the output of the partial product generator in the third column of the (j-2)nd row (represented by PP(2,j-2) 27 in FIG. 1). Thus, referring again to FIG. 1, the output of first partial product generator 11, Y0 receiving partial product 28, gets added to the output of fourth partial product generator 14, Y2 receiving partial product 27, by the full adder 30. Additionally, the carry in output signal 29 ($C_{in}$) of modified-Booth encoder 10 is also used as a carry in to full adder 30 to get a resultant intermediate partial product PP 31 and an intermediate carry out $C_{out}$ 32. This entire operation is depicted inside box A of FIG. 1.
Reference is now made to FIG. 3 which is a digital circuit diagram representation of second partial product generator 12 as known in the prior art. Input $ZO_i$ is provided by first partial product generator 11 to second partial product generator 12. Output $ZI_i$ is provided by second partial product generator 12 to the next partial product generator in the sequence, in this case, third partial product generator 13.
In the prior art implementation of the modified-Booth algorithm, carry in output signal $C_{in}$ 29 of modified-Booth encoder 10 can be represented by the following equation as known in the art (see also FIG. 4):
$$C_{in} = X_{j+1} \cdot (X_j + X_{j-1})$$
Expressing $C_{in}$ in POS, ONE, TWO terms as illustrated in FIG. 4 results in the following equation:
$$C_{in} = \mathrm{POS} \cdot (\mathrm{ONE} + \mathrm{TWO}) \tag{1}$$
Y0 receiving partial product 28 in an inverted representation can be expressed in the prior art by the following equation:
$$PP_{j,0} = ((\mathrm{POS} \oplus Y0) \cdot \mathrm{ONE}) + (\mathrm{POS} \cdot \mathrm{TWO}) + (\mathrm{ONE} \cdot \mathrm{TWO}) \tag{2}$$
The output of full adder 30 is a summation of carry in output signal $C_{in}$ 29, Y0 receiving partial product PP(0,j) 28 and Y2 receiving partial product PP(2,j-2) 27. This summation results in intermediate partial product PP 31 and intermediate carry out $C_{out}$ 32. Intermediate partial product PP 31 can be represented by the following equation:
$$PP = PP_{j,0} \oplus PP_{j-2,2} \oplus C_{in} \tag{3}$$
Intermediate carry out $C_{out}$ 32 can be represented by the following equation:
$$C_{out} = (PP_{j,0} \oplus PP_{j-2,2}) \cdot C_{in} + \overline{(PP_{j,0} \oplus PP_{j-2,2})} \cdot PP_{j,0} \tag{4}$$
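The sum and carry of full adder 30 can be sketched in software as follows. The carry is written in multiplexer form: when the XOR of the two partial product bits is 1 the carry in is passed through, and when it is 0 the first partial product bit is passed through. The function and argument names are assumptions chosen for readability and are not reference designators from FIG. 1.

```python
from itertools import product


def full_adder(pp_j0: int, pp_j2: int, c_in: int) -> tuple[int, int]:
    """Return (sum, carry_out) for three single-bit inputs."""
    propagate = pp_j0 ^ pp_j2
    # Sum bit: XOR of all three inputs.
    pp = propagate ^ c_in
    # Carry out in multiplexer form: select c_in when the two partial
    # product bits differ, otherwise select pp_j0.
    c_out = (propagate & c_in) | ((propagate ^ 1) & pp_j0)
    return pp, c_out


def matches_binary_addition() -> bool:
    """Check all eight input combinations against ordinary addition."""
    return all(
        2 * full_adder(a, b, c)[1] + full_adder(a, b, c)[0] == a + b + c
        for a, b, c in product((0, 1), repeat=3)
    )
```

The exhaustive check confirms that the multiplexer form of the carry agrees with ordinary three-input binary addition for every input combination.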
By combining equations (1) and (2) it can be shown that the intermediate term for generation of intermediate partial product PP 31 is simplified as follows:
$$C_{in} \oplus PP_{j,0} = Y0 \cdot \mathrm{ONE} \tag{5}$$
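The simplification in equation (5) can be checked by enumeration. The sketch below takes equations (1) and (2) literally as printed; the signal polarities are defined by FIG. 4, which is not reproduced here. It additionally assumes that ONE and TWO are never asserted together, which is an assumption of this sketch rather than a statement from the text. All function names are hypothetical.

```python
from itertools import product


def carry_in(pos: int, one: int, two: int) -> int:
    """Equation (1) as printed: C_in = POS . (ONE + TWO)."""
    return pos & (one | two)


def pp_j0(pos: int, one: int, two: int, y0: int) -> int:
    """Equation (2) as printed."""
    return ((pos ^ y0) & one) | (pos & two) | (one & two)


def equation_5_holds() -> bool:
    """Check C_in XOR PP_j,0 == Y0 . ONE over all allowed input combinations."""
    for pos, one, two, y0 in product((0, 1), repeat=4):
        if one and two:
            continue  # assumed never to occur (ONE and TWO mutually exclusive)
        if carry_in(pos, one, two) ^ pp_j0(pos, one, two, y0) != (y0 & one):
            return False
    return True
```

Under the stated assumption the identity holds for all twelve remaining input combinations, which is why the XOR stage that combines the carry in with the Y0 partial product reduces to a single AND of Y0 and ONE.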
The critical path through modified-Booth encoder array 1 begins with multiplier bits 21 input into modified-Booth encoder 10, followed by "n" control signals 24 output by modified-Booth encoder 10, followed by Y0 receiving partial product 28 output from first partial product generator 11, followed by full adder 30, followed by the output of full adder 30, the output being in terms of intermediate partial product 31 or intermediate carry out 32. This critical path is the slowest path 20, as depicted in FIG. 1, in the process and hence is the path which must be modified in order to speed up the process. Referring once again to FIG. 1, it takes two gate delay stages to generate control signals 24 from modified-Booth encoder 10, another two gate delay stages to generate Y0 receiving partial product PP(0,j) 28 and four gate delay stages to get the intermediate partial product PP 31 or intermediate carry out $C_{out}$ 32 signals. Thus, it takes a total of eight gate delay stages to travel through slowest path 20. With the constant demand for faster speeds in digital signal processing operations, there is an appreciable need for a faster multiplier that can reduce the delay in critical path 20. Thus, it is desirable to provide an optimized modified-Booth multiplier array to improve the overall speed of the multiplier.