The present invention relates generally to binary multipliers and more specifically to an improved speed binary multiplier capable of multiplying signed and unsigned operands.
All modern fast binary multipliers utilize some variations of the basic partial product generation technique first applied by Seymour Cray and commonly referred to as "combinational", "paper and pencil", or "flow-through". In its most common form the technique simply involves consecutive multiplications of a K-digit long operand A (multiplicand) by a single digit B(m) of the M-digit long operand B (multiplier) and then shifting the resultant partial product P(m) to the left by the number of places equal to the position of the digit B(m) in the multiplier. In this particular case it is assumed that the number of places the partial product is to be shifted is directly equal to m. The shifting operation is, in fact, equivalent to the multiplication of the multiplicand by the weight of the decimal (or binary) digit B(m).
After all M partial products have been generated, they are consecutively summed to yield the (M+K) digit-long final product of A and B. This technique, used for decimal number multiplication, is also directly applicable to the binary multiplication of two numbers A and B, their respective binary widths being K and M. An example of the multiplication of two such 4-bit operands, A=0111=7 and B=0011=3, is given in Table 1.
TABLE 1 ______________________________________ "Paper-and-pencil" multiplication of two 4-bit operands. ______________________________________ ##STR1## ______________________________________
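The shift-and-add procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch; the function name `shift_add_multiply` and its parameters are assumptions introduced here and do not appear in the original text.

```python
def shift_add_multiply(a: int, b: int, m_bits: int) -> int:
    """Multiply unsigned a by the unsigned, m_bits-wide multiplier b."""
    product = 0
    for m in range(m_bits):
        digit = (b >> m) & 1       # B(m), the m-th bit of the multiplier
        partial = a * digit        # P(m) is either 0 or the multiplicand A
        product += partial << m    # shift left by m places, then accumulate
    return product


print(shift_add_multiply(0b0111, 0b0011, 4))  # 21
```

For the operands of Table 1 (A=7, B=3) the sketch sums two non-zero partial products, 7 and 14, to give 21.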
Multiplying signed numbers is more difficult. In two's complement notation, a number whose most significant bit is a zero is positive, whereas a number whose most significant bit is a 1 is negative. One way to perform multiplication of two's complement numbers is to convert the negative numbers to their positive binary representation, multiply the positive or unsigned versions, and attach the appropriate sign using the law of signs. If both operands have the same sign, the unsigned product is the final product, since it is positive. If exactly one of the operands is negative, the two's complement negation of the unsigned product must be performed.
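A minimal Python sketch of this convert, multiply and re-sign approach follows. The name `multiply_via_magnitudes` and the choice to return a 2*bits-wide bit pattern are assumptions made for illustration only.

```python
def multiply_via_magnitudes(a: int, b: int, bits: int) -> int:
    """Multiply two bits-wide two's complement patterns via their magnitudes.

    Returns the product as a 2*bits-wide two's complement bit pattern.
    """
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    a_negative = a >> (bits - 1)            # most significant bit is the sign
    b_negative = b >> (bits - 1)
    mag_a = (-a) & mask if a_negative else a   # two's complement negation
    mag_b = (-b) & mask if b_negative else b
    unsigned_product = mag_a * mag_b
    product_mask = (1 << (2 * bits)) - 1
    if a_negative != b_negative:            # law of signs: exactly one negative
        return (-unsigned_product) & product_mask
    return unsigned_product


# -7 x 3 = -21, shown as an 8-bit two's complement pattern
print(format(multiply_via_magnitudes(0b1001, 0b0011, 4), "08b"))  # 11101011
```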
An alternative to the conversion to an unsigned magnitude and reconversion of the final product is illustrated in Table 2.
TABLE 2 ______________________________________ Multiplication of Two's Complement Operands with Sign Extension. ______________________________________ ##STR2## ______________________________________
The first three partial products are performed with sign extension. The fourth partial product, which is the sign bit, is converted to a two's complement notation before addition with the other partial products. This is to correct for the negative sign bit in combination with the sign extension.
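The sign-extension approach can be sketched as follows for operands of arbitrary width. The operand values of Table 2 are not reproduced in this text, so the sketch does not claim to reproduce that table; the helper names `sign_extend` and `twos_complement_multiply` are hypothetical.

```python
def sign_extend(x: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend the from_bits-wide pattern x to to_bits."""
    if (x >> (from_bits - 1)) & 1:
        x |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return x


def twos_complement_multiply(a: int, b: int, bits: int) -> int:
    """Multiply two bits-wide two's complement patterns.

    Returns the product as a 2*bits-wide two's complement bit pattern.
    """
    width = 2 * bits
    mask = (1 << width) - 1
    a_ext = sign_extend(a & ((1 << bits) - 1), bits, width)
    total = 0
    for m in range(bits):
        if (b >> m) & 1:
            partial = (a_ext << m) & mask       # sign-extended partial product
            if m == bits - 1:
                # The multiplier sign bit carries negative weight, so this
                # partial product is two's complemented before addition.
                partial = (-partial) & mask
            total = (total + partial) & mask
    return total
```

With 4-bit operands, `twos_complement_multiply(0b1001, 0b1101, 4)` multiplies -7 by -3 and returns the 8-bit pattern for 21.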
As is apparent from Tables 1 and 2, besides some input and output reformatting of the operands and final product, the bulk of multiplication time, even in its simplest form, is consumed by the M-1 additions required to generate the sum of partial products. In fact, all the algorithmic speed improvements brought into the design of parallel multipliers have involved the reduction of the number of additions necessary to generate the final product, as well as acceleration of the necessary additions (application of "Carry-save" adders). The most common techniques used today employ algorithmic refinements of the basic concept described above; they are known as "Wallace Tree Partial Product Reduction" and "Modified Booth Algorithm".
Application of these two techniques combined leads to the potential reduction of the necessary number of partial product additions to one half the number of bits in the multiplier. Consequently, the amount of time necessary for the partial products to flow through the adder array is also cut in half. However, this is accomplished at the expense of using a relatively complex Booth decoder.
Booth algorithms, compared to the present invention, not only introduce extra delays caused by a more complex Booth decoder, but also increase circuit size because the sign extension must be propagated through the CSA (Carry Save Adder) array. This also leads to poorer time performance. For example, in Table 1, partial products 1, 2 and 3 would include three, two and one sign extending bits, respectively.
Thus, using the example of Table 1, the cost of the Booth multiplication increases generally quadratically with the number of partial products that must be formed, whereas the cost of the combinational multiplication of Table 1 varies linearly with the number of bits.
The original Booth algorithm and the modified Booth algorithm involve searching for and determining strings of zeros or ones in the multiplier and performing addition and subtraction for the different partial products depending upon a determination of the beginning, end or middle of the string.
In combinational multiplication, a relative 1-digit shift always occurs between the multiplicand and the partial sum, regardless of whether an addition has occurred or not. Booth's algorithm permits more than one shift at a time, depending on the grouping of ones and zeros in the multiplier. The multiplier is examined bit by bit, starting with the LSB, and the partial product is shifted relative to the multiplicand as each bit is examined. The multiplicand is subtracted from the partial product upon finding the first one in a string of ones. Similarly, upon finding the first zero in a string of zeros, the multiplicand is added to the partial product. No operation is performed when the bit examined is identical to the previous multiplier bit.
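The bit-by-bit rule of the original Booth algorithm can be sketched in Python as below. The function name `booth_multiply` is an assumption, and the sketch models only the add, subtract and no-operation decisions, not the multi-place shifting that hardware would exploit.

```python
def booth_multiply(a: int, b: int, bits: int) -> int:
    """Multiply signed a by signed b, where b fits in bits-wide two's complement."""
    product = 0
    previous = 0                      # implied bit to the right of the LSB
    for m in range(bits):
        current = (b >> m) & 1        # Python's >> yields two's complement bits
        if current == 1 and previous == 0:
            product -= a << m         # first one in a string of ones: subtract
        elif current == 0 and previous == 1:
            product += a << m         # first zero after a string of ones: add
        # identical bits: no operation, only the shift advances
        previous = current
    return product


print(booth_multiply(7, 3, 4))   # 21
print(booth_multiply(7, -3, 4))  # -21
```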
A modified version of Booth's algorithm is more commonly used. The difference between Booth's algorithm and the modified Booth's algorithm is as follows: the modified Booth always generates m/2 independent partial products, whereas the original Booth generates a varying (at most m/2) number of partial products, depending on the bit pattern of the multiplier. Of course, parallel hardware implementation lends itself only to a fixed number of independent partial products. The modified multiplier encoding scheme encodes 2-bit groups and produces five partial products for an 8-bit multiplier, the fifth partial product being a consequence of the fact that the algorithm only handles two's complement numbers.
The most conventional modified Booth scheme is to consider a bit-pair in each step, i.e., bit-pair recoding. The multiplier bits are divided into 2-bit pairs, and 3 bits (a triplet) are scanned at a time: two bits form the present pair, and the third bit (the overlap bit) is the high-order bit of the adjacent lower-order pair. After examining each triplet, the algorithm converts it into one of a set of five signed digits: 0, +1, +2, -1 and -2. According to the Boolean truth table shown in Table 3, each recoded digit performs only a simplified processing on the multiplicand, such as add, subtract, or shift.
TABLE 3
______________________________________
Truth table for the modified Booth algorithm with bit-pair recoding

Multiplier bit triplet        Recoded
2^1      2^0     2^-1         operand
b(m+1)   b(m)    b(m-1)       b(m)      Remark
______________________________________
0        0       0            0         no string
0        0       1            1         end of string
0        1       0            1         isolated 1
0        1       1            2         end of string
1        0       0            -2        begin of string
1        0       1            -1        end/begin of string
1        1       0            -1        begin of string
1        1       1            0         center of string
______________________________________
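The truth table of Table 3 maps directly onto a lookup, as the following Python sketch shows. The names `BOOTH_DIGIT`, `recode_bit_pairs` and `modified_booth_multiply` are assumptions for illustration, and the multiplier width is assumed to be even.

```python
# (b(m+1), b(m), b(m-1)) -> recoded signed digit, as listed in Table 3
BOOTH_DIGIT = {
    (0, 0, 0): 0, (0, 0, 1): 1, (0, 1, 0): 1, (0, 1, 1): 2,
    (1, 0, 0): -2, (1, 0, 1): -1, (1, 1, 0): -1, (1, 1, 1): 0,
}


def recode_bit_pairs(b: int, bits: int) -> list:
    """Recode a signed, bits-wide multiplier into bits/2 signed digits, LSB pair first."""
    digits = []
    for m in range(0, bits, 2):
        overlap = (b >> (m - 1)) & 1 if m > 0 else 0   # high bit of the lower pair
        low = (b >> m) & 1
        high = (b >> (m + 1)) & 1
        digits.append(BOOTH_DIGIT[(high, low, overlap)])
    return digits


def modified_booth_multiply(a: int, b: int, bits: int) -> int:
    """Sum one partial product per recoded digit, each weighted by 4**i."""
    return sum((d * a) << (2 * i) for i, d in enumerate(recode_bit_pairs(b, bits)))


print(recode_bit_pairs(0b0011, 4))          # [-1, 1]
print(modified_booth_multiply(7, 3, 4))     # 21
```

For the multiplier 0011 the recoded digits are -1 and +1, so the product is formed as -A + 4A with only two partial products.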
The application of the modified Booth algorithm to the example of Table 1 is shown in Table 4. As expected, the final product is the same.
TABLE 4 ______________________________________ Multiplication Using Modified Booth ______________________________________ ##STR3## ______________________________________
State-of-the-art multipliers, such as those employed in DSP (Digital Signal Processing) architectures, should also be capable of performing accumulation of the products, as well as be capable of operating on both unsigned integers and two's complemented binary words.
Thus it is an object of the present invention to provide a recoding scheme which is an improvement over the modified Booth algorithm.
Another object of the present invention is to provide a recoding scheme which is capable of handling signed and unsigned numbers without substantial pre-conditioning.
A still further object of the present invention is to provide a sign extension requiring less hardware.
A still further object of the present invention is to reduce the size of the adder array by selective pre-addition.
A still further object of the present invention is to provide a multiplier/accumulator with product sign extension.
Still an even further object of the present invention is to provide improved mechanisms for taking two's complements of multiplicands and multipliers as well as converting sign magnitude numbers to two's complements.
An even further object of the present invention is to provide an improved complex multiplier requiring fewer registers and multiplexers.
These and other objects are achieved by a recoding system wherein, for two-bit pairs, the set of signed digits is reduced from five to four, including zero. Special recoding and control of the Carry of the most significant bit allow the recoding scheme to accommodate negative two's complement multipliers. The recoding scheme can operate on two-bit, three-bit, four-bit groups, etc. For the three-bit scheme, only two additional signed digits are used, whereas for a four-bit recoding scheme four additional sign bits are used. The Carryout of the three-bit and four-bit recoding schemes is independent of Carryin.
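One way a two-bit recoding with only four signed digits and a propagated Carry could work is sketched below. The digit set {-1, 0, 1, 2}, the carry rule and the name `recode_with_carry` are assumptions made here for illustration; this passage does not list the actual digits, and the special handling of negative two's complement multipliers is not modeled.

```python
def recode_with_carry(b: int, bits: int):
    """Recode an unsigned, bits-wide multiplier into two-bit digits with carry.

    ASSUMPTION: digits are drawn from {-1, 0, 1, 2}; this digit set is
    hypothetical and not taken from the original text.
    Returns (digits, carry_out), least significant digit first.
    """
    digits = []
    carry = 0
    for m in range(0, bits, 2):
        value = ((b >> m) & 0b11) + carry   # pair value plus Carryin: 0..4
        if value >= 3:
            digits.append(value - 4)        # 3 -> -1, 4 -> 0, with Carryout
            carry = 1
        else:
            digits.append(value)            # 0, 1 or 2, no Carryout
            carry = 0
    return digits, carry


digits, carry_out = recode_with_carry(0b11011110, 8)
print(digits, carry_out)  # [2, -1, 2, -1] 1
```

The final Carryout has weight 4**(bits/2), which is why a recoding of this kind needs one additional partial product beyond the bits/2 recoded groups.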
The sign extension of the partial products is improved by using a single sign extension word for all the sign extensions. The sign extension word SEW is formed as a plurality of negative bits (1) beginning with the sign bit of the first negative partial product and extending the length of the multiplier, except for a positive sign bit (0) for a sign bit of subsequent negative partial products substituted for the corresponding negative bit of the sign extension word SEW. The single sign word SEW is produced by determining and collecting sign bits SE of the partial products as a sign word and two's complementing the sign word to produce the sign extension word SEW.
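The arithmetic behind a single sign extension word can be checked with a short Python sketch. It relies on the fact that a sign bit at position p contributes -2^p, so all sign contributions together equal the two's complement of the collected sign word. The function names and the tuple layout `(value, n_bits, shift)` are assumptions for illustration.

```python
def sign_extension_word(sign_positions, width: int) -> int:
    """Two's complement, in width bits, of the word holding a 1 at each sign position."""
    sign_word = sum(1 << p for p in sign_positions)
    return (-sign_word) & ((1 << width) - 1)


def sum_with_sew(partials, width: int) -> int:
    """Add partial products using one SEW instead of per-product sign extension.

    partials: list of (value, n_bits, shift), value being a signed integer
    that fits in n_bits-wide two's complement.
    """
    total = 0
    sign_positions = []
    for value, n_bits, shift in partials:
        field = value & ((1 << n_bits) - 1)
        sign = field >> (n_bits - 1)
        low_bits = field & ((1 << (n_bits - 1)) - 1)   # sign bit stripped
        total += low_bits << shift
        if sign:
            sign_positions.append(shift + n_bits - 1)
    sew = sign_extension_word(sign_positions, width)
    return (total + sew) & ((1 << width) - 1)


print(format(sign_extension_word([2, 4], 8), "08b"))  # 11101100
```

In the printed example the word is all ones from the first negative sign bit (position 2) upward, except for a zero at the sign position of the later negative partial product (position 4), which matches the description of the SEW above.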
Using the recoded multiplier requires an additional, or carry, partial product for the Carryout of the most significant recoded group. If this extra partial product is negative, the complementing includes adding a complementing Carry. Also, as previously discussed, the sign extension word is formed by a complementing operation, and therefore a complementing Carry must be added to it as well. To reduce the size of the array, the present scheme determines the position of the complementing Carry for the sign extension word and the occurrence of a complementing Carry for the additional partial product and pre-adds these two Carries to the multiplicand of one of the partial products prior to the array. This value is then held until needed. This operation is performed in parallel with the multiplier recoding means.
In multiplier/accumulators, the output of the adder array is a sum S and a Carry C of N bits, and the sign must be extended to the capacity of the accumulator register. The present multiplier/accumulator produces a product sign extension word PSEW as a function of the multiplicand and multiplier to extend the sum S and the Carry C to the length of the accumulator. The product sign extension word PSEW is produced in parallel with the partial products and the adder array. A final adder adds the sum S, the Carry C, the product sign extension word PSEW and the most significant bits of the accumulator register. The final adder includes merging capability for merging the sum S, the Carry C, the product sign extension word PSEW and the most significant bits into two merge words, and a simple adder for adding the two merge words. The product sign extension word PSEW is uniquely selected between two possible alternatives so as to merge with the sum S, the Carry C and the most significant bits of the accumulator register. The least significant bit of the product sign extension word PSEW can be a one or a zero.
The formation of a two's complement may be by adding a one to the one's complement as follows: ##EQU1## or by adding the complement of the first bit to the one's complement of the remaining bits with the first bit uncomplemented as follows: ##EQU2## By appropriately selecting which of the two complementing methods is used, the complement Carry, being either the one or the complement of the first bit A(0), will be provided in the same place in the adder array. The choice of method depends upon the position of the partial product produced by the recoded multiplier. If the recoded multiplier is two, the first method is used, and if the recoded multiplier is one, the second method is used.
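The two complementing methods can be compared with the following Python sketch. The equations themselves are not reproduced in this text, so the sketch rests on an assumption: in the second method the complement of the first bit is added at bit position 1, which is the reading under which both methods give the same result. The function names are hypothetical.

```python
def negate_method_one(a: int, bits: int) -> int:
    """Two's complement as the one's complement plus one."""
    mask = (1 << bits) - 1
    return ((~a & mask) + 1) & mask


def negate_method_two(a: int, bits: int) -> int:
    """Two's complement with the first bit left uncomplemented.

    ASSUMPTION: the complement of the first bit is added one place to the
    left (bit position 1); this placement is inferred, not quoted.
    """
    mask = (1 << bits) - 1
    a0 = a & 1
    upper = ~a & mask & ~1        # one's complement of all bits but the first
    word = upper | a0             # first bit passes through unchanged
    carry = (a0 ^ 1) << 1         # complement of the first bit, at position 1
    return (word + carry) & mask


print(negate_method_one(0b0110, 4), negate_method_two(0b0110, 4))  # 10 10
```

Under this reading, a recoded multiplier of two shifts the one's complement left by one place and needs its complementing one at bit position 1, while a recoded multiplier of one uses the second method and also receives its Carry at bit position 1, so a single Carry input position serves both cases.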
The ability to select the method of complementing allows the present system to handle signed magnitude numbers without any substantial processing. For a negative sign magnitude number, the absolute value is stored in the multiplicand register as a one's complement. Whether the Q or the complemented Q (Q-bar) output of the register is selected is a function of a negative or positive partial product and will only require providing an appropriate complementing Carry. This is the only processing needed. For a negative sign magnitude multiplier, the one's complement of the absolute value of the multiplier can be provided in the multiplier register and a one is added to the least significant bit during the recoding process.
The multiplier includes a complementor performing two's complements of the multiplicand by loading the multiplicand or the one's complement of the multiplicand into the multiplicand register, depending upon the input format of the multiplicand and multiplier and the output format of the product. The complementor forms the two's complement by adding a complementing carry if the multiplicand was loaded as a one's complement in the multiplicand register and the recoded multiplier group is positive. The complementor also forms the two's complement by adding a complementing carry if the multiplicand was loaded uncomplemented into the multiplicand register and the recoded multiplier group is negative. The storing of the one's complement of the negative multiplicand into the multiplicand register allows this reversal to take place.
The ability to perform the two's complement of a number for positive or negative partial products reduces the hardware needed in a complex multiplier for multiplying complex numbers (A+jB) and (C+jD). The structure requires four registers with four partial product multiplexers with preadders and a pair of adder arrays.
Other objects, advantages and novel features of the present invention will become apparent from the following detailed description of the invention when considered in conjunction with the accompanying drawings.