1. Field of the Invention
The present invention relates to computer arithmetic, and more particularly to parallel multipliers.
2. Description of Related Art
Parallel multipliers are widely used in systems such as arithmetic logic units (ALUs) of high-performance computers. Parallel multipliers are often configured with a parallelogram arrangement of logic cells. A multiplicand and multiplier are applied to rows of the logic cells, and partial products are summed to provide the final product.
The modified Booth's algorithm (or bit-pair recoding) examines each multiplier bit-pair Y_{i+1} and Y_i along with the previous multiplier bit Y_{i-1} (the bit on its right) and recodes the bit-pair as a single value. Each recoded bit-pair specifies one of five different operations on the multiplicand (0, +1, +2, -1 or -2 times the multiplicand), and the partial products are based on the recoded bit-pairs rather than individual bits of the multiplier. As a result, the number of partial products is reduced by one-half, which provides for faster computation. The modified Booth's algorithm is summarized below in Table 1.
TABLE 1
MODIFIED BOOTH'S ALGORITHM
______________________________________________________________
Multiplier Bit-Pair     Multiplier Bit on Right     Operation on
Y_{i+1}     Y_i         Y_{i-1}                     Multiplicand (X)
______________________________________________________________
0           0           0                            0 × X
0           0           1                           +1 × X
0           1           0                           +1 × X
0           1           1                           +2 × X
1           0           0                           -2 × X
1           0           1                           -1 × X
1           1           0                           -1 × X
1           1           1                            0 × X
______________________________________________________________
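The recoding in Table 1 can be sketched in a few lines of Python. This is an illustrative model only, not circuitry from the prior art; the names `BOOTH_TABLE`, `booth_digits` and `booth_multiply` are hypothetical, and the bit to the right of the least significant multiplier bit is assumed to be zero.

```python
# Table 1 as a lookup: (bit-pair high, bit-pair low, bit on right) -> multiple of X.
BOOTH_TABLE = {
    (0, 0, 0): 0, (0, 0, 1): +1, (0, 1, 0): +1, (0, 1, 1): +2,
    (1, 0, 0): -2, (1, 0, 1): -1, (1, 1, 0): -1, (1, 1, 1): 0,
}

def booth_digits(y, n_bits):
    """Recode an n_bits-wide 2's complement multiplier into n_bits/2 digits."""
    assert n_bits % 2 == 0
    y &= (1 << n_bits) - 1            # keep only the raw n-bit pattern
    digits = []
    prev = 0                          # assumed zero to the right of bit 0
    for i in range(0, n_bits, 2):
        low = (y >> i) & 1
        high = (y >> (i + 1)) & 1
        digits.append(BOOTH_TABLE[(high, low, prev)])
        prev = high                   # high bit of this pair is the next pair's right bit
    return digits

def booth_multiply(x, y, n_bits):
    """Sum one partial product per recoded digit, each weighted by 4**k."""
    return sum(d * x * 4**k for k, d in enumerate(booth_digits(y, n_bits)))
```

For a 32-bit multiplier the sketch yields 16 digits, one per partial product, which is half the 32 partial products that bit-by-bit multiplication would need.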
A Wallace tree provides even faster computation by summing groups of partial products in parallel, but an irregular wiring structure is required. Therefore, a Wallace tree is desirable when performance is the main issue, but the increased wiring complexity is a drawback.
In parallel multipliers using a parallelogram arrangement or Wallace tree, the final step is usually adding the final sum bits and carry-out bits using a carry propagate adder such as a carry lookahead adder.
Signed binary numbers are usually represented in computers using one of four systems: sign-magnitude, 2's complement, 1's complement, and biased. Of these systems, 2's complement is the most popular due to the ease of implementing addition and subtraction.
In parallel multipliers for 2's complement numbers, the sign bit of each partial product is sign-extended to the left edge of the eventual product. Sign extension can be provided by any suitable hardware or software. For instance, positive numbers can be sign-extended by propagating zeros to the left end, and negative numbers can be sign-extended by propagating ones to the left end. The sign generate method for sign extension is especially useful since the sign bit is generated statically and need not propagate to the left edge of the eventual product. The sign generate method includes (1) complementing the sign bit of each partial product, (2) adding a one to the left of the sign bit position of each partial product, and (3) adding a one to the sign bit position of the first partial product.
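The equivalence between full sign extension and the three steps of the sign generate method can be checked with a small Python sketch. It is a behavioral model under stated assumptions, not a description of any particular circuit: each successive partial product is assumed to be left shifted by two bit positions, the product width is assumed to be the partial product width minus one plus twice the number of partial products (64 bits for sixteen 33-bit partial products), and the function names are hypothetical.

```python
SHIFT = 2  # assumed: each successive partial product is left shifted by two bits

def product_bits(pps, width):
    """Assumed product width, e.g. 33 - 1 + 2 * 16 = 64."""
    return width - 1 + SHIFT * len(pps)

def sum_sign_extended(pps, width):
    """Reference: add signed partial products with full sign extension."""
    mask = (1 << product_bits(pps, width)) - 1
    return sum(p << (SHIFT * k) for k, p in enumerate(pps)) & mask

def sum_sign_generate(pps, width):
    """Add raw width-bit rows using the sign generate method."""
    mask = (1 << product_bits(pps, width)) - 1
    sign_pos = width - 1
    total = 1 << sign_pos                   # (3) one at the sign bit of the first row
    for k, p in enumerate(pps):
        row = p & ((1 << width) - 1)        # raw pattern, no sign extension
        row ^= 1 << sign_pos                # (1) complement the sign bit
        row += 1 << width                   # (2) one to the left of the sign bit
        total += row << (SHIFT * k)
    return total & mask
```

In this model every row contributes a constant offset of 3 × 2^(width-1) at its own shift; the offsets sum to 2^(product width) minus 2^(width-1), so the single one added in step (3) makes the total offset a multiple of 2^(product width), which is discarded.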
Further details regarding the sign generate method are found in "Digital CMOS Circuit Design" by M. Annaratone, published by Kluwer Academic Publishers, 1986, pp. 211-229, which is incorporated by reference.
FIG. 1 is a schematic view showing parallel multiplication of a single multiplicand and a single multiplier using the modified Booth's algorithm and the sign generate method in accordance with the prior art. In this case, a 32-bit multiplicand X is multiplied by a 32-bit multiplier Y to provide a 64-bit product, and the numbers are represented in 2's complement form. Since the modified Booth's algorithm is used, 16 partial products PP1-PP16 are generated, each partial product occupies 33 bit positions, and each successive partial product is left shifted by two bit positions with respect to the previous partial product.
Each partial product is set to either zero, the multiplicand (with the sign bit propagated left by one bit position), the complement of the multiplicand (with the sign bit propagated left by one bit position), the multiplicand left shifted by one bit position (with the LSB filled with a zero), or the complement of the multiplicand left shifted by one bit position (with the LSB filled with a one) in accordance with the modified Booth's algorithm. In this manner, each partial product contains an "adjusted multiplicand." Furthermore, since negating a 2's complement number requires complementing the number and incrementing by one, an increment bit (I) is added to the LSB of each partial product in accordance with the modified Booth's algorithm. For each partial product, the associated increment bit (shown directly beneath the LSB) is set to a one when the complement of the multiplicand or the complement of the multiplicand left shifted by one bit position is selected, and is otherwise set to zero. Therefore, when the complement of the multiplicand is left shifted by one bit position, the combination of the one filled into the LSB position and the increment bit set to one at the LSB position is equivalent to adding a one to the bit position adjacent to the LSB position and filling the LSB position with a zero.
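The selection of the adjusted multiplicand and its increment bit can be modeled as follows. This is a minimal Python sketch for illustration; `adjusted_multiplicand` and `to_signed` are hypothetical names, the multiplicand is passed as a signed Python integer, and the recoded operation is passed as one of 0, +1, +2, -1, -2.

```python
def to_signed(pattern, bits):
    """Interpret the low `bits` bits of pattern as a 2's complement value."""
    sign = 1 << (bits - 1)
    return ((pattern & ((1 << bits) - 1)) ^ sign) - sign

def adjusted_multiplicand(x, digit, n_bits=32):
    """Return (row, increment) for one recoded operation on multiplicand x.

    row is the raw (n_bits + 1)-bit pattern; increment is the I bit.
    """
    row_mask = (1 << (n_bits + 1)) - 1
    if digit == 0:
        return 0, 0
    # X with its sign propagated one position, or X left shifted (LSB = 0).
    magnitude = x if abs(digit) == 1 else x << 1
    if digit > 0:
        return magnitude & row_mask, 0
    # Complement; for the shifted case the filled LSB becomes a one.
    return ~magnitude & row_mask, 1
```

For every operation, the signed value of the row plus the increment bit equals the selected multiple of the multiplicand. For the -2 case, the one in the LSB plus the increment bit carries into the next bit position, which matches the equivalence stated above.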
Since the sign generate method is used, the sign bit of each partial product is complemented (S), a one is added to the immediate left of each sign bit position, and a one is added to the sign bit position of the first partial product.
For instance, partial product PP1 occupies bit positions 0-32, receives selection signals generated by multiplier bit-pair Y_1 and Y_0 and a previous bit of zero using the modified Booth's algorithm, has an increment bit at bit position 0 set by multiplier bit-pair Y_1 and Y_0 and a previous bit of zero using the modified Booth's algorithm, and has a one added to bit position 33 which is adjacent to the complemented sign bit at bit position 32.
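The layout described for FIG. 1 can be checked end to end with a short behavioral model in Python. The sketch below is an assumption-laden illustration rather than the circuit of FIG. 1: `fig1_terms` and `fig1_product` are hypothetical names, operands are passed as signed Python integers, and all addends are simply summed modulo 2^64 instead of being routed through adders.

```python
N = 32                                   # operand width in FIG. 1
ROW_MASK = (1 << (N + 1)) - 1            # each partial product occupies 33 bits
MASK64 = (1 << 64) - 1
# Key bits are (bit-pair high, bit-pair low, bit on right), as in Table 1.
DIGITS = {0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
          0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0}

def fig1_terms(x, y):
    """Return every addend of the array: rows, increment bits, extra ones."""
    y_bits = (y & ((1 << N) - 1)) << 1   # place the previous bit of zero below bit 0
    terms = [1 << N]                     # one at the sign bit position (32) of PP1
    for k in range(N // 2):
        digit = DIGITS[(y_bits >> (2 * k)) & 0b111]
        magnitude = x * abs(digit)       # 0, X, or X left shifted by one
        row = (~magnitude if digit < 0 else magnitude) & ROW_MASK
        row ^= 1 << N                    # complemented sign bit (S)
        terms.append(row << (2 * k))     # row occupies bits 2k .. 2k+32
        terms.append(1 << (2 * k + N + 1))                # one left of the sign bit
        terms.append((1 if digit < 0 else 0) << (2 * k))  # increment bit (I)
    return terms

def fig1_product(x, y):
    """Sum all addends and keep the low 64 bits as the final product."""
    return sum(fig1_terms(x, y)) & MASK64
```

For example, `fig1_product(-7, 9)` returns the 64-bit 2's complement pattern of -63, and the first row produced by `fig1_terms` sits at bit positions 0-32 with its extra one at bit position 33.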
Thereafter, the partial products and the extra ones (from the sign generate method and the increment bits) are added to provide the final product. The partial products can be sequentially added using carry save adders configured as a parallelogram that adds partial products PP1 and PP2, adds the sum of partial products PP1 and PP2 to partial product PP3, etc. Alternatively, the partial products can be added using carry save adders configured as a Wallace tree that adds partial products PP1-PP4 to provide a first intermediate product, partial products PP5-PP8 to provide a second intermediate product, partial products PP9-PP12 to provide a third intermediate product, partial products PP13-PP16 to provide a fourth intermediate product, and adds the four intermediate products to provide the sum of the partial products. The sum of the partial products and the extra ones are added in a final row of carry save adders, and the output of the final row of carry save adders is applied to a carry propagate adder such as a carry lookahead adder that generates the final product.
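The two summation orders can be illustrated with a bitwise model of a carry save adder in Python. This is only a sketch of the arithmetic: the function names are hypothetical, rows are treated as 64-bit words, the grouping in `reduce_grouped` models the order of additions rather than timing or wiring, and ordinary integer addition stands in for the carry propagate adder.

```python
MASK64 = (1 << 64) - 1

def carry_save_add(a, b, c):
    """Compress three 64-bit words into a sum word and a carry word."""
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1   # carries move one position left
    return s & MASK64, carry & MASK64

def reduce_chain(rows):
    """Sequential order: fold each row into a running (sum, carry) pair."""
    s, c = 0, 0
    for row in rows:
        s, c = carry_save_add(s, c, row & MASK64)
    return s, c

def reduce_grouped(rows, group=4):
    """Grouped order: reduce groups of rows, then combine the intermediate results."""
    partial = []
    for i in range(0, len(rows), group):
        partial.extend(reduce_chain(rows[i:i + group]))
    return reduce_chain(partial)

def final_add(s, c):
    """Stand-in for the carry propagate adder that produces the final product."""
    return (s + c) & MASK64
```

Both orders leave a sum word and a carry word whose total equals the sum of the rows modulo 2^64, so the final carry propagate addition is the only step in which carries ripple across the full width.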
Data processing applications often require multiplying numbers with different bit lengths such as 32 bits (word), 16 bits (half word) and 8 bits (byte). A 32×32 bit parallel multiplier can accommodate 16 and 8 bit numbers merely by sign extending the numbers into 32 bit words. However, with this approach, the parallel multiplier generates only a single product, regardless of the bit length of the numbers.
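The sign-extension approach and its limitation can be sketched in Python. The model is illustrative only: `sign_extend`, `multiply_32x32` and `multiply_bytes` are hypothetical names, and `multiply_32x32` stands in for one pass through a 32×32 bit parallel multiplier.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def to_signed(pattern, bits):
    """Interpret the low `bits` bits of pattern as a 2's complement value."""
    sign = 1 << (bits - 1)
    return ((pattern & ((1 << bits) - 1)) ^ sign) - sign

def sign_extend(value, from_bits):
    """Sign-extend a from_bits-wide pattern into a 32-bit word."""
    return to_signed(value, from_bits) & MASK32

def multiply_32x32(a_word, b_word):
    """One pass of the modeled multiplier: two 32-bit words in, one 64-bit product out."""
    return (to_signed(a_word, 32) * to_signed(b_word, 32)) & MASK64

def multiply_bytes(a_bytes, b_bytes):
    """Each byte pair needs its own pass, since a pass yields a single product."""
    return [multiply_32x32(sign_extend(a, 8), sign_extend(b, 8))
            for a, b in zip(a_bytes, b_bytes)]
```

In this model, multiplying four pairs of bytes takes four passes through the multiplier, the same number that four pairs of 32-bit words would take.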
Accordingly, a need exists for an improved parallel multiplier that accommodates multiple numbers with different bit lengths in a more efficient manner.