Hardware implementations of numeric multiplication often resemble the way multiplication is done on paper. The basic process in such implementations can be broken down into three steps: (1) partial product generation; (2) partial product reduction; and (3) final addition. Texts on computer arithmetic commonly cover the general theory of multiplication. One such text is "Introduction to Arithmetic for Digital Systems" by Waser and Flynn, published by Holt, Rinehart and Winston. Applicants hereby refer to, and incorporate by reference, the contents of "Introduction to Arithmetic for Digital Systems."
In order to define general terms and to explain the methodology of the above-mentioned three multiplication steps, a simple example of hand multiplication will be described. Turning to FIG. 1, a multiplicand "3" is multiplied with a multiplier "2." The binary representations of the two numbers are "11" and "10," respectively.
The multiplicand and the multiplier are each two bits in length. The multiplicand and the multiplier each have a least significant bit position and a most significant bit position. The least significant bit positions of the multiplicand and the multiplier form a least significant bit column, denoted in FIG. 1 at 11. The least significant bit column includes the rightmost "1" of the multiplicand and the "0" from the multiplier. The most significant bit positions of the multiplicand and the multiplier form a most significant bit column, denoted in FIG. 1 at 13. The most significant bit column is one order of significance above the least significant bit column, and is hence denoted as the next-higher significant bit column.
Generally speaking, increasing higher significant bit columns ("higher columns") will be referred to as the first higher column, second higher column, etc., relative to a given column; and decreasing lower significant bit columns ("lower columns") will be referred to as the first lower column, second lower column, etc., relative to a given column. For example, relative to column 11 in FIG. 1, column 13 is the first higher column, and column 15 is the second higher column. Similarly, relative to column 15, column 13 is the first lower column, and column 11 is the second lower column. As a final definition, each column is defined as representing a level of significance, such that if each bit of a number is shifted to a first higher column, then the entire number is shifted left one level of significance.
With these definitions, the above-mentioned numeric multiplication process will be applied to the multiplicand and multiplier of FIG. 1. First, the "0" of the multiplier is multiplied with each bit of the multiplicand "11," yielding the first partial product "00." The second partial product is generated in a similar way by multiplying the "1" of the multiplier with the "11" of the multiplicand. Every bit of the resulting partial product "11" is shifted to a first higher column, relative to the columns of the first partial product. That is, the second partial product is shifted left one level of significance. The two partial products are then added to get the final sum of "110."
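The partial product generation and final addition just described can be sketched in a few lines of Python. This is only an illustrative software model, not a description of any hardware; the function names `partial_products` and `multiply` are hypothetical and do not appear in the figures.

```python
def partial_products(multiplicand: int, multiplier: int, n_bits: int) -> list[int]:
    """Return one partial product per multiplier bit, already shifted into place."""
    products = []
    for i in range(n_bits):
        bit = (multiplier >> i) & 1                  # i-th multiplier bit, LSB first
        products.append((multiplicand * bit) << i)   # shift left i levels of significance
    return products


def multiply(multiplicand: int, multiplier: int, n_bits: int) -> int:
    """Sum the shifted partial products to obtain the final product."""
    return sum(partial_products(multiplicand, multiplier, n_bits))


# The FIG. 1 example: multiplicand "11" (3) times multiplier "10" (2).
fig1_rows = [format(p, "b") for p in partial_products(0b11, 0b10, 2)]
fig1_sum = format(multiply(0b11, 0b10, 2), "b")
print(fig1_rows)  # ['0', '110']
print(fig1_sum)   # 110
```

In this model the first partial product is zero and the second is "11" shifted left one level of significance, so the sum is "110."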
Turning now to FIG. 2, it is noted that, although the multiplication of FIG. 1 yielded a result three bits in length, multiplication of an n-bit number and an m-bit number generally produces a result up to n+m bits in length. For example, multiplication of a multiplicand of "11" and a multiplier of "11" yields a first partial product and a second partial product as shown at 17 and 19 in FIG. 2. The second partial product 19 is shifted left one level of significance. The sum of the two partial products is "1001," which is four bits in length. This sum is shown at 21 in FIG. 2.
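The n+m bound on the result length can be checked numerically for small unsigned operands. The following is a minimal sketch; the helper name `max_product_length` is hypothetical, and the exhaustive search is practical only for short operand widths.

```python
def max_product_length(n: int, m: int) -> int:
    """Longest product, in bits, over all unsigned n-bit by m-bit operand pairs."""
    return max(
        (a * b).bit_length()
        for a in range(1 << n)
        for b in range(1 << m)
    )


len_3x2 = (0b11 * 0b10).bit_length()   # the FIG. 1 product "110"
len_3x3 = (0b11 * 0b11).bit_length()   # the FIG. 2 product "1001"
print(len_3x2, len_3x3)                # 3 4
print(max_product_length(2, 2))        # 4
print(max_product_length(4, 4))        # 8
```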
Rising to a conceptual level, FIG. 3 shows the general bit positions of a multiplicand 23 and a multiplier 25. The multiplicand 23 and the multiplier 25 are each four bits in length (i.e., m=n=4). Each of the four bit positions is represented by a small circle ("dot"). Looking at FIG. 3, we see that each of the four partial products 27, 29, 31, and 33 is consecutively shifted left one level of significance. The length of the final sum 35 is eight bits, which is the sum of m and n.
When the same multiply operation is performed in 2's complement arithmetic, the general bit positions take the form shown in FIG. 4. As is well known in the art, 2's complement arithmetic allows binary numbers to represent positive and negative values. The 2's complement of a binary number may be found by changing all of the "1"s to "0"s and changing all of the "0"s to "1"s. A "1" is then added to the result. Following this procedure, the 2's complement of "0010" is found by changing the number to "1101" and then adding "1" to yield "1110." It is noted that the two leftmost zeros of "0010" became "1"s in the 2's complement representation.
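The invert-and-add-one procedure can be expressed as the following short sketch. The function name `twos_complement` and the explicit `width` parameter are assumptions made for illustration; a fixed width is needed because Python integers are not limited to a set number of bits.

```python
def twos_complement(value: int, width: int) -> int:
    """Return the 2's complement of `value` within a fixed bit width."""
    mask = (1 << width) - 1
    inverted = ~value & mask        # change every "1" to "0" and every "0" to "1"
    return (inverted + 1) & mask    # add "1", discarding any carry out of the top bit


negated = format(twos_complement(0b0010, 4), "04b")
print(negated)  # 1110
```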
Continuing with this logic, the leftmost bit positions of the partial products of FIG. 3 are not extended, but the leftmost bit positions of the 2's complement partial products of FIG. 4 are extended. The shaded dots represent extended bit positions which are assigned binary "1" values. One such bit position is shown at 37 in FIG. 4.
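The effect of extending the leftmost bit positions can be illustrated with a small software model. The sketch below is not taken from FIG. 4: the names `sign_extend` and `signed_multiply` are hypothetical, and the treatment of the multiplier's sign bit (its partial product is subtracted because that bit carries negative weight) is a standard property of 2's complement numbers that is assumed here rather than described in the text.

```python
def sign_extend(value: int, from_width: int, to_width: int) -> int:
    """Copy the sign bit of a from_width-bit pattern into the higher bit positions."""
    sign = (value >> (from_width - 1)) & 1
    if sign:
        # Fill the extended bit positions with "1"s.
        value |= ((1 << to_width) - 1) & ~((1 << from_width) - 1)
    return value


def signed_multiply(a_bits: int, b_bits: int, width: int) -> int:
    """Multiply two width-bit 2's complement patterns; return a 2*width-bit pattern."""
    out_width = 2 * width
    mask = (1 << out_width) - 1
    a_ext = sign_extend(a_bits, width, out_width)
    total = 0
    for i in range(width):
        if (b_bits >> i) & 1:
            pp = (a_ext << i) & mask   # sign-extended, shifted partial product
            if i == width - 1:
                total -= pp            # assumption: multiplier sign bit has negative weight
            else:
                total += pp
    return total & mask


extended = format(sign_extend(0b1110, 4, 8), "08b")
product = format(signed_multiply(0b1110, 0b0011, 4), "08b")   # -2 times 3
print(extended)  # 11111110
print(product)   # 11111010, the 8-bit pattern for -6
```

Without the extension, the partial products of a negative multiplicand would be treated as positive values and the sum would be incorrect.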
As the number of bits in the operands increases, so does the number of partial products. Since speed is one of the major factors in multiplier implementations, summing a large number of partial products quickly becomes a problem. This problem arises in both software and hardware. For example, in a multiplier for multiplying two sixty-four-bit operands, sixty-four partial products must be summed. There are several schemes for reducing the number of partial products, two of which will be discussed below.
A technique called Booth's decoding is commonly used in the prior art to reduce the number of partial products by a factor of two or more. Even with a minimization scheme such as Booth's decoding, however, the problem remains of quickly adding the remaining partial products with a minimum amount of circuitry.
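One common form of this technique, radix-4 Booth recoding, can be sketched as follows. The text does not specify which variant is meant, so the radix-4 choice is an assumption, and the names `booth_radix4_digits` and `booth_multiply` are hypothetical. The multiplier is examined in overlapping groups of three bits, and each group is replaced by a single digit from the set -2, -1, 0, 1, 2, so that an n-bit multiplier yields n/2 partial products.

```python
def booth_radix4_digits(multiplier: int, width: int) -> list[int]:
    """Recode a width-bit 2's complement multiplier into width/2 signed digits."""
    if width % 2:
        raise ValueError("width must be even for radix-4 recoding")
    padded = (multiplier & ((1 << width) - 1)) << 1   # implicit "0" below the LSB
    digits = []
    for i in range(0, width, 2):
        triplet = (padded >> i) & 0b111   # multiplier bits i+1, i, and i-1
        b_lo = triplet & 1
        b_mid = (triplet >> 1) & 1
        b_hi = (triplet >> 2) & 1
        digits.append(b_lo + b_mid - 2 * b_hi)        # digit in {-2, -1, 0, 1, 2}
    return digits


def booth_multiply(multiplicand: int, multiplier: int, width: int) -> int:
    """Sum one partial product per recoded digit, each shifted two columns further."""
    digits = booth_radix4_digits(multiplier, width)
    return sum((d * multiplicand) << (2 * k) for k, d in enumerate(digits))


digits_for_6 = booth_radix4_digits(0b0110, 4)
print(digits_for_6)                          # [-2, 2]
print(booth_multiply(3, 6, 4))               # 18
print(len(booth_radix4_digits(12345, 64)))   # 32
```

Under this recoding, a sixty-four-bit multiplier produces thirty-two partial products rather than sixty-four.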
A second approach in the prior art, which may be used in conjunction with the first approach, is the implementation of Carry-Save-Adders (CSAs). A CSA is similar to a full adder in that it inputs three numbers and outputs two numbers. A tree of CSAs can be used to reduce a number of partial products to two numbers, which can then be summed by a standard Carry-Propagate Adder. The prior art is full of various CSA schemes, perhaps in consequence of the complexity and costs of building these structures.
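The three-input, two-output behavior of a CSA and its use in reducing partial products can be modeled in software as shown below. This is a minimal sketch assuming non-negative operands; the names `carry_save_add` and `reduce_with_csas` are hypothetical, and the simple grouping of rows in threes is only one of many possible arrangements, not a particular prior art tree.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """Compress three numbers into a sum word and a carry word, with no carry propagation."""
    partial_sum = a ^ b ^ c                        # per-column sum bit
    carry = ((a & b) | (a & c) | (b & c)) << 1     # per-column carry, moved up one column
    return partial_sum, carry


def reduce_with_csas(operands: list[int]) -> list[int]:
    """Apply layers of CSAs until at most two numbers remain."""
    rows = list(operands)
    while len(rows) > 2:
        next_rows = []
        for i in range(0, len(rows) - 2, 3):
            next_rows.extend(carry_save_add(rows[i], rows[i + 1], rows[i + 2]))
        next_rows.extend(rows[len(rows) - len(rows) % 3:])   # leftover rows pass through
        rows = next_rows
    return rows


# Partial products of 13 x 11 (multiplier "1011"), already shifted into place.
rows = [13 << 0, 13 << 1, 0, 13 << 3]
final_two = reduce_with_csas(rows)
print(len(final_two))   # 2
print(sum(final_two))   # 143; this last addition stands in for the Carry-Propagate Adder
```

Each CSA layer works on all columns independently, so only the final addition of the two remaining numbers requires carries to propagate across the full width.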