A widely used scheme for two's-complement multiplication uses Booth's algorithm, which allows n-bit multiplication to be done using fewer than n additions or subtractions, thereby allowing faster multiplication. FIG. 1 illustrates an example of a traditional approach to multiplication using Booth's algorithm. The multiplication circuit 10 of FIG. 1 performs one 64-bit multiplication and is implemented, as an example, as a radix-4 Booth multiplication scheme. Multiplication circuit 10 includes a Booth encoding circuit 12 for encoding the 64-bit multiplier. A partial product selector circuit 14 is coupled to the Booth encoding circuit 12 and includes 32 partial product selector multiplexers (muxes) for selecting one of several multiples (0, 1, 2, etc.) of the multiplicand. Each of the 32 partial product selector muxes outputs one 64-bit partial product (a multiple of the multiplicand) based on the signals from the Booth encode cells. The 32 partial products are added together using a Wallace tree 16 followed by a carry look-ahead adder (CLA) 18. The Wallace tree is a tree-like network of carry save adders (CSAs).
FIG. 2 illustrates one Booth encoder 22 and one partial product selector (mux) 24 for multiplication circuit 10 of FIG. 1. In this example, there are 32 Booth encoders 22 and 32 muxes 24. For radix-4 Booth encoding, the Booth encoder 22 encodes two bits aij (plus an extra adjacent bit) of the multiplier (A) to generate a Booth encoded value 23 that is input to the mux 24. The Booth encoded value 23 identifies a multiple, such as 1, -1, 2, -2 or 0. Based on the Booth encoded value 23, mux 24 selects the corresponding multiple of the multiplicand (B) as the partial product.
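The radix-4 recoding described above can be modeled in software. The following is a minimal sketch, not the circuit of FIG. 1: the function names `booth_radix4_digits` and `booth_multiply` are hypothetical, the extra adjacent bit below the least significant pair is assumed to be 0, and the partial products are summed sequentially rather than by a Wallace tree and CLA.

```python
def booth_radix4_digits(multiplier: int, bits: int = 64) -> list[int]:
    """Recode a two's-complement multiplier into bits/2 digits in {-2, -1, 0, 1, 2}.

    Each digit is formed from two multiplier bits plus the adjacent lower bit,
    mirroring what one Booth encoder does for one partial product.
    """
    a = multiplier & ((1 << bits) - 1)  # two's-complement bit pattern
    digits = []
    prev = 0  # assumed value of the bit below bit 0
    for i in range(0, bits, 2):
        b0 = (a >> i) & 1
        b1 = (a >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(multiplicand: int, multiplier: int, bits: int = 64) -> int:
    """Sum the selected multiples of the multiplicand, each weighted by 4**k."""
    total = 0
    for k, digit in enumerate(booth_radix4_digits(multiplier, bits)):
        # The mux selects 0, +/-1x or +/-2x the multiplicand; the shift aligns it.
        total += (digit * multiplicand) << (2 * k)
    return total


# A 64-bit multiplier yields 32 digits, hence 32 partial products.
assert len(booth_radix4_digits(12345)) == 32
assert booth_multiply(7, -3, bits=8) == -21
```

The multiplier is assumed to lie in the signed range of the chosen width; the multiplicand is an ordinary Python integer.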
Therefore, multiplication circuit 10 of FIG. 1 multiplies a 64-bit multiplicand (Gg) and a 64-bit multiplier (Hh) (where each of g, G, h and H represents a 32-bit quantity). This multiplication operation generates the following partial products: ##STR1##
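The diagram referenced by the placeholder above is not reproduced in this text. As a sketch of the underlying algebra, assuming G and H denote the upper 32-bit halves and g and h the lower 32-bit halves of the two operands, the 64-bit by 64-bit product expands as:

$$
(G \cdot 2^{32} + g)(H \cdot 2^{32} + h) = GH \cdot 2^{64} + (hG + Hg) \cdot 2^{32} + gh
$$

Under that assumption, the expansion contains four terms: the two products (gh) and (GH), and the two cross products (hG) and (Hg).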
However, instead of performing a single 64-bit multiplication, it may be desirable to perform two separate 32-bit multiplications in parallel using the same 64-bit multiplication circuit 10 of FIG. 1, namely g*h and G*H. The desired products are (gh) and (GH). As shown above, however, multiplication circuit 10 also generates two unwanted cross products, (hG) and (Hg), when it attempts to perform both multiplication operations at the same time. The problem becomes worse when the same 64-bit multiplication circuit 10 is used to perform four 16-bit multiplications or eight 8-bit multiplications, because the number of unwanted cross products grows with the number of sub-operands.
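The effect of the cross products can be checked numerically. The sketch below is illustrative only: it assumes unsigned 32-bit values, assumes G and H occupy the upper halves of the packed operands, and uses the hypothetical helper names `pack` and `product_terms`.

```python
MASK64 = (1 << 64) - 1


def pack(high: int, low: int) -> int:
    """Pack two unsigned 32-bit values into one 64-bit operand (high half first)."""
    return (high << 32) | low


def product_terms(G: int, g: int, H: int, h: int) -> dict[str, int]:
    """Return the four weighted terms of the 64-bit by 64-bit product."""
    return {
        "gh": g * h,          # wanted, occupies bits 0..63
        "hG": (h * G) << 32,  # unwanted cross product
        "Hg": (H * g) << 32,  # unwanted cross product
        "GH": (G * H) << 64,  # wanted, occupies bits 64..127
    }


G, g, H, h = 3, 5, 7, 11
full = pack(G, g) * pack(H, h)
terms = product_terms(G, g, H, h)

# The full product is exactly the sum of all four terms.
assert full == sum(terms.values())

# The cross products overlap the upper bits of the gh field, so the low
# 64 bits of the full product no longer equal g * h.
assert (full & MASK64) != g * h
```

Removing the two cross terms from the sum leaves (gh) in the low 64 bits and (GH) in the high 64 bits, which is the result the two parallel multiplications are meant to produce.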
As a result, using the 64-bit multiplication circuit 10 to perform two 32-bit multiplications requires performing each operation separately (i.e., two passes through the multiplication circuit), and four 16-bit multiplications require four separate operations (i.e., four passes through the multiplication circuit). Thus, using the 64-bit multiplication circuit 10 to perform multiplication operations on operands narrower than 64 bits is slow and inefficient. Therefore, a need exists for a single-size multiplication circuit that can more efficiently handle operands of variable width.