Multipliers are often essential elements in data processing systems. However, as technology has grown more complex and users have demanded that central processing units in such data processing systems perform functions at ever higher speeds, the size of multiplier circuits has grown and is often a significant portion of the circuit area of the central processing unit. For example, simply to meet the IEEE-754 Floating Point specification, a multiplier must be able to multiply two 53-bit inputs. A significant amount of circuit area is required to implement a multiplier which has such large inputs due to the amount of logic required to implement a multiplication operation.
To improve the performance of such a multiplier, several techniques have been developed. Generally, multipliers with the desired capability have either an array structure or a Wallace tree structure. Multipliers with greater area allotments and greater performance requirements often employ a Wallace tree structure. In an array multiplier or a Wallace tree implementation, a modified Booth's algorithm can be used to produce n/2 rows of m partial products in an initial step, where n is the number of bits of the multiplier operand and m is the number of bits of the multiplicand input to the multiplier. Booth's algorithm is well-known in the data processing art and was disclosed in a paper entitled "A Signed Binary Multiplication Technique," published by Oxford University Press in Q. J. Mech. Appl. Math. 4:236-240 (1951).
A modified Booth's algorithm allows two's-complement multiplication. To multiply A·B using Booth's algorithm, the bits of the multiplier A are examined in groups of three adjacent bits, starting with the least significant bit. The following table illustrates the encoded relationship among these three bits, where X denotes the multiplier bits and Y denotes the multiplicand.
______________________________________
X_{i+2} X_{i+1} X_i    Add to Partial Product
______________________________________
000                    +0Y
001                    +1Y
010                    +1Y
011                    +2Y
100                    -2Y
101                    -1Y
110                    -1Y
111                    -0Y
______________________________________
Note that each required multiple of the multiplicand may be easily generated by shift and invert operations. After the encoding operation is performed, n/2 rows of partial products are added. In comparison, n rows of partial products are required by a non-Booth-recoded methodology. Thus, modified Booth encoding saves one level of addition and reduces the area required to perform a multiplication operation in a Wallace tree multiplier. Furthermore, modified Booth encoding reduces the number of levels of addition required to perform a multiplication operation in an array multiplier to n/2. Booth's algorithm is easily implemented because all of the required multiplication operations may be implemented as simple arithmetic left shifts.
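The recoding in the table above can be sketched in Python. This is a minimal illustration under stated assumptions, not the circuit discussed here: the names BOOTH_DIGIT, booth_digits, and booth_multiply are hypothetical, the multiplier is assumed to be a two's-complement value that fits in an even number of bits n, and an implied 0 is assumed below the least significant bit so that adjacent three-bit groups overlap by one bit.

```python
# Digit selected by each 3-bit group (X_{i+2}, X_{i+1}, X_i), as in the table.
BOOTH_DIGIT = {
    0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
    0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0,
}


def booth_digits(x, n):
    """Recode an n-bit two's-complement multiplier x (n even) into n/2 digits.

    Digits are returned least significant first; digit j has weight 4**j.
    """
    digits = []
    prev = 0  # assumed implicit 0 below the least significant bit
    for i in range(0, n, 2):
        low = (x >> i) & 1          # Python's >> sign-extends negative ints
        high = (x >> (i + 1)) & 1
        digits.append(BOOTH_DIGIT[(high << 2) | (low << 1) | prev])
        prev = high                 # groups overlap by one bit
    return digits


def booth_multiply(x, y, n):
    """Sum the n/2 partial products (digit * y), each shifted left by 2*j bits."""
    return sum((d * y) << (2 * j) for j, d in enumerate(booth_digits(x, n)))
```

For example, booth_digits(7, 4) gives [-1, 2], since 7 = -1 + 2·4. Treating a 53-bit unsigned operand as a 54-bit value with a zero top bit yields 27 digits, one per row of partial products.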
Typically, multipliers use a modified Booth's algorithm to encode data which is subsequently summed using either an array summation scheme or a Wallace tree summation scheme. When a Wallace tree scheme is utilized, the encoded information provided by Booth's algorithm can be compressed using a compressor composed of counters. A carry save adder (CSA) is an example of a 3:2 counter. Typically, the compressor is utilized to provide greater regularity and thereby simplify the layout associated with the multiplier. For example, as previously discussed, when a floating point operation is executed and the IEEE-754 Floating Point specification must be satisfied, the multiplier must multiply two 53-bit inputs. When one of the inputs to the multiplier is Booth encoded, only 27 rows of partial products remain to be reduced by an array or tree of counters. Because 27 rows are required, typical implementations of multipliers utilize three 9:2 compressors whose outputs are then reduced by a single 6:2 compressor.
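The behavior of a carry save adder as a 3:2 counter can be illustrated with a short Python sketch. It is an arithmetic model only, under stated assumptions: the names csa and reduce_rows are hypothetical, rows are modeled as non-negative integers that are already aligned to their weights, and the grouping of rows is a simple left-to-right choice rather than the three 9:2 plus one 6:2 arrangement described above.

```python
def csa(a, b, c):
    """3:2 counter applied bitwise: three rows in, a sum row and a carry row out.

    The carry row is shifted left because each carry bit has the next
    higher weight. The invariant is a + b + c == sum_row + carry_row.
    """
    sum_row = a ^ b ^ c
    carry_row = ((a & b) | (a & c) | (b & c)) << 1
    return sum_row, carry_row


def reduce_rows(rows):
    """Apply levels of 3:2 counters until at most two rows remain.

    Returns the remaining rows and the number of counter levels used.
    """
    rows = list(rows)
    levels = 0
    while len(rows) > 2:
        full = len(rows) - len(rows) % 3   # rows that form complete groups of 3
        nxt = []
        for k in range(0, full, 3):
            nxt.extend(csa(*rows[k:k + 3]))
        nxt.extend(rows[full:])            # leftover rows pass to the next level
        rows = nxt
        levels += 1
    return rows, levels
```

With 27 input rows this sketch needs seven levels of 3:2 counters (27, 18, 12, 8, 6, 4, 3, 2 rows), after which the two remaining rows would go to a conventional carry-propagate adder.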
In implementing such compressors, several difficulties arise. For example, compressor implementations tend to be wire bound in the context of a multiplier and are very difficult to route on the surface of a semiconductor device. Additionally, given the structure of Wallace tree multipliers, irregular edges often result from the wiring constraints of the compressor. Such irregular edges waste valuable circuit area and increase the overhead associated with implementation of the semiconductor device.
FIG. 1 illustrates a traditional implementation of a compressor typically used to implement a Wallace tree multiplier. Compressor 10 of FIG. 1 comprises a plurality of full adders 12 through 24. Each of a first row of full adders (12 through 16) receives inputs having the same weight, where a weight corresponds to a bit position in a binary number system. In FIG. 1, this weighting is indicated by a subscript on each of the inputs. For example, x_1 indicates that a bit having a weight of 1 is input by that signal. In traditional implementations of compressors, such as compressor 10, all of the inputs have the same weight, and all of the outputs of a bit slice of the compressor, with the exception of a final sum bit which has the same weight as the inputs, have the next higher weight. Therefore, in FIG. 1, each of the outputs of the bit slice of compressor 10 has a subscript "2" (with the exception of Sum (1)) to indicate that the next higher weight is assigned to that output. Note that the subscripts in FIG. 1 show relative weights only and do not indicate that the inputs are in a first bit position and the outputs are in a second bit position. U.S. Pat. Nos. 5,181,185 and 5,343,416 provide illustrations of such traditional implementations of a compressor in a multiplier.
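The weight relationship described for a bit slice can be checked with a small Python model. This is a sketch under assumptions, not a transcription of FIG. 1: the names full_adder and first_row are hypothetical, and the first row is assumed to consist of three full adders that together take nine inputs of weight 1.

```python
def full_adder(a, b, c):
    """Add three bits of equal weight.

    Returns (sum_bit, carry_bit): the sum bit keeps the input weight,
    and the carry bit has the next higher weight.
    """
    total = a + b + c
    return total & 1, total >> 1


def first_row(bits):
    """Feed nine same-weight bits to three full adders (assumed arrangement)."""
    sums, carries = [], []
    for k in range(0, 9, 3):
        s, c = full_adder(*bits[k:k + 3])
        sums.append(s)      # weight 1, like the inputs
        carries.append(c)   # weight 2, the next higher weight
    return sums, carries
```

For any nine input bits, the number of ones at the input equals the sum of the weight-1 outputs plus twice the sum of the weight-2 outputs, which is why the carry outputs of one slice must be routed to the adjacent, more significant slice.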
While the use of Booth recoding significantly reduces the amount of circuitry required to implement a multiplier and data processor, the problems and difficulties described above still remain. Therefore, a need exists for a compressor implementation which minimizes the amount of required global wiring and, therefore, reduces the compressor's tendency to be wire bound in some applications. Additionally, in light of the circuit area traditionally required by multipliers with Wallace tree schemes, and therefore by compressors, there is a need for a compressor which makes more efficient use of the circuit area required to implement the Wallace tree multiplier.