1. Field of the Invention
The present invention relates to the field of math processors in computers, and more particularly, to Booth multipliers used in math processors to perform high speed multiplication of numbers.
2. Description of Related Art
One of the primary functions of most computer systems is to perform a large number of mathematical operations at a speed much faster than a human being could perform the operations. Since a computer devotes a considerable amount of its processing time to performing mathematical operations, an improvement in the speed of a math processor of the computer for performing a particular type of operation will increase the overall speed of the computer.
A known method of performing multiplication in a math processor is by array multiplication using a parallel multiplier. The parallel multiplication process is based on the fact that partial products in multiplication can be independently computed in parallel. An example of multiplication by partial products is shown below in Table 1 for two 4-bit numbers.
TABLE 1
4-bit Multiplier Partial Products
______________________________________________________________
                        X3    X2    X1    X0    Multiplicand
                        Y3    Y2    Y1    Y0    Multiplier
______________________________________________________________
                        X3Y0  X2Y0  X1Y0  X0Y0
                  X3Y1  X2Y1  X1Y1  X0Y1
            X3Y2  X2Y2  X1Y2  X0Y2
      X3Y3  X2Y3  X1Y3  X0Y3
______________________________________________________________
P7    P6    P5    P4    P3    P2    P1    P0    Product
______________________________________________________________
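The partial-product scheme of Table 1 can be illustrated with a minimal software sketch. The function names partial_products and multiply are hypothetical and are introduced here only for illustration; the sketch models the arithmetic of the table for unsigned operands, not the adder hardware itself.

```python
def partial_products(x, y, width=4):
    """Return one shifted partial product per multiplier bit (unsigned operands)."""
    rows = []
    for i in range(width):
        y_bit = (y >> i) & 1  # multiplier bit Y_i
        # Row i is the multiplicand ANDed with Y_i, shifted left by i positions,
        # matching the staggered rows of Table 1.
        rows.append((x if y_bit else 0) << i)
    return rows


def multiply(x, y, width=4):
    """Sum the independently computed partial products to form the product."""
    return sum(partial_products(x, y, width))
```

For example, multiply(13, 11) sums the rows 13, 26, 0 and 104 to give 143. Because no row depends on any other row, all rows may be formed at the same time, which is the property the parallel multiplier exploits.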
A parallel multiplier is normally implemented as a square array of adders. In what is known as a Radix-2 scheme, the partial products are computed by observing one bit of the multiplier at a time. A higher-radix multiplier, such as a Radix-4 multiplier or "Booth recoding multiplier", reduces the number of adders, and therefore the delay required to produce the partial sums, by examining a plurality of bits at a time. In conventional Booth recoding, the multiplier bits are divided into two-bit pairs, and a total of three bits are scanned at a time. These three bits are the two bits of the present pair and a third bit, which is the high-order bit of the adjacent lower-order pair. After examining each triplet of bits, Booth recoding logic converts the triplet into a set of five control signals that control the operations performed by the adder cells in the array.
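The triplet scanning described above can be sketched in software as follows. This is an illustrative model only. The function names are hypothetical, the multiplier is assumed to be a two's-complement number, and the five control signals, which are not named in the text, are assumed to select one of the multiples -2X, -X, 0, +X and +2X of the multiplicand, as is common in Radix-4 Booth recoding.

```python
def booth_radix4_digits(y, width=16):
    """Recode a width-bit two's-complement multiplier into Radix-4 Booth digits.

    Each digit lies in {-2, -1, 0, +1, +2} (assumed to correspond to the five
    control signals) and is derived from a triplet of bits: the two bits of
    the present pair plus the high-order bit of the adjacent lower-order pair.
    """
    assert width % 2 == 0, "width must be even so the bits divide into pairs"
    y &= (1 << width) - 1      # keep only the low `width` bits
    digits = []
    prev = 0                   # implicit zero below the least significant pair
    for i in range(0, width, 2):
        low = (y >> i) & 1           # low-order bit of the present pair
        high = (y >> (i + 1)) & 1    # high-order bit of the present pair
        digits.append(low + prev - 2 * high)
        prev = high                  # becomes the third bit of the next triplet
    return digits


def booth_multiply(x, y, width=16):
    """Multiply x by y using one partial product per Booth digit."""
    digits = booth_radix4_digits(y, width)
    # Digit k has weight 4**k, so each row is shifted two bit positions
    # further than the row before it.
    return sum(d * x * 4 ** k for k, d in enumerate(digits))
```

Under these assumptions a 16-bit multiplier recodes into eight digits, so only eight partial products are summed rather than the sixteen required when one multiplier bit is observed at a time.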
In a conventional 16×16 Booth multiplier, such as that shown in the prior art multiplier of FIG. 1, the array comprises eight rows (or "stages") of adder cells. Only eight stages are needed in the array since a plurality of bits of the multiplier are examined in each stage.
The high performance of the Booth multiplier comes at a cost, however, in the form of relatively high power consumption. This is due in part to the large number of adder cells that consume power (15 cells in each of 8 rows, for a total of 120 core cells). Each of the adder cells normally includes a 5-input multiplexer controlled by the five control signals generated by the Booth recoding logic. In Booth multipliers that use conventional Booth recoding logic to generate the control signals, short-circuit paths can be created when one of the control signals turns off after another signal has already turned on to select one of the inputs. These temporary short-circuit paths dissipate power and increase the power consumption of a Booth multiplier.
Another large consumer of power in Booth multiplier arrangements is the input bus that provides one of the numbers (the multiplicand) to each of the eight stages (or rows) in the array. Using the same input to drive all eight stages means that there is a very large load on the multiplexers in the first stage of the array. Due to this load, the input to the array must be buffered. However, in order to provide high-speed multiplier performance, the first stage needs to receive the input with very little delay. Prior Booth multipliers therefore provided high-speed buffers to the first input stage, but these buffers consumed a sizable amount of power.