1. Field of the Invention
The present invention relates to hardware implemented multipliers for performing multiplication of two numbers in binary representation.
2. Description of the Background Art
In the field of image processing, which deals with a large amount of image data, and in the field of information processing, which utilizes computers or CPUs (Central Processing Units), multiplication of data is an important operation. For example, DCT (Discrete Cosine Transformation), digital filter processing, matrix operations and the like all require multiplication.
In such fields of art, data is usually represented in binary. Therefore, multiplication is performed on data in binary representation.
FIG. 21 is a representation showing one example of multiplication of 4 bit binary numbers. In FIG. 21, a binary number "1010" is multiplied by a binary number "0101". The multiplication is performed in the same manner as a multiplication of decimal numbers. In binary representation, unlike decimal representation, each digit has a weight which is a power of 2, such as 2^0, 2^1, 2^2, and so on. In FIG. 21, each intermediate row surrounded by the dotted line is called a partial product. The result of multiplication is given by adding up the partial products for each digit.
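The partial-product scheme of FIG. 21 can be sketched in a few lines of Python. This is only an illustrative model under the assumption of unsigned operands; the function name `partial_products` and the `width` parameter are hypothetical and do not appear in the figures.

```python
def partial_products(multiplicand, multiplier, width=4):
    """Return one partial product per multiplier bit, shifted to its binary weight."""
    products = []
    for j in range(width):
        bit = (multiplier >> j) & 1               # j-th multiplier bit, weight 2**j
        products.append((multiplicand * bit) << j)
    return products

# The operands used in FIG. 21: "1010" multiplied by "0101".
rows = partial_products(0b1010, 0b0101)
result = sum(rows)                                # adding up the partial products
```

Each entry of `rows` corresponds to one partial product, and their sum is the product.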
FIG. 22 is a diagram showing the structure of a conventional parallel multiplier for multiplying 4 bit binary numbers. In FIG. 22, the multiplier includes a register circuit 1a for holding a multiplicand X, and a register circuit 1b for holding a multiplier Y. The multiplicand X and the multiplier Y are each 4 bit data and include bits x4, x3, x2, x1 and bits y4, y3, y2, y1, respectively. The bit x4 and bit y4 are the most significant bits of the data X and Y, respectively, while the bit x1 and bit y1 are the least significant bits of the data X and Y, respectively. The data bits held at register circuit 1a are transferred on a multiplicand data line 2a, while the data bits held at register circuit 1b are transferred on a multiplier data line 2b. The multiplicand data line 2a includes a data line 2a4 for transferring the data bit x4, a data line 2a3 for the data bit x3, a data line 2a2 for the data bit x2, and a data line 2a1 for the data bit x1.
The multiplier data line 2b includes a data line 2b1 for transferring the data bit y1, a data line 2b2 for the data bit y2, a data line 2b3 for the data bit y3, and a data line 2b4 for the data bit y4.
The multiplier further includes AND circuits AN11-AN44 arranged correspondingly to the cross over points of the multiplicand data line 2a and the multiplier data line 2b. In FIG. 22, AND circuits arranged in the horizontal direction produce one partial product. More specifically, AND circuits AN11-AN14 produce the product of the data bit y1 and the multiplicand X. AND circuits AN21-AN24 give the product of the data bit y2 and the multiplicand data X. AND circuits AN31-AN34 give the product of the data bit y3 and the multiplicand X, and AND circuits AN41-AN44 give the product of the data bit y4 and the multiplicand X.
In order to produce the final product X·Y by adding up the partial products produced by AND circuits AN11-AN44, adding circuits AD11-AD43 are provided. Adding circuits AD11, AD12, and AD13 are half adders which receive the respective outputs of AND circuits AN21-AN23 at one input A, add up the data bits applied to inputs A and B, and output sum data bits from outputs S and carry signals from carry out outputs CO to the adding circuits in the second stage. The adding circuits which receive the outputs of AND circuits AN31-AN44 at their inputs A are full adders. Each full adder adds the data bits applied to its inputs A and B and an input applied to its carry input CI, and outputs a sum data bit from its output S and a carry out signal from its carry out CO.
Adding circuits AD11-AD13 in the first stage receive the outputs of corresponding AND circuits AN21-AN23 at one input A, and the outputs of AND circuits AN12-AN14, which produce the partial product in the preceding stage, at the other input B. The carry outputs of adding circuits AD11-AD13 are respectively provided to the carry inputs CI of adding circuits AD21-AD23, which are 1 bit higher in the next stage. Adding circuits AD21 and AD22 receive the outputs of corresponding AND circuits AN31 and AN32 at one input A, and the outputs of corresponding adding circuits AD12 and AD13 at the other input B. Adding circuit AD23 receives the output of AND circuit AN33 at one input A and the output of AND circuit AN24 at the other input B. The carry outputs of adding circuits AD21-AD23 are provided to the carry inputs CI of adding circuits AD31-AD33, which are 1 bit higher in the next stage.
Adding circuits AD31-AD33 receive the outputs of AND circuits AN41-AN43 at one input A. Adding circuits AD31 and AD32 receive the addition outputs (S) of corresponding adding circuits AD22 and AD23 at the other input B. Adding circuit AD33 receives the output of AND circuit AN34 at the other input B.
Adding circuit AD41 which produces a final output is a half adder and receives the carry output of adding circuit AD31 at the other input B and the sum output S of adding circuit AD32 at its one input A. The carry output of adding circuit AD41 is applied to the carry input CI of an adjacent adding circuit AD42. Adding circuit AD42 receives the sum output S of adding circuit AD33 at its one input A and the carry output of adding circuit AD32 at the other input B. The carry output of adding circuit AD42 is provided to the carry input CI of an adjacent adding circuit AD43. Adding circuit AD43 receives the output of AND circuit AN44 and the carry output of adding circuit AD33.
In the structure illustrated in FIG. 22, a block formed of AND circuits AN11-AN44 and adding circuits AD12, AD13, AD22, AD23 and AD32, in other words a block 5 defined by the dotted line, will be referred to as an adder array. A block formed of adders AD11, AD21, AD31, AD41, AD42 and AD43, which output the final multiplication result, in other words a block 10 defined by the dotted line, will be referred to as a final adder chain.
In the structure of the multiplier illustrated in FIG. 22, AND circuits produce partial products, addition of the partial products is performed in adding circuits, and the operation shown by way of example in FIG. 21 is performed in the multiplier.
More specifically, the multiplier first produces partial products utilizing the AND circuits, and then performs the multiplication operation by adding up the partial products utilizing the adding circuits. In other words, an 8 bit product Z is produced from the 4 bit multiplier Y and the 4 bit multiplicand X.
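The two-step operation described above (AND circuits produce the partial products, adding circuits sum them) can be modeled at the bit level as follows. This is a behavioral sketch only: it adds each row with a simple ripple of full adders and does not reproduce the exact wiring of adding circuits AD11-AD43 in FIG. 22. The function names are hypothetical.

```python
def full_adder(a, b, ci):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ ci
    co = (a & b) | (a & ci) | (b & ci)
    return s, co

def ripple_add(a_bits, b_bits):
    """Add two equal-length LSB-first bit lists; the result has one extra bit."""
    out, carry = [], 0
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out + [carry]

def array_multiply(x, y, width=4):
    """Multiply two unsigned width-bit numbers using AND gates and full adders."""
    x_bits = [(x >> i) & 1 for i in range(width)]
    y_bits = [(y >> j) & 1 for j in range(width)]
    acc = [0] * (2 * width)
    for j, yj in enumerate(y_bits):
        # Row j of AND gates: bit y_j ANDed with every bit of X, shifted by j.
        row = [0] * j + [xi & yj for xi in x_bits]
        row += [0] * (2 * width - len(row))
        acc = ripple_add(acc, row)[: 2 * width]
    return sum(bit << i for i, bit in enumerate(acc))
```

For 4 bit operands the accumulator is 8 bits wide, matching the 8 bit product Z.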
FIG. 23A is a representation showing one example of the structure of the half adder shown in FIG. 22. In FIG. 23A, the half adder includes an AND circuit 48 for receiving data bits provided to its inputs A and B through signal lines 43 and 44, and an ExOR circuit 49 for producing an exclusive logical sum of the data bits on signal lines 43 and 44. A carry output CO is output from AND circuit 48, and a sum output S is produced from ExOR circuit 49. The half adder, as illustrated in FIG. 23B, produces a carry output CO of "1" on signal line 46 when the data bits provided to inputs A and B both are "1". ExOR circuit 49 serves as a non-coincidence detector and produces the output S of "1" on a signal line 47 when the logics of data bits provided to inputs A and B are not coincident.
FIG. 24A illustrates one example of the structure of the full adder shown in FIG. 22. In FIG. 24A, the full adder includes an inverter circuit IV1 for inverting an input data bit B provided through signal line 54, an inverter circuit IV2 for inverting an input data bit A on signal line 53, a transmission circuit Tr2 for passing the output of inverter circuit IV1 in response to the output of inverter circuit IV2, a transmission circuit Tr1 for passing the input data bit B on signal line 54 in response to the input data bit A on the signal line 53. Transmission circuits Tr1 and Tr2 each are provided in parallel and transmit a signal to a node ND when a signal applied to the gate is in an "H" level.
The full adder further includes a transmission circuit Tr3 for transmitting the input data bit A on signal line 53 in response to a potential on node ND, an inverter circuit IV3 for inverting the signal potential of node ND, and a transmission circuit Tr4 for transmitting a carry input CI provided on a signal line 55. Transmission circuits Tr3 and Tr4 conduct in a complementary manner and produce a sum S on a signal line 57.
The full adder further includes a transmission circuit Tr5 for passing the signal (carry input) CI on signal line 55 in response to the output of inverter circuit IV3, an inverter circuit IV4 for inverting the carry input CI on signal line 55, a transmission circuit Tr6 for passing the output of inverter circuit IV4 in response to a signal potential on node ND, and an inverter circuit IV5 for inverting one of the outputs of transmission circuit Tr5 and transmission circuit Tr6, thereby producing a carry output CO on signal line 56. Transmission circuits Tr5 and Tr6 conduct in a complementary manner to each other. Transmission circuits Tr1-Tr6 each conduct when a signal of an "H" level (a signal of logical "1") is applied to the gate.
FIG. 24B sets forth in a table the inputs/outputs of the full adder shown in FIG. 24A. The full adder shown in FIG. 24A produces 2 bit outputs S and CO by adding up 3 bit inputs A, B, and CI. The carry output CO is the more significant bit. Assume that A, B and CI are all in the state of "1". In this condition, the bit B is transmitted to node ND through transmission circuit Tr1. Transmission circuit Tr6 conducts based on the bit B of "1" transmitted to node ND. Inverter circuit IV4 inverts the bit CI of "1" on signal line 55. Accordingly, a signal of "1" is output on signal line 56 from inverter circuit IV5.
Meanwhile, transmission circuit Tr3 conducts based on the signal of "1" on node ND, and the bit A on signal line 53 is transferred onto signal line 57. Thus, the bits CO and S both attain the "1" level.
When the bits A, B, and CI are all in the "0" level, transmission circuit Tr2 conducts, and outputs a signal of "1" to node ND (the effect of inverter circuit IV1). Transmission circuit Tr3 conducts in response to the signal of "1" on node ND, and the bit A of "0" on signal line 53 is transmitted onto signal line 57. Thus, the bit S attains the "0" level.
Meanwhile, transmission circuit Tr6 conducts and passes the output of inverter circuit IV4. Inverter circuit IV4 has received the signal of "0" on signal line 55. Accordingly, the output of transmission circuit Tr6 becomes the signal of "1", and the bit CO on signal line 56 attains the "0" state by the function of inverter circuit IV5.
When the signal on node ND is in the "1" state, transmission circuits Tr6 and Tr3 conduct, and otherwise, transmission circuits Tr4 and Tr5 conduct. The logical operation (adding processing) set forth in the table shown in FIG. 24B is implemented by the structure shown in FIG. 24A.
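The input/output behavior of the half adder and the full adder can be summarized with a short behavioral model. This sketch uses plain XOR/AND/OR logic and is not a model of the transmission circuit structure of FIG. 24A; the function names are hypothetical.

```python
def half_adder(a, b):
    """Returns (S, CO): S is the exclusive OR of A and B, CO is their AND."""
    return a ^ b, a & b

def full_adder(a, b, ci):
    """Returns (S, CO) for the three one-bit inputs A, B and CI."""
    s = a ^ b ^ ci
    co = (a & b) | (ci & (a ^ b))
    return s, co

# Rows of (A, B, CI, S, CO) for every input combination.
full_adder_table = [
    (a, b, ci, *full_adder(a, b, ci))
    for a in (0, 1) for b in (0, 1) for ci in (0, 1)
]
```

In every row the outputs satisfy A + B + CI = 2·CO + S, which is the sense in which CO is the more significant bit.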
As illustrated in FIG. 22, multiplication of binary data is implemented by repeating the addition. The number of partial products is equal to the bit number of multiplier Y. The multiplier shown in FIG. 22 is a 4 bit multiplier. Generally in the field of computers today, data of at least 54 bits is utilized. Accordingly, multiplication of data of at least 54 bits will be necessary. In this case, adder array 5 shown in FIG. 22 will be extremely large in scale. If the adder array is large in scale, since a signal is sequentially transmitted across adding circuits included therein, extremely large signal delay results in the adder array. The signal delay increases with the number of stages of the adding circuits. The number of stages of the adding circuits is in proportion to the number of partial products in multiplication.
Therefore, the Booth algorithm is often utilized for efficiently performing multiplication by reducing the number of partial products. The Booth algorithm is a process of multiplying negative numbers represented in 2's (two's) complement notation without correction.
In the Booth algorithm, the data bits of the multiplier Y are divided into groups. FIG. 25 illustrates one example of dividing the multiplier Y into groups, namely the group division of the second order Booth algorithm. Each group includes three bits. One bit is shared between adjacent groups (the bit illustrated in shading in FIG. 25). One group produces one partial product. The number of partial products is about 1/2 in the case of the second order Booth algorithm. Generally, when one group includes m bits, the algorithm is referred to as the (m-1)-th order Booth algorithm, and the number of partial products to be produced is about 1/(m-1). The Booth algorithm will be described in conjunction with the following expressions.
The multiplier Y is given by the following equation (1) when represented in 2's complement. ##EQU1## where yn is a sign bit which indicates whether the multiplier Y is positive or negative. A data bit yi is a binary number "1" or "0". 2^j attached to each bit is the binary weight of each data bit.
In equation (1), if n is an even number and y0=0, the multiplier Y can be expanded as in the following equation (2): ##EQU2## where y0=0 and n is an even number. The product X·Y of the multiplier Y and the multiplicand X is given by the sum of partial products. Therefore, if three bits y2i, y2i+1, and y2i+2 are known, the operation necessary for producing the partial product is decided. The relation between the three bits y2i, y2i+1, and y2i+2 and the operation executed based on their values is set forth in Table 1.
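The equation images for (1) and (2) are not reproduced in this text. One form that is consistent with Table 1 would be the following, assuming the bits are indexed y1 (least significant) through yn (sign bit) and that bit yi carries the weight 2^(i-1). This indexing is an assumption and may differ from the original figures.

```latex
% Assumed form of equation (1): 2's complement value of the multiplier Y
Y = -y_n \cdot 2^{n-1} + \sum_{i=1}^{n-1} y_i \cdot 2^{i-1}

% Assumed form of equation (2): regrouping with y_0 = 0 and n even
Y = \sum_{i=0}^{n/2-1} \left( y_{2i} + y_{2i+1} - 2\,y_{2i+2} \right) \cdot 2^{2i}
```

Under this form, the bracketed term takes the values 0, ±1, and ±2, which correspond to the operations 0, ±X, and ±2X.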
TABLE 1
Second Order Booth Algorithm
______________________________________
y2i+2   y2i+1   y2i     Operation
______________________________________
0       0       0       0
0       0       1       X
0       1       0       X
0       1       1       2X
1       0       0       -2X
1       0       1       -X
1       1       0       -X
1       1       1       0
______________________________________
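Table 1 can be exercised with a small Python model of second order Booth recoding. This is a sketch under stated assumptions: the multiplier is an n bit 2's complement number with n even, bits are indexed y1 through yn with y0 = 0, and the names `BOOTH2`, `booth2_digits`, and `booth2_multiply` are hypothetical.

```python
# (y[2i+2], y[2i+1], y[2i]) -> multiple of X, copied from Table 1
BOOTH2 = {
    (0, 0, 0): 0,  (0, 0, 1): 1,  (0, 1, 0): 1,  (0, 1, 1): 2,
    (1, 0, 0): -2, (1, 0, 1): -1, (1, 1, 0): -1, (1, 1, 1): 0,
}

def booth2_digits(y, n):
    """Recode an n-bit 2's complement multiplier (n even) into n/2 Booth digits."""
    # bits[0] is the appended y0 = 0; bits[k] is data bit y_k (y1 is the LSB).
    bits = [0] + [(y >> i) & 1 for i in range(n)]
    return [BOOTH2[(bits[2 * i + 2], bits[2 * i + 1], bits[2 * i])]
            for i in range(n // 2)]

def booth2_multiply(x, y, n):
    """Sum the n/2 partial products, each weighted by 4**i."""
    digits = booth2_digits(y, n)
    return sum((d * x) << (2 * i) for i, d in enumerate(digits))
```

For a 4 bit multiplier only two partial products are summed, and negative multipliers need no separate correction step.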
The operations executed in the second order Booth algorithm are 0, ±X, and ±2X.
The number twice as large as the multiplicand X, in other words 2X, can readily be produced by a shift circuit for shifting the multiplicand X in the direction of more significant bits by 1 bit. The "-" operation can be implemented by bit inversion and addition of "1". Therefore, if the operation to be executed is decided by the values of the three bits, the multiplication operation can be performed at a high speed. The Booth algorithm is not limited to the second order; there exist higher orders such as third order, fourth order, . . . Booth algorithms. The decomposition of the multiplier Y in the third order Booth algorithm and the operation to be executed in that case are given in equation (3) and Table 2. ##EQU3##
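The equation image for (3) is likewise not reproduced in this text. A form that is consistent with Table 2 would be the following, assuming y0 = 0, n a multiple of 3, and the same bit indexing as assumed for the second order case (bit yi carrying the weight 2^(i-1)).

```latex
% Assumed form of equation (3): third order grouping, four bits per group
Y = \sum_{i=0}^{n/3-1} \left( y_{3i} + y_{3i+1} + 2\,y_{3i+2} - 4\,y_{3i+3} \right) \cdot 2^{3i}
```

The bracketed term ranges over the integers from -4 to 4, which correspond to the operations 0, ±X, ±2X, ±3X, and ±4X.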
TABLE 2
Third Order Booth Algorithm
______________________________________
y3i+3   y3i+2   y3i+1   y3i     Operation
______________________________________
0       0       0       0       0
0       0       0       1       X
0       0       1       0       X
0       0       1       1       2X
0       1       0       0       2X
0       1       0       1       3X
0       1       1       0       3X
0       1       1       1       4X
1       0       0       0       -4X
1       0       0       1       -3X
1       0       1       0       -3X
1       0       1       1       -2X
1       1       0       0       -2X
1       1       0       1       -X
1       1       1       0       -X
1       1       1       1       0
______________________________________
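The third order recoding of Table 2 can be checked with a similar Python sketch. The weights -4, 2, 1, 1 used below are inferred from the table entries rather than taken from equation (3), and the function names and the assumption that n is a multiple of 3 are hypothetical.

```python
def booth3_digit(b3, b2, b1, b0):
    """Multiple of X selected by the group (y[3i+3], y[3i+2], y[3i+1], y[3i])."""
    return -4 * b3 + 2 * b2 + b1 + b0

def booth3_digits(y, n):
    """Recode an n-bit 2's complement multiplier (n a multiple of 3) into n/3 digits."""
    # bits[0] is the appended y0 = 0; bits[k] is data bit y_k (y1 is the LSB).
    bits = [0] + [(y >> i) & 1 for i in range(n)]
    return [booth3_digit(bits[3 * i + 3], bits[3 * i + 2], bits[3 * i + 1], bits[3 * i])
            for i in range(n // 3)]

def booth3_multiply(x, y, n):
    """Sum the n/3 partial products, each weighted by 8**i."""
    digits = booth3_digits(y, n)
    return sum((d * x) << (3 * i) for i, d in enumerate(digits))
```

Every digit lies between -4 and 4, so the multiples ±3X appear as soon as a group contains the patterns 0101, 0110, 1001, or 1010.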
FIG. 26 shows the structure of a multiplier utilizing a Booth algorithm. In FIG. 26, the multiplier includes a register circuit 1a for holding multiplicand data X, a register circuit 1b for holding multiplier data Y, a decode circuit 3 for decoding the multiplier data Y provided from register circuit 1b through a multiplier data line 2b according to the Booth algorithm and outputting a signal representing the result of decoding, an adder array 5 for producing partial products based on the multiplicand data X applied from register circuit 1a on a multiplicand data line 2a and a control signal applied on a decoding result output line 8 according to the Booth algorithm and for producing an intermediate sum by adding up the partial products, and a final adder chain 10 for receiving the output data from adder array 5 through an output line 9 and performing a final addition. Data representing the result of multiplication X·Y, which is produced by multiplying the multiplicand data X by the multiplier data Y, is transmitted onto a signal line 11 from final adder chain 10.
Adder array 5 includes a selector circuit for producing a partial product by performing a selection operation in response to the control signal applied from decode circuit 3 onto decoding result output lines 8.
When performing a decoding operation according to the second order Booth algorithm, decode circuit 3 produces the control signal enabling the operation given in Table 1 to be executed. 0, X, and 2X are produced from the selector circuit in response to the control signal. -X and -2X are produced simply by sign inversion (bit inversion and addition of "1"). In the internal structure of adder array 5, a selector for performing a selection operation based on the decoding result output from decode circuit 3 takes the place of the AND circuit in the structure of the multiplier shown in FIG. 22. The arrangement of adders is shifted toward the direction of more significant bits by 2 bits for each stage (in the case of the second order Booth algorithm).
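The selection and sign inversion described above can be sketched as follows. This is an illustrative model only: the `digit` argument stands for the decoded operation (-2 to 2), and `width` is a hypothetical parameter giving the number of bits in which the partial product is held.

```python
def select_partial_product(x, digit, width):
    """Pick 0, X or 2X, then negate by bit inversion and addition of 1 if required."""
    mask = (1 << width) - 1
    value = {0: 0, 1: x, 2: x << 1}[abs(digit)] & mask   # 2X is X shifted by 1 bit
    if digit < 0:
        value = (~value + 1) & mask                      # bit inversion plus 1
    return value
```

With `width=8`, selecting -X for X = 0101 yields 11111011, the 8 bit 2's complement representation of -5.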
The number of decoding result output lines 8 from decode circuit 3 is decided by the bit number of multiplier data Y and the order of Booth algorithm to be executed.
FIG. 27 is a diagram showing a circuit for performing multiplication of 4 bit multiplier data Y and 4 bit multiplicand data X. The multiplier data Y includes a bit y0 (=0) in addition to bits y1-y4. The multiplicand data X includes bits x1-x4. Decoder circuit 3 includes a decoder 3a1 for decoding the bits y0, y1, and y2 and transmitting the result of the decoding onto an output line 8a, and a further decoder for decoding the bits y2, y3, and y4 and transmitting the result of the decoding onto a signal line 8b.
Adder array 5 includes a shift circuit 102 for shifting the multiplicand data X (bits x1-x4) held at register circuit 1a toward more significant bits by 1 bit and producing 2X, a selector circuit 104 for receiving the outputs of register circuit 1a and shift circuit 102 and selecting a corresponding operation in response to the decode result signal on output line 8a, thereby producing a first partial product, a selector circuit 106 for receiving the multiplicand data X from register circuit 1a and the shift data from shift circuit 102 and selecting a corresponding operation based on the result of decoding on output line 8b, thereby producing a second partial product, and adding circuits AD102 and AD104 for producing an intermediate sum by adding up the partial products produced by selector circuits 104 and 106.
Final adder chain 10, which outputs a final multiplication result based on the output of adder array 5, includes an adding circuit AD106 for receiving the carry output of adding circuit AD100 and the sum of adding circuit AD102, an adding circuit AD108 for receiving the carry output of adding circuit AD102, the sum of adding circuit AD104, and the carry output of adding circuit AD106, an adding circuit AD110 for receiving the output of selector circuit 106, the carry output of adding circuit AD104, and the carry output of adding circuit AD108, and an adding circuit AD112 for receiving the most significant bit output of selector circuit 106 and the carry output of adding circuit AD110. Adding circuits AD100, AD102, AD104, AD106, and AD112 are half adders, while adding circuits AD108 and AD110 are full adders.
The multiplier shown in FIG. 27 performs multiplication in accordance with the second order Booth algorithm. An 8 bit multiplication result is produced from the 4 bit multiplier data Y and the 4 bit multiplicand data X. Selector circuits 104 and 106 each have a 5 bit capacity. This is because the operation of 2X is performed and the state shifted toward more significant bits by 1 bit must be expressed. The least significant bit of shift circuit 102 is set to 0. Shift circuit 102 shifts the multiplicand data X provided from register circuit 1a toward more significant bits by 1 bit.
As illustrated in FIG. 27, according to the second order Booth algorithm, the number of partial products to be produced is 2, adding circuits are provided substantially in 2 stages, and therefore the number of stages of adding circuits is greatly reduced as compared to the structure of the multiplier shown in FIG. 22. Adding circuit AD112 included in final adder chain 10 may be formed of a full adder, and adding circuit AD110 may receive a carry output at its carry input and have its one input grounded.
If multiplication is performed according to the second order Booth algorithm, the number of partial products produced is 2, which is equivalent to half the number of partial products produced by the usual multiplier shown in FIG. 22. Thus, a high speed multiplication can be performed.
FIG. 28 is a diagram showing a conceptual structure when multiplication is performed according to the third order Booth algorithm. In FIG. 28, a third order Booth algorithm decode circuit 3 performing a decoding operation according to the third order Booth algorithm includes decoders 30a, 30b, . . . , 30p for receiving a prescribed set of 4 bit data from the bits y0-yr of the multiplier Y, respectively. Each of decoders 30a-30p produces a signal selecting a corresponding operation by performing a decoding operation shown in Table 2 based on the value of provided 4 bit data.
The multiplier further includes a constant multiple circuit 200 for multiplying the multiplicand data X (bits x1-xn) by a prescribed constant, in other words for producing ±X, ±2X, ±3X, and ±4X, and selector circuits 202a, 202b, . . . , 202p provided correspondingly to the decoders 30a-30p of decode circuit 3 for selectively outputting one of the outputs from constant multiple circuit 200 in response to control signals from output lines 8a-8p. Selector circuits 202a, 202b, . . . and 202p produce the first partial product, second partial product, . . . , and p-th partial product, respectively.
The multiplier further includes an adder 204 for adding up the partial products from selector circuits 202a-202p. Adder 204 includes both of the adder array and the final adder chain shown in FIGS. 26 and 27.
When multiplication is performed according to the third order Booth algorithm shown in FIG. 28, the number of partial products to be produced is p, which is 1/3 the bit number of the multiplier data Y.
The circuit for producing ±3X in constant multiple circuit 200 executes an addition of 2X+X on the input multiplicand data X. ±2X and ±4X are produced by a shifting operation on the multiplicand data X. The sign of each ± value is decided by whether or not the sign is inverted. The triple value 3X cannot be produced simply by such a shifting operation and a sign inversion, and therefore the triple value 3X is produced by a shifting operation and an adding operation utilizing the multiplicand data X, in other words by producing 2X and performing the addition 2X+X. Then ±3X is produced based on inversion/non-inversion of the sign.
As described above, the number of partial products to be produced is reduced utilizing the Booth algorithm in multiplication of binary numbers, which enables a high speed multiplication operation. For example, consider the case of multiplication of 54 bit data. The number of partial products is 54 in usual multiplication without using the Booth algorithm. When the second order Booth algorithm is utilized, the number of partial products produced is reduced to 27. For the third order Booth algorithm, the number of partial products produced is 18. More specifically, when the n-th order Booth algorithm is utilized, the number of partial products produced is reduced to 1/n as compared to usual multiplication, and therefore the operation time necessary for multiplication can be reduced.
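The partial product counts quoted above follow from dividing the bit number by the order. A minimal sketch, assuming the bit number is rounded up to a whole number of groups (the function name is hypothetical):

```python
import math

def partial_product_count(bits, order):
    """Number of partial products for a given multiplier width and Booth order.

    An order of 1 stands for usual multiplication without the Booth algorithm.
    """
    return math.ceil(bits / order)

counts_54_bit = {order: partial_product_count(54, order) for order in (1, 2, 3)}
```

For 54 bit data this gives 54, 27, and 18 partial products for usual multiplication, the second order, and the third order, respectively.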
The Booth algorithm, however, suffers from a disadvantage. When the multiplier Y is decoded according to the second order Booth algorithm, the value 2X twice as large as the multiplicand X is necessary for producing a partial product. If the multiplier Y is decoded according to the third order Booth algorithm, the values twice, three times and four times as large as the multiplicand X will be necessary for producing the partial products. Furthermore, when the multiplier Y is decoded according to the fourth order Booth algorithm, the values twice, three times, four times, five times, six times, seven times, and eight times as large as the multiplicand X will be necessary for producing the partial products.
In the case of binary numbers, a power-of-2 multiple such as twice, four times, or eight times can readily be produced by shifting data. However, values three times and five times as large cannot be produced by such a shifting operation alone. When 3X is produced, the operation of (2X+X) must be executed. A long period of time is necessary for performing the adding operation. More specifically, as the digit number of the internal operation increases, carry propagation takes time, and operations for producing values three times and five times as large as the multiplicand, such as 3X and 5X, cannot be performed at a high speed. Accordingly, the multiplication cannot be performed at a high speed either.
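The distinction between multiples obtainable by shifting and multiples that need an addition can be made concrete with a short sketch. The function names are hypothetical; the list of required multiples (1 through 2^(order-1)) follows the second, third, and fourth order cases described above.

```python
def required_multiples(order):
    """Magnitudes of the multiples of X needed by a Booth algorithm of the given order."""
    return list(range(1, 2 ** (order - 1) + 1))

def odd_multiples_needing_adder(order):
    """Odd multiples greater than 1; these cannot be formed by shifting alone.

    Even multiples such as 6X are 1-bit shifts of an odd multiple (here 3X).
    """
    return [m for m in required_multiples(order) if m % 2 == 1 and m > 1]

def triple(x):
    """3X as a shift plus an addition: (2X) + X."""
    return (x << 1) + x
```

The second order needs no such adder, the third order needs one for 3X, and the fourth order needs adders for 3X, 5X, and 7X.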
More specifically, when the third order Booth algorithm is applied, although the number of partial products produced is reduced, a longer period of time will be necessary for producing a value three times as large as the multiplicand X prior to executing addition of the produced partial products, and eventually the time required for the operation increases. In this case, if the bit number of the multiplicand X increases, the delay in the circuit for producing 3X naturally increases.
When the second order Booth algorithm is applied, only the value 2X twice as large as the multiplicand X is necessary. The value 2X can readily be produced by a shifting operation. Unlike the case of producing the value 3X three times as large as the multiplicand X, a long period of time is not required. Therefore, the second order Booth algorithm significantly reduces the number of partial products produced and is useful in performing a multiplication operation.
In view of the foregoing, conventional multiplier designs do not utilize a Booth algorithm of higher than the second order. This is because a Booth algorithm of the third or higher order requires odd number multiples of the multiplicand X, which cannot be produced by a shifting operation alone. The resulting time delay in the circuit cancels the effect of reducing the number of produced partial products, and as the bit numbers of the multiplier and the multiplicand increase, the delay in the circuit for producing odd number multiples outweighs that effect.
However, when multiplication is performed according to the second order Booth algorithm, the number of partial products is reduced only to 1/2 at best. If the number of data bits further increases in the near future in the field of information processing, a higher order Booth algorithm must be used in order to reduce the number of partial products produced, thereby performing a high speed multiplication operation.
It is an object of the present invention to provide a multiplier capable of executing a high speed multiplication operation utilizing a Booth algorithm of the third or higher order.
When the bit number of data to be multiplied increases, a large load is imposed on multiplicand data line 2a and the output line 8 of Booth decode circuit 3. This is because a number of selector circuits are associated with multiplicand data line 2a as illustrated in FIG. 27, and the decode result output line must drive all the associated selector circuits.
For example, as illustrated in FIG. 27, selector circuits 104 and 106 include selector circuits for performing selection operations on a bit-by-bit basis. The load associated with signal lines 2a and 8 increases with the increase of the bit number of data to be multiplied. Accordingly, in the output line 8 of the Booth decode circuit, for example, it takes a long period of time for the result of decoding to reach the farthermost end of output line 8. This is because of propagation delay in the signal line.
If the bit number of data to be multiplied increases as such, signal propagation delay increases regardless of the use/non-use of a Booth algorithm, which makes it difficult to perform a high speed multiplication. This applies to the structure of the multiplier shown in FIG. 22 as well.
Furthermore, when a Booth algorithm is utilized, adding circuits included in the adder array and adding circuits included in the final adder chain can perform an adding operation only after the Booth decode circuit has decoded the data and the operation to be selected by the decoding operation has been decided. This is because until then the output of the selector circuit is not decided. Even after the decoding operation of the Booth decode circuit has completed, it takes time for the result of decoding to reach the farthermost end of output line 8. The time delay must be accounted for in order to perform an accurate multiplication. Accordingly, difficulty in performing a high speed operation is encountered.
As described above, a conventional multiplier, particularly a multiplier utilizing a Booth algorithm, suffers from the following disadvantages.
(1) In a Booth algorithm of third order or larger, a long period of time is necessary for producing data for odd number multiples of a multiplicand X such as a triple 3X, which impedes high speed operation.
(2) Since a large load is connected to the output line of a Booth decode circuit, signal propagation delay is present in the output line, and high speed multiplication is impeded.
(3) When a Booth algorithm is utilized, an adding circuit cannot execute an adding operation until the output of a Booth decode circuit is decided. This makes it difficult to perform a high speed multiplication.

(1) The second order Booth algorithm and the third order Booth algorithm are mixed.
(2) A Booth decode circuit is provided on the side of the less significant bits of a multiplicand.
(3) An adding circuit is operated during the period of decoding by the Booth decode circuit, and an adding result is selected based on the result of the decoding.