1. Field of the Invention
The present invention relates to a multiplication element, and more particularly to a multiplication element using a Wallace tree.
2. Description of the Background Art
Multiplication is one of the operations performed most often in a computer. When forming a high-speed calculating system, it is necessary to enhance the speed of multiplication. As a method for implementing high-speed multiplication, the Booth algorithm, which modifies a multiplier to reduce the count of partial products, is well known. Furthermore, a multiplication element using a Wallace tree, which adds partial products in a tree-like manner to sequentially reduce the count of the partial products, is also well known.
FIG. 19 is a block diagram showing the structure of a multiplication element using the above two methods. By way of example, such a structure has been described in Japanese Patent Application Laid-Open Gazette No. 3-177922. A multiplicand 101 and a multiplier 102 are formed by 16 bits respectively, for example. A Booth encoder 103 outputs a multiplier which is obtained by modifying the multiplier 102 (hereinafter referred to as a "modified multiplier").
In general, if the modified multiplier is found by using an n-th Booth algorithm, the count of partial products can initially be reduced to one n-th. "Initially" means that the count above refers to the partial products obtained by the operation of the multiplicand and 1 bit of the modified multiplier. A partial product obtained "initially" in this way will be hereinafter referred to as a "0th partial product".
When the order of the Booth algorithm is increased, the circuit scale necessary for generating the 0th partial product becomes larger and the time necessary for generating it also becomes longer. Consequently, the second Booth algorithm is used most often. The second Booth algorithm is also employed in the example shown in FIG. 19, in which the Booth encoder 103 is provided with Booth encode elements 45 to 52 that output the modified multipliers for 8 bits, corresponding to half of the 16 bits.
The outputs 104 to 111 of the Booth encode elements 45 to 52 are modified multipliers and, at the same time, control signals for the shifter/inverters 113 to 120 respectively. In the second Booth algorithm, the operation is performed as follows: the multiplicand is kept as it is if the modified multiplier is 1, the multiplicand is shifted by 1 bit if the modified multiplier is 2, and the multiplicand is inverted if the modified multiplier is a negative number. Accordingly, the shifter/inverters 113 to 120 keep the multiplicand 101 as it is, shift the multiplicand 101 by 1 bit or invert the multiplicand 101 based on the outputs 104 to 111 respectively, so that 0th partial products 121 to 128 are generated.
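The encoding described above can be illustrated with a minimal Python sketch. It is not taken from the cited gazette. It assumes a signed two's-complement 16-bit multiplier, assumes that a modified multiplier of 0 yields a zero partial product, and models "inversion" as arithmetic negation rather than as a bit-level invert with a separate correction. The function names are hypothetical.

```python
def booth_radix4_digits(multiplier: int, bits: int = 16) -> list[int]:
    """Second (radix-4) Booth recoding: bits/2 digits in {-2..2}, LSB first."""
    y = multiplier & ((1 << bits) - 1)   # two's-complement bit pattern
    padded = y << 1                      # bit j of padded is y[j-1]; y[-1] = 0
    digits = []
    for i in range(0, bits, 2):
        y_prev = (padded >> i) & 1       # y[i-1]
        y_low = (padded >> (i + 1)) & 1  # y[i]
        y_high = (padded >> (i + 2)) & 1 # y[i+1]
        digits.append(y_prev + y_low - 2 * y_high)
    return digits


def zeroth_partial_products(multiplicand: int, multiplier: int,
                            bits: int = 16) -> list[int]:
    """One 0th partial product per modified multiplier (keep, shift, negate)."""
    return [d * multiplicand for d in booth_radix4_digits(multiplier, bits)]


def multiply(multiplicand: int, multiplier: int, bits: int = 16) -> int:
    """Sum the 0th partial products, each placed 2 bits above the previous."""
    pps = zeroth_partial_products(multiplicand, multiplier, bits)
    return sum(pp << (2 * k) for k, pp in enumerate(pps))


print(booth_radix4_digits(7))   # [-1, 2, 0, 0, 0, 0, 0, 0], i.e. -1 + 2*4 = 7
print(multiply(1234, -5678) == 1234 * -5678)
```

A 16-bit multiplier thus yields 8 modified multipliers and 8 partial products instead of 16, which is the halving attributed to the second Booth algorithm.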
The Booth encode elements 45 to 52 correspond to increasingly higher places in this order. Consequently, the 0th partial products 121 to 128, generated by the shifter/inverters 113 to 120 which are controlled by the outputs 104 to 111, also occupy increasingly higher places in this order. Since the second Booth algorithm is employed, the place of the 0th partial product is increased by 2 bits each time that of the modified multiplier is increased by 1 bit.
The 0th partial products 121 to 128 are input to a Wallace tree portion 129 and added together like a tree. Consequently, a first partial product, a second partial product, . . . are generated. When the order of the partial product is increased, the number of the partial products is decreased.
FIG. 20 is a block diagram schematically showing the structure of the Wallace tree portion 129. "Schematically" means that the state in which bits of the partial products are matched is not shown. The Wallace tree portion 129 has first adders 138 and 139, and a second adder 140. The 0th partial products 121 to 124 are given to the first adder 138 to generate a pair of (that is, two) first partial products 141. The 0th partial products 125 to 128 are given to the first adder 139 to generate a pair of first partial products 142. Furthermore, two pairs of (that is, four) first partial products 141 and 142 are given to the second adder 140 to generate a pair of final partial products (second partial products) 130.
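The tree structure of FIG. 20 can be sketched at the word level as follows. This is only an illustrative model under stated assumptions: the eight inputs are non-negative integers already shifted to their places, and each 4-input 2-output adder is modeled as two cascaded 3-to-2 carry-save steps, which is not necessarily how the circuit of FIG. 21 is realized. The function names are hypothetical.

```python
def carry_save_3to2(a: int, b: int, c: int) -> tuple[int, int]:
    """Reduce three words to a (sum, carry) pair with the same total."""
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s, carry


def add_4to2(a: int, b: int, c: int, d: int) -> tuple[int, int]:
    """Reduce four words to a pair of words with the same total."""
    s1, c1 = carry_save_3to2(a, b, c)
    return carry_save_3to2(s1, c1, d)


def wallace_tree_8(pps: list[int]) -> tuple[int, int]:
    """Eight 0th partial products -> one pair of final partial products."""
    assert len(pps) == 8
    first_141 = add_4to2(*pps[0:4])        # role of first adder 138
    first_142 = add_4to2(*pps[4:8])        # role of first adder 139
    return add_4to2(*first_141, *first_142)  # role of second adder 140


pps = [5 << (2 * k) for k in range(8)]     # arbitrary aligned test inputs
final_pair = wallace_tree_8(pps)
print(sum(final_pair) == sum(pps))         # the final addition restores the total
```

No carry propagates across the full word inside the tree. Only the last addition of the final pair needs a carry-propagating adder.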
FIG. 21 is a circuit diagram showing the structure of the first adder 138. 4-input (with carry-in) 2-output (with carry-out) adding elements P.sub.0101 to P.sub.01n are sequentially connected in series. Each of the adding elements P.sub.0101 to P.sub.01n comprises a carry-in terminal CI, input terminals I1 to I4 for receiving 1 bit of each of the partial products 121 to 124 respectively, a sum terminal S for outputting the low order bit of an addition result which is obtained from the 5 bits given to the carry-in terminal CI and the input terminals I1 to I4, and a carry terminal C and a carry-out terminal CO which each output a high order bit of the same weight (CI + I1 + I2 + I3 + I4 = C × 2 + CO × 2 + S). The carry-out terminal CO of an adding element P.sub.01i is connected to the carry-in terminal CI of an adding element P.sub.01(i+1) (1 ≤ i ≤ n-1).
The first adder 139 and the second adder 140 have the same structure. As will be described below, n is set to 23 in the first adders 138 and 139 and n is set to 32 or 24 in the second adder 140.
FIG. 22 is a circuit diagram showing the structure of the 4-input 2-output adding element. Such a combination of logic gates can implement the adding elements for performing the above operation. The output of the carry-out terminal CO is not affected by the value given to the carry-in terminal CI.
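The behavior of the adding element and of the series connection can be checked with a short Python sketch. The realization used here, two cascaded full adders, is an assumption chosen for illustration and is not claimed to be the gate arrangement of FIG. 22. All function names are hypothetical.

```python
from itertools import product


def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """Return (sum bit, carry bit) of three input bits."""
    total = a + b + c
    return total & 1, total >> 1


def adding_element(ci: int, i1: int, i2: int, i3: int, i4: int):
    """4-input 2-output element: returns (S, C, CO)."""
    t, co = full_adder(i1, i2, i3)   # CO depends only on I1 to I3
    s, c = full_adder(t, i4, ci)     # CI enters only the second stage
    return s, c, co


def check_adding_element() -> bool:
    """Exhaustively verify the arithmetic relation and CO's independence of CI."""
    for ci, i1, i2, i3, i4 in product((0, 1), repeat=5):
        s, c, co = adding_element(ci, i1, i2, i3, i4)
        if ci + i1 + i2 + i3 + i4 != 2 * c + 2 * co + s:
            return False
        if co != adding_element(1 - ci, i1, i2, i3, i4)[2]:
            return False
    return True


def adder_row(words: list[int], n: int) -> tuple[int, int, int]:
    """Chain n elements (CO of bit i feeds CI of bit i+1) over four words."""
    sum_word, carry_word, ci = 0, 0, 0
    for i in range(n):
        bits = [(w >> i) & 1 for w in words]
        s, c, ci = adding_element(ci, *bits)
        sum_word |= s << i
        carry_word |= c << (i + 1)   # C is one place above S
    return sum_word, carry_word, ci  # last CO has weight 2**n


print(check_adding_element())
s, c, co = adder_row([9, 14, 3, 11], n=8)
print(s + c + (co << 8) == 9 + 14 + 3 + 11)
```

Because CO never depends on CI, the carry chain between neighboring elements is only one element long, so the row settles without a ripple across all n bits.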
The pair of final partial products 130 are added together in a final adding portion 131 so that a multiplication result 74 is obtained. In order to perform the operation at a high speed, the final adding portion 131 often uses the carry look-ahead method.
FIGS. 23 to 26 are circuit diagrams showing, in detail, only the connecting relationship between the shifter/inverters 113 to 120 and the first adders 138 and 139 and second adder 140 in the structure of the multiplication element according to the prior art shown in FIG. 19. FIG. 23 is continued on FIG. 25 by a virtual line Q4Q4 and on FIG. 24 by a virtual line Q5Q5. FIG. 26 is continued on FIG. 24 by a virtual line Q6Q6 and on FIG. 25 by a virtual line Q7Q7.
Thus, in the Wallace tree portion 129, the first adders 138 and 139 and the second adder 140, which are its components, are provided together with the shifter/inverters 113 to 120.
On the assumption that the multiplicand 101 has 16 bits which might be shifted left by 1 bit, the shifter/inverters 113 to 120 are formed by 17-bit shift/invert elements. For example, the shifter/inverter 113 comprises shift/invert elements B.sub.0101 to B.sub.0117 which are transversely arranged sequentially from the least significant bit to the most significant bit. Similarly, the shifter/inverters 114 to 120 comprise shift/invert elements B.sub.0201 to B.sub.0217, B.sub.0301 to B.sub.0317, B.sub.0401 to B.sub.0417, B.sub.0501 to B.sub.0517, B.sub.0601 to B.sub.0617, B.sub.0701 to B.sub.0717, and B.sub.0801 to B.sub.0817, respectively.
The first adder 138 comprises adding elements P.sub.0101 to P.sub.0123. The 0th partial products 121 to 124 have places which are sequentially varied by 2 bits. Consequently, the number of the adding elements is 17 + (4-1) × 2 = 23. In the same manner, the first adder 139 comprises adding elements P.sub.0201 to P.sub.0223.
The least significant bit of the 0th partial product given to the first adder 138 is different from that of the 0th partial product given to the first adder 139 by 4 × 2 = 8 bits. Consequently, the second adder 140, which receives a pair of first partial products output from each of the first adders 138 and 139, comprises 23 + 8 = 31 adding elements P.sub.0301 to P.sub.0331 and an adding element P.sub.0332 which receives the output of the carry terminal C on the most significant bit of the first adder 139.
In the drawings, wirings having a mark of "/" are used for the batch of plural bits, and the numeral attached beside the mark indicates a bit count. If the bit count is 2, the numeral is not attached.
The 0th partial product 121 obtained from the shifter/inverter 113 is given to the adding elements P.sub.0101 to P.sub.0117 every bit. In the same way, the 0th partial products 122 to 128 are given to the adding elements P.sub.0103 to P.sub.0119, P.sub.0105 to P.sub.0121, P.sub.0107 to P.sub.0123, P.sub.0201 to P.sub.0217, P.sub.0203 to P.sub.0219, P.sub.0205 to P.sub.0221, and P.sub.0207 to P.sub.0223, respectively.
One part of the first partial products 141 obtained from the sum terminals S of the adding elements P.sub.0101 to P.sub.0123 are given to the adding elements P.sub.0301 to P.sub.0323 of the second adder 140, respectively. The other part of the first partial products 141 obtained from the carry terminals C of the adding elements P.sub.0101 to P.sub.0123 are given to the adding elements P.sub.0302 to P.sub.0324 of the second adder 140 respectively. The output from the carry terminal C of each adding element has a higher order than that from the sum terminal S thereof by 1 bit.
In the same way, one part of the first partial products 142 obtained from the sum terminals S of the adding elements P.sub.0201 to P.sub.0223 are given to the adding elements P.sub.0309 to P.sub.0331 of the second adder 140, respectively. The other part of the first partial products 142 obtained from the carry terminals C of the adding elements P.sub.0201 to P.sub.0223 are given to the adding elements P.sub.0310 to P.sub.0332 of the second adder 140, respectively.
Ordinarily, the above array formed by the shifter/inverters and the adders allots the same circuit width to each bit in order to avoid the complexity of wirings which mutually transmit signals. Accordingly, as addition proceeds based on the Wallace tree (that is, the order of the adder is increased), the transverse width necessary for the arrangement of the adders is increased. According to the examples shown in FIGS. 23 to 26, the width of the second adder 140 is greater than those of the first adders 138 and 139.
Two methods are employed so as not to require the above large structure. First of all, the adding elements P.sub.0301 to P.sub.0308 are omitted. The adding elements P.sub.0301 to P.sub.0308 each receive two or fewer inputs from the adding elements P.sub.0101 to P.sub.0108. Accordingly, the adding elements P.sub.0301 to P.sub.0308 only output the input values as they are, and for this reason they can be omitted.
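The input count of each element of the second adder 140 can be tallied from the connections stated above for the sum terminals S and the carry terminals C of the first adders. The following Python sketch is illustrative only. It counts the inputs arriving from the first adders and ignores the carry-in chain inside the second adder. The function name and its parameters are hypothetical.

```python
from collections import Counter


def second_adder_input_counts(n_first: int = 23, offset: int = 8) -> Counter:
    """Inputs per element index k of P.sub.03k, counting first-adder outputs only."""
    counts = Counter()
    for j in range(1, n_first + 1):
        counts[j] += 1               # S of P.sub.01j -> P.sub.03j
        counts[j + 1] += 1           # C of P.sub.01j -> P.sub.03(j+1)
        counts[j + offset] += 1      # S of P.sub.02j -> P.sub.03(j+8)
        counts[j + offset + 1] += 1  # C of P.sub.02j -> P.sub.03(j+9)
    return counts


counts = second_adder_input_counts()
print([counts[k] for k in range(1, 11)])
print(all(counts[k] <= 2 for k in range(1, 9)))  # elements 1 to 8 are omittable
```

Under these connections the elements with indices 1 to 8 receive at most two bits each, whereas from index 10 upward all four inputs are occupied until the outputs of the first adder 138 run out.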
Secondly, the array is put in order. FIG. 27 is a block diagram schematically showing the state in which the left ends of the shifter/inverters 113 to 120 and adders 138 to 140 are arranged in a column. Since the relationship of signal transmission among the shifter/inverters 113 to 120 and the adders 138 to 140 is the same as in FIGS. 23 to 26, it is simplified so as to avoid the complexity. The adding elements P.sub.0301 to P.sub.0308 are omitted and the width of the second adder 140 is greater than those of the adders 138 and 139 by 1 bit.
Even if the array is put in order as described above, the width of the region necessary for the arrangement of the multiplication element is largely determined by the widths (23 to 24 bits) of the first adders 138 and 139 and the second adder 140. The widths necessary for the shifter/inverters 113 to 120 correspond to 17 bits. Consequently, space regions DS1 and DS2 having widths of 6 or 7 bits are present. Thus, the area cannot be used effectively.
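The widths quoted above follow from simple arithmetic, collected here in a small Python sketch. The parameter names and dictionary keys are hypothetical, and the sketch does not state which of the two gaps corresponds to DS1 and which to DS2, since the text does not say.

```python
def array_widths(multiplicand_bits: int = 16, group: int = 4, step: int = 2) -> dict:
    """Widths in bits of the rows in the left-aligned array of FIG. 27."""
    shifter = multiplicand_bits + 1            # 16 bits plus 1-bit left shift
    first_adder = shifter + (group - 1) * step # four products staggered by 2 bits
    offset = group * step                      # LSB gap between adders 138 and 139
    second_full = first_adder + offset + 1     # 31 elements plus P.sub.0332
    second_adder = second_full - offset        # low-order 8 elements omitted
    return {
        "shifter": shifter,
        "first_adder": first_adder,
        "second_adder_full": second_full,
        "second_adder": second_adder,
        "gap_vs_first_adder": first_adder - shifter,
        "gap_vs_second_adder": second_adder - shifter,
    }


for name, bits in array_widths().items():
    print(f"{name}: {bits}")
```

For the 16-bit case this gives 17, 23, 32 and 24 bits for the rows and unused widths of 6 and 7 bits beside the shifter/inverters.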
As the number of bits of the multiplier is increased, the number of the 0th partial products is increased so that the space region becomes larger. As the order of the adder is increased, this tendency becomes more pronounced. In the multiplication element, most of the whole structure is occupied by the array of the shifter/inverters and the adders. Consequently, when the number of bits of the multiplier is increased, the area of the whole multiplication element cannot be reduced to lower the manufacturing cost. Furthermore, the wiring length is increased so that performance degradation might be caused. Needless to say, the number of bits of information processed by a microprocessor tends to increase.