1. Technical Field of the Invention
The present invention relates to a multiplying circuit suitable for an integrated circuit on a semiconductor chip and, more particularly, to a high-speed multiplying circuit using Booth's algorithm.
2. Description of the Prior Art
Generally, a multiplicand X, a multiplier Y and the product P (= X·Y) are defined by the following equations for a multiplication of 16 bits × 16 bits: ##EQU1##
Usually, constructing a hardware circuit for a parallel multiplication of n bits × n bits requires n² logic blocks (e.g., 256 in the case of 16 bits). Each block has an AND gate for generating a partial product and a full adder for adding the partial products. Even if the "carry save adder (CSA)" system is adopted, the carry must be propagated through 2n blocks (e.g., 32 in the case of 16 bits), so that the speed of operation cannot be increased.
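The n² block count can be illustrated with a minimal software sketch. It assumes unsigned operands for simplicity, and the function name `array_multiply` is hypothetical; it models only the AND-gate partial product bits, not the adder array or its carry path.

```python
def array_multiply(x, y, n=16):
    """Unsigned n x n multiply built from single-bit AND terms.

    Returns (product, and_terms), where and_terms counts the AND
    operations, one per logic block of an n x n array.
    """
    product = 0
    and_terms = 0
    for j in range(n):              # one row per multiplier bit y_j
        y_j = (y >> j) & 1
        for i in range(n):          # one AND term per multiplicand bit x_i
            x_i = (x >> i) & 1
            product += (x_i & y_j) << (i + j)
            and_terms += 1
    return product, and_terms
```

For n = 16 the count returned is 256, matching the figure given above.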
By contrast, a multiplying circuit using Booth's algorithm (as disclosed in Japanese Patent Publication No. 57-1014) operates faster than the above-mentioned parallel multiplying circuit. In this circuit, the multiplier Y is divided into groups of three bits (of which one bit overlaps between the preceding and succeeding groups), and each group is decoded by a decoder according to its three-bit pattern. One of the partial products ±2X, ±X and 0 is then produced for each group in accordance with the decoded result. The total sum is obtained by accumulating all of the produced partial products.
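The three-bit grouping described above can be sketched in software as follows. This is an illustration under stated assumptions, not the circuit of the cited publication: the function names are hypothetical, bits are indexed from the least significant end, an implicit zero bit is appended below the lowest multiplier bit, and the digit formula used is the commonly stated one for 3-bit Booth decoding.

```python
def booth_radix4_digits(y, n=16):
    """Recode an n-bit 2's complement multiplier into n/2 digits in -2..2."""
    assert n % 2 == 0
    y &= (1 << n) - 1               # n-bit 2's complement pattern of y
    padded = y << 1                 # append the implicit zero bit below bit 0
    digits = []
    for i in range(n // 2):
        group = (padded >> (2 * i)) & 0b111   # three bits, one shared with the previous group
        b_hi = (group >> 2) & 1
        b_mid = (group >> 1) & 1
        b_lo = group & 1
        digits.append(-2 * b_hi + b_mid + b_lo)
    return digits


def booth_radix4_multiply(x, y, n=16):
    """Accumulate the partial products d * X, each weighted by 4**i."""
    return sum((d * x) << (2 * i)
               for i, d in enumerate(booth_radix4_digits(y, n)))
```

With n = 16 the multiplier is decoded into 8 groups, so only 8 partial products are accumulated, and negative operands need no separate sign correction.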
Booth's algorithm is characterized in that the partial product producing means can be reduced, since the concept of 2's complement can be introduced into the partial products, and in that no correction of a sign bit is required for the multiplication, as is well known in the art. Moreover, since a plurality of bits of the multiplier are decoded simultaneously, high-speed multiplication can be obtained.
In high-grade digital computation or complex calculation, however, data of long bit lengths must be multiplied by each other. Especially in a high-performance 16-bit or 32-bit microprocessor, it is predicted that higher-speed and more precise computations will be required. To satisfy these requirements, it is conceivable to increase the number of bits in each group to be decoded, thereby reducing the number of partial product producing steps. For example, 4 or more bits of the multiplier may be decoded at the same time. If a group of 4 bits is decoded, however, it is necessary to generate partial products corresponding to ±4X, ±3X, ±2X, ±X and 0 for the multiplicand X in accordance with the decoded results of the 4 bits. Table 1 shows the partial products for the 4-bit decoder (wherein y_{3i-2} to y_{3i+1} designate the 4-bit pattern of the multiplier to be decoded, and P_p designates the partial product).
TABLE 1
______________________________________________________________________________________
y_{3i+1}  y_{3i}  y_{3i-1}  y_{3i-2}  P_p    y_{3i+1}  y_{3i}  y_{3i-1}  y_{3i-2}  P_p
______________________________________________________________________________________
0         0       0         0         0      1         0       0         0         -4X
0         0       0         1         X      1         0       0         1         -3X
0         0       1         0         X      1         0       1         0         -3X
0         0       1         1         2X     1         0       1         1         -2X
0         1       0         0         2X     1         1       0         0         -2X
0         1       0         1         3X     1         1       0         1         -X
0         1       1         0         3X     1         1       1         0         -X
0         1       1         1         4X     1         1       1         1         0
______________________________________________________________________________________
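The entries of Table 1 can be checked with a short sketch. The weighting formula below (-4, +2, +1, +1 applied to the four bits in the column order of the table) is not stated in the text; it is an assumption that happens to reproduce every row of Table 1. The function names, the least-significant-first bit indexing and the sign extension of the multiplier to a multiple of three bits are likewise assumptions made for illustration.

```python
# Multiples of X from Table 1, indexed by the 4-bit pattern 0000..1111
# read in the table's column order (leftmost column = most significant).
TABLE_1_MULTIPLES = [0, 1, 1, 2, 2, 3, 3, 4, -4, -3, -3, -2, -2, -1, -1, 0]


def booth_radix8_digit(b3, b2, b1, b0):
    """Decode one 4-bit group (b3 = leftmost table column) to a multiple of X."""
    return -4 * b3 + 2 * b2 + b1 + b0


def booth_radix8_multiply(x, y, n=16):
    """Multiply by decoding overlapping 4-bit groups of the multiplier."""
    groups = -(-n // 3)                  # ceil(n / 3) groups
    width = 3 * groups
    y_ext = y & ((1 << width) - 1)       # 2's complement pattern, sign-extended to width bits
    padded = y_ext << 1                  # implicit zero bit below bit 0
    total = 0
    for i in range(groups):
        g = (padded >> (3 * i)) & 0b1111
        d = booth_radix8_digit((g >> 3) & 1, (g >> 2) & 1, (g >> 1) & 1, g & 1)
        total += (d * x) << (3 * i)      # partial product weighted by 8**i
    return total
```

Under these assumptions a 16-bit multiplier is covered by 6 groups, at the cost of needing the multiples ±3X in addition to the shifted ones.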
Now, in producing the partial products ±4X, ±3X, ±2X, ±X and 0, the partial products ±4X and ±2X, i.e., the even multiples of the multiplicand X, can be generated easily by shifting the multiplicand X by two bits and one bit, respectively. However, to produce the partial product ±3X, the multiplicand X has to be multiplied by an odd number, which a shift alone cannot provide. This operation cannot be accomplished at high speed by the simple multiplying circuits proposed in the prior art. For a 4th-order Booth's algorithm, moreover, the partial products ±8X, ±7X, ..., ±X and 0 are required, but the partial products with an odd multiplier other than ±X (±7X, ±5X and ±3X) cannot be easily produced. Even where it is possible, a long period of time must be spent to produce the partial products 7X, 5X and 3X. Further, even if the partial products 7X, 5X and 3X can be produced by a conventional multiplying circuit, this production has to be performed after the decoding operation is terminated, so that high-speed operation is very hard to achieve.
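The difference between the even and odd multiples can be made concrete with a minimal sketch. The function name `multiple_of_x` is hypothetical, and forming 3X as 2X + X is one common choice assumed here for illustration; the text does not prescribe how 3X is to be obtained.

```python
def multiple_of_x(x, k):
    """Return k * x for k in -4..4 using shifts, at most one add, and negation."""
    mag = abs(k)
    if mag == 0:
        m = 0
    elif mag == 1:
        m = x
    elif mag == 2:
        m = x << 1              # shift by one bit
    elif mag == 3:
        m = (x << 1) + x        # the only case that needs a full-width addition
    elif mag == 4:
        m = x << 2              # shift by two bits
    else:
        raise ValueError("k must lie in -4..4")
    return -m if k < 0 else m
```

Only the ±3X case involves an addition across the full operand width, which is the step that the shift-based cases avoid. For the multiples 5X and 7X, analogous identities such as 4X + X and 8X - X would apply, each again requiring an addition or subtraction with carry propagation.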
In addition, hardware elements for producing the partial products of both odd and even multipliers are required in each block for producing a partial product. Since the circuit pattern of each block therefore becomes complex, the design of the multiplication circuit is very difficult. Moreover, it is also difficult to integrate the circuit on a single semiconductor chip. In an integrated circuit on a chip in particular, it is important to arrange identical patterns in an array to facilitate LSI implementation.