1. Field of the Invention
The present invention relates to a digital multiplying circuit which is suitable for use in a parallel multiplying circuit using, for example, a Booth's algorithm.
2. Description of the Prior Art
FIG. 1 shows an example of a parallel multiplying circuit using the conventional second Booth's algorithm, to which the present invention can be applied. The description will be made for the case where a multiplicand X consists of ten bits (x.sub.9, x.sub.8, . . . , x.sub.0) in 2's complement code, a multiplier Y consists of ten bits (y.sub.9, y.sub.8, . . . , y.sub.0) in 2's complement code, and the product (X.multidot.Y) of the multiplicand X and the multiplier Y is obtained.
In FIG. 1, reference numeral 1 denotes a register in which the multiplicand X is stored, and 2 denotes a register in which the multiplier Y is stored. The multiplicand X is supplied to selectors 3, 4, 5, 6, and 7. The two least significant bits y.sub.0 and y.sub.1 of the multiplier Y, together with a fixed 0, are supplied to an encoder 8. The three bits y.sub.1, y.sub.2 and y.sub.3 of the multiplier Y are supplied to an encoder 9. The three bits y.sub.3, y.sub.4 and y.sub.5 are supplied to an encoder 10. The three bits y.sub.5, y.sub.6 and y.sub.7 are supplied to an encoder 11. The three bits y.sub.7, y.sub.8 and y.sub.9 are supplied to an encoder 12. Each of the encoders 8 to 12 generates a 3-bit output. The selectors 3 to 7 are respectively controlled by these outputs, so that partial products PA, PB, PC, PD, and PE, each consisting of eleven bits, are formed.
Let the inputs to each of the encoders 8 to 12 be y.sub.i+2, y.sub.i+1 and y.sub.i (with y.sub.i+2 =y.sub.1, y.sub.i+1 =y.sub.0 and y.sub.i =0 for the special case of encoder 8), let the encoder output be e, and let the partial product obtained as the output of the corresponding one of the selectors 3 to 7 be PP. In the second Booth's algorithm, the partial products PP are then as shown below.
______________________________________
y.sub.i+2   y.sub.i+1   y.sub.i      e  :  PP
______________________________________
0           0           0            0  :  0
0           0           1           +1  :  +X
0           1           0           +1  :  +X
0           1           1           +2  :  +2X
1           0           0           -2  :  -2X
1           0           1           -1  :  -X
1           1           0           -1  :  -X
1           1           1            0  :  0
______________________________________
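The table above can be reproduced arithmetically, since each row satisfies e = -2.multidot.y.sub.i+2 + y.sub.i+1 + y.sub.i. The following is a minimal sketch under that observation; the function names booth_digit and partial_product are hypothetical and are not taken from the circuit of FIG. 1.

```python
# Illustrative sketch only; names are hypothetical, not part of the described circuit.

def booth_digit(y_hi, y_mid, y_lo):
    """Encoder output e for the bit triple (y_{i+2}, y_{i+1}, y_i)."""
    return -2 * y_hi + y_mid + y_lo


def partial_product(x, y_hi, y_mid, y_lo):
    """Selector output PP = e * X, i.e. one of 0, +X, +2X, -X, -2X."""
    return booth_digit(y_hi, y_mid, y_lo) * x


if __name__ == "__main__":
    # Print the eight rows in the same order as the table.
    for bits in range(8):
        y_hi, y_mid, y_lo = (bits >> 2) & 1, (bits >> 1) & 1, bits & 1
        print(y_hi, y_mid, y_lo, booth_digit(y_hi, y_mid, y_lo))
```

The closed-form expression is used here only for compactness; a hardware encoder would more plausibly be a small lookup over the three input bits.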
On the other hand, in an alternative constitution (not illustrated) in which the selectors 3 to 7 obtain negative values by 1's complement operations, one bit is added to each partial product to carry the correction needed to express the negative value in 2's complement, so that the number of bits of each partial product becomes twelve.
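The 1's complement constitution just described relies on the identity that a 2's complement negation equals a bitwise inversion followed by the addition of 1. The following is a minimal sketch of that idea, assuming an 11-bit word plus a separate correction bit; the names select_ones_complement and decode are hypothetical. The extreme case X = -512 with e = -2 would give +1024, which does not fit in eleven bits, so it is left out of this sketch.

```python
# Illustrative sketch only; the packing of word and correction bit is an assumption.

WIDTH = 11                      # partial-product width without the correction bit
MASK = (1 << WIDTH) - 1


def select_ones_complement(x, e):
    """Return (11-bit word, correction bit) representing the partial product e * X."""
    word = (abs(e) * x) & MASK          # 0, X or 2X as an 11-bit pattern
    if e < 0:
        return (~word) & MASK, 1        # invert every bit and flag the missing +1
    return word, 0


def decode(word, correction):
    """Add the correction bit and read the result as 11-bit 2's complement."""
    total = (word + correction) & MASK
    return total - (1 << WIDTH) if total >> (WIDTH - 1) else total


if __name__ == "__main__":
    word, correction = select_ones_complement(3, -1)
    print(format(word, "011b"), correction, decode(word, correction))
```

In a circuit, the correction bit would typically be absorbed by the following adder rather than added inside the selector, which is why it travels with the partial product as a twelfth bit.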
The partial products PA and PB which are respectively outputted from the selectors 3 and 4 are supplied to an adder 13. The output of the adder 13 and the partial product PC outputted from the selector 5 are supplied to an adder 14. The output of the adder 14 and the partial product PD outputted from the selector 6 are supplied to an adder 15. The output of the adder 15 and the partial product PE outputted from the selector 7 are supplied to an adder 16. In the addition by the adders 13 to 16, each partial product from the selectors 3 to 7 is given a predetermined weight before it is added. Namely, the partial product PB is shifted to the left by two bits and is added to the partial product PA. Similarly, the partial products PC, PD and PE are each shifted to the left by a further two bits (that is, by four, six and eight bits relative to PA) and are added to the addition outputs of the preceding adders 13, 14 and 15, respectively. This shift to the left upon addition requires no shifting hardware; it is realized simply by offsetting the relative bit locations of the two addition inputs. The output of the adder 16 is the product (X.multidot.Y), which is stored in a register 17.
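The weighted addition described above can be checked numerically. The following is a minimal sketch, assuming ordinary Python integers in place of fixed-width adders; the names booth_digits and multiply are hypothetical. It forms the five Booth digits from a 10-bit multiplier, weights each partial product by a further two bit positions, and accumulates them in the order of the adders 13 to 16.

```python
# Illustrative sketch only; integer arithmetic stands in for the adders 13 to 16.

BITS = 10


def booth_digits(y, bits=BITS):
    """Booth digits of a signed multiplier, least significant digit first."""
    padded = (y & ((1 << bits) - 1)) << 1       # append the fixed 0 below y_0
    digits = []
    for i in range(0, bits, 2):
        y_lo = (padded >> i) & 1
        y_mid = (padded >> (i + 1)) & 1
        y_hi = (padded >> (i + 2)) & 1
        digits.append(-2 * y_hi + y_mid + y_lo)
    return digits


def multiply(x, y, bits=BITS):
    """Accumulate PA, PB, PC, PD, PE with weights 2**0, 2**2, ..., 2**8."""
    acc = 0
    for k, e in enumerate(booth_digits(y, bits)):
        acc += (e * x) << (2 * k)               # each stage is shifted two more bits
    return acc


if __name__ == "__main__":
    print(booth_digits(-3), multiply(100, -3))
```

Only five partial products are needed for a 10-bit multiplier, which is the reason the circuit uses five selectors and four adders.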
In the parallel multiplying circuit shown in FIG. 1, the selectors 3 to 7 and the four-stage adders 13 to 16 are interposed between the registers 1 and 17, while the encoders 8 to 12, the selectors 3 to 7 and the four-stage adders 13 to 16 are interposed between the registers 2 and 17. Therefore, the delay time of the multiplication becomes large, and high-speed operation cannot be expected with standard devices such as CMOS, TTL, etc. As a result, a signal having a high data rate, for instance a digital color video signal, cannot be multiplied.
As a method of solving this problem, pipeline processing may be considered. Namely, as shown in FIG. 2, registers 18 and 19, registers 20 and 21, registers 22 and 23, and registers 24 and 25 are respectively interposed on the two input sides of each of the adders 13, 14, 15, and 16. Further, in order to match the timings, a register 27 is interposed between the selector 5 and the register 21; registers 28 and 29 are interposed between the selector 6 and the register 23; and registers 30, 31 and 32 are interposed between the selector 7 and the register 25. With this pipelined arrangement, an input (at least one of the multiplicand X and the multiplier Y) which changes at every clock can be multiplied, the clock frequency being limited only by the highest frequency at which the selectors 3 to 7, encoders 8 to 12 and adders 13 to 16, each enclosed between registers on its input and output sides, can operate. However, this arrangement has the problems that the number of registers becomes large and the circuit scale becomes large.
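The timing-matching role of the added registers can be illustrated with a clocked software model. The following is a minimal sketch, assuming that all registers are clocked together and start at zero; the names weighted_partial_products and run_pipeline are hypothetical, and the correspondence between the model's fourteen registers and the numbered registers of FIG. 2 is an assumption. The partial products PA to PE pass through delay lines of 1, 1, 2, 3 and 4 registers so that each reaches its adder in the same clock period as the running sum it belongs to.

```python
# Illustrative sketch only; a behavioural model, not the circuit of FIG. 2 itself.
from collections import deque

BITS = 10


def weighted_partial_products(x, y):
    """PA..PE for signed 10-bit x and y, each already shifted to its weight."""
    padded = (y & ((1 << BITS) - 1)) << 1       # append the fixed 0 below y_0
    products = []
    for k in range(BITS // 2):
        y_lo = (padded >> (2 * k)) & 1
        y_mid = (padded >> (2 * k + 1)) & 1
        y_hi = (padded >> (2 * k + 2)) & 1
        products.append(((-2 * y_hi + y_mid + y_lo) * x) << (2 * k))
    return products


def run_pipeline(inputs):
    """Feed one (x, y) pair per clock; return the output register after each clock."""
    delay_lines = [deque([0] * n) for n in (1, 1, 2, 3, 4)]   # PA..PE paths
    sums = [0, 0, 0]        # registers holding the outputs of the first three adders
    trace = []
    for x, y in inputs:
        taps = [line[-1] for line in delay_lines]   # oldest value in each line
        # All next-state values are computed from the current register outputs.
        next_sums = [taps[0] + taps[1], sums[0] + taps[2], sums[1] + taps[3]]
        out = sums[2] + taps[4]
        for line, pp in zip(delay_lines, weighted_partial_products(x, y)):
            line.pop()
            line.appendleft(pp)
        sums = next_sums
        trace.append(out)
    return trace


if __name__ == "__main__":
    pairs = [(3, 5), (-7, 9), (100, -3), (12, 12), (0, 0), (0, 0), (0, 0), (0, 0)]
    print(run_pipeline(pairs))
```

In this model a new operand pair is accepted at every clock and each product appears a fixed number of clocks later, at the cost of the eleven delay registers and three sum registers, which is the register overhead noted above.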
In addition, FIG. 3 shows another arrangement in which the parallel multiplying circuit of FIG. 1 is constituted so that pipeline processing can be performed. Unlike the circuit arrangement shown in FIG. 2, this circuit executes the pipeline processing for every two stages of the adders. In other words, the adders 13 and 14 are combined as a set and the registers 18, 19 and 27 are provided on its input side. Also, the adders 15 and 16 are combined as a set and the registers 22, 23 and 32 are provided on its input side.
Further, as shown in FIG. 4, the following circuit arrangement for adding the partial products is also possible. Namely, the partial products PA and PB which are outputted from the selectors 3 and 4 are added by an adder 41. This addition output and the partial product PC outputted from the selector 5 are added by an adder 42. The partial products PD and PE outputted from the selectors 6 and 7 are added by an adder 43. The outputs of the adders 42 and 43 are added by an adder 44. Such an arrangement, in which the adders 41 to 44 are connected like a tree, can be constituted so as to perform the pipeline processing, as shown in FIG. 5, by respectively interposing registers 45, 46, 47, 48, 49, 50, 51, and 52 on the input and output sides of the adders 41 to 44 and by interposing registers 54, 55 and 56 for matching the timings.
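The difference between the chained addition of FIG. 1 and the tree-like addition of FIG. 4 can be expressed by counting how many adders lie on the longest path. The following is a minimal sketch under the assumption that every adder contributes one unit of depth; the names add, chain and tree are hypothetical, and each partial product is modelled as a (value, depth) pair that is already weighted.

```python
# Illustrative sketch only; adder depth is an abstract count, not a measured delay.

def add(a, b):
    """Model one adder: sum the values, depth is one more than the deeper input."""
    return (a[0] + b[0], 1 + max(a[1], b[1]))


def chain(pps):
    """Addition order of FIG. 1: ((((PA + PB) + PC) + PD) + PE)."""
    acc = pps[0]
    for pp in pps[1:]:
        acc = add(acc, pp)
    return acc


def tree(pps):
    """Addition order of FIG. 4: ((PA + PB) + PC) + (PD + PE)."""
    pa, pb, pc, pd, pe = pps
    s41 = add(pa, pb)
    s42 = add(s41, pc)
    s43 = add(pd, pe)
    return add(s42, s43)


if __name__ == "__main__":
    pps = [(v, 0) for v in (7, -28, 112, 0, -1792)]   # arbitrary weighted values
    print(chain(pps), tree(pps))
```

Both orders give the same sum; in this model the chain has four adders on its longest path and the tree has three.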
The parallel multiplying circuits with the arrangements shown in FIGS. 3 and 5 have a drawback such that the number of registers becomes large similarly to the arrangement of FIG. 2.