The present invention relates to a parallel multiplier and, more particularly, to a parallel binary multiplier using a modified Booth's algorithm, a skip array, and a modified Wallace tree.
The parallel binary multiplier has been widely employed in various systems such as an ALU (Arithmetic Logic Unit) of high-performance computers, a facsimile telegraph, a digital signal processing system, and a matrix multiplier, and also in special-purpose chips. Accordingly, many methods have been proposed to reduce the chip area and improve the operation speed of the parallel binary multiplier. For example, it is well known that the multiplication speed of the parallel multiplier can be considerably improved by using the modified Booth's algorithm, as disclosed in "COMPUTER ARITHMETIC" (John Wiley & Sons Co., pp. 129-212, 1979) and "NIKKEI ELECTRONICS" (pp. 76-89, May 29, 1978).
Conventional parallel multipliers are based on various algorithms and techniques. Among the many multipliers that have been proposed, the general multipliers with superior performance are divided into two kinds, both of which produce n/2 partial product lines in their initial step by the modified Booth's algorithm, where n is the number of bits of each of the two inputs, the multiplier Y and the multiplicand X. The most essential part of such parallel multipliers is a multi-operand addition circuit which adds the n/2 partial product lines to each other and reduces them to two lines. To realize this adder circuit, either a full adder array or a Wallace tree is employed.
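As an illustration of how the modified Booth's algorithm yields n/2 partial product lines, the following is a minimal software sketch of the common radix-4 form of the recoding. It is not taken from the cited references; the function names `booth_radix4_digits` and `booth_partial_products` are hypothetical, and the operands are assumed to be n-bit two's-complement values with n even.

```python
def booth_radix4_digits(y, n):
    """Recode an n-bit two's-complement multiplier y into n/2 signed digits.

    Each digit is taken from three overlapping bits (y[2i+1], y[2i], y[2i-1])
    and lies in {-2, -1, 0, 1, 2}. Assumes n is even.
    """
    assert n % 2 == 0
    y &= (1 << n) - 1              # keep only the n-bit pattern
    digits = []
    prev = 0                       # implicit bit y[-1] = 0
    for i in range(0, n, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_partial_products(x, y, n):
    """Return the n/2 partial product lines whose sum equals x * y."""
    return [(d * x) * (4 ** i) for i, d in enumerate(booth_radix4_digits(y, n))]


# Two 16-bit inputs give eight partial product lines.
lines = booth_partial_products(1234, -567, 16)
print(len(lines), sum(lines) == 1234 * -567)
```

Each digit only requires selecting 0, X, or 2X and an optional negation, which is why halving the number of lines costs little extra hardware per line.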
The parallel multiplier using the array is arranged in a two-dimensional array structure composed of full adder cells. In this kind of multiplier, the outputs of the cells in the present line are sequentially inputted to the cells in the next line. Thus, this kind of multiplier has a delay time complexity of O(n) and is inherently slow in multiplication.
FIG. 1 shows a schematic overall structure of a conventional parallel multiplier using the array. In FIG. 1, a 16-bit multiplicand X is provided to eight line adder cells CL1, CL2, CL3, . . . , CL8, and a 16-bit multiplier Y is provided to a modified Booth's encoder MBE. The modified Booth's encoder MBE encodes the 16-bit multiplier Y according to the modified Booth's algorithm and provides the encoded outputs to the eight line adder cells CL1, CL2, CL3, . . . , CL8, where each encoded output is a 3-bit signal.
The first to eighth line adder cells CL1 to CL8 each add the multiplicand X in accordance with the corresponding encoded output of the modified Booth's encoder MBE. The first line adder cell CL1 provides its output to the second line adder cell CL2, where it is added to the value of the second line adder cell CL2. The result is passed on in the same way from line to line, and the accumulated value is finally provided to a fast adder FAD. For the multiplication of complements, four bits, namely the two least significant bits and their complements, are provided to the fast adder FAD from each line adder cell. Thus, the final result of the fast adder FAD is a 2n-bit value. In this parallel multiplier, the outputs of each line are sequentially provided to the next line, as mentioned above.
Consequently, the multiplication time increases in proportion to the number of bits of the inputs. Thus, this multiplier is not suitable for high-speed multiplication, even though it is easily applied where the number of bits is small, a low speed is acceptable, and a small chip area is required.
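The line-by-line accumulation described above can be modelled in software as a chain of carry-save additions followed by one final addition. The sketch below is only an illustration under stated assumptions: the names `carry_save_add`, `array_reduce`, and `array_multiply` are hypothetical, the operands are treated as unsigned, and simple unsigned radix-4 rows stand in for the Booth-encoded lines of FIG. 1 so that the block stays self-contained.

```python
def carry_save_add(a, b, c):
    """Bitwise full adders: reduce three operands to a sum word and a carry word."""
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s, carry


def array_reduce(rows):
    """Feed the rows through the adder lines one after another.

    Every row waits for the line above it, so the number of sequential
    stages equals the number of rows.
    """
    s, c, stages = 0, 0, 0
    for row in rows:
        s, c = carry_save_add(s, c, row)
        stages += 1
    return s, c, stages


def array_multiply(x, y, n):
    """Unsigned n-bit multiply with n/2 rows (stand-in for the Booth lines)."""
    rows = [(((y >> i) & 3) * x) << i for i in range(0, n, 2)]
    s, c, stages = array_reduce(rows)
    return s + c, stages       # the final s + c models the fast adder


product, stages = array_multiply(40503, 51966, 16)
print(product == 40503 * 51966, stages)
```

In this model the stage count grows linearly with the number of rows, which corresponds to the proportional growth of the multiplication time noted above.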
On the other hand, the parallel multiplier using the Wallace tree has a faster operation time of O(log n), but it requires a large chip area and has an irregular structure. Thus, the Wallace tree is not suitable where a small chip area and a low cost are required. Further, as shown in FIG. 6, a carry output is provided after one gate delay whereas a sum output is provided after two gate delays, since in general CMOS or NMOS circuits the sum is obtained by using the carry output. The carry output, although provided earlier, therefore remains in a standby state until the sum is provided and is not added immediately. FIG. 2 shows a schematic overall structure of a conventional multiplier using the Wallace tree, and FIG. 5 shows a schematic structure of the Wallace tree.
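To make the logarithmic depth concrete, the following minimal sketch reduces a set of partial product lines to two lines by grouping them in threes at every level, in the manner generally attributed to the Wallace tree. The names `carry_save_add` and `wallace_reduce` are hypothetical, the lines are assumed to be non-negative integers, and the model counts reduction levels only; it does not model the gate-level carry and sum timing of FIG. 6.

```python
def carry_save_add(a, b, c):
    """Bitwise full adders: reduce three operands to a sum word and a carry word."""
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s, carry


def wallace_reduce(rows):
    """Reduce the rows to two lines, three-to-two per group, level by level.

    Returns the two final lines and the number of levels used.
    """
    rows = list(rows)
    levels = 0
    while len(rows) > 2:
        full = len(rows) - len(rows) % 3    # rows that form complete groups
        nxt = []
        for i in range(0, full, 3):
            s, c = carry_save_add(*rows[i:i + 3])
            nxt += [s, c]
        nxt += rows[full:]                  # leftover rows pass through
        rows = nxt
        levels += 1
    while len(rows) < 2:
        rows.append(0)
    return rows[0], rows[1], levels


lines = [3, 5, 7, 11, 13, 17, 19, 23]       # eight arbitrary example lines
a, b, levels = wallace_reduce(lines)
print(a + b == sum(lines), levels)
```

In this model eight lines are reduced in four levels and sixteen lines in six, so the depth grows roughly with the logarithm of the number of lines. The irregular grouping of leftover rows hints at why the wiring of such a tree is less regular than that of an array.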
In both the multiplier using the array and the multiplier using the Wallace tree, the final step is to add the two final lines. Both kinds of conventional multipliers still have problems associated with the multiplication speed and the chip area. Thus, there remains a need for a more effective multiplier which can improve the multiplication speed and reduce the chip area.