1. Technical Field of the Invention
This invention relates to digital processing logic circuits, and more particularly relates to a low power multiplier circuit.
2. Background Art
Digital processing circuits are being designed to operate at lower and lower supply voltages. This is being driven by various forces, including consumer demand for portable personal computers and ever decreasing device dimensions in integrated circuits.
To retain desired performance, or speed, in multiplier circuits as supply voltages decrease, it is desirable to exploit parallelism in the multiplier architecture. Parallel multipliers include, e.g., array multipliers and Wallace-tree multipliers. Parallel multiplier architectures tend to operate at higher speed than non-parallel multipliers. Unfortunately, parallel multipliers also usually dissipate a large amount of power during operation. As a general matter, array multipliers tend to have lower performance (slower speed) and consume more power, as compared with Wallace-tree multipliers.
One of the major sources of power dissipation in parallel multipliers is the large number of spurious logic transitions that occur at the internal nodes of such multipliers. Such multipliers are typically implemented in the form of some kind of logic array in which multiple additions of intermediate values, such as partial products and partial sums of partial products, may be performed, including the addition of carry products at various places throughout the array. As the intermediate values propagate through the circuit, the logic states of the various logic gates, such as adders, may change, sometimes many times, before the final state of the inputs of such logic gates is finally resolved. This is discussed in, e.g., "Analysis and Reduction of Glitches in Synchronous Networks," by J. Leijten, et al., European Design and Test Conf., Dig. Tech. papers, pp. 398-403, March 1995. Those authors suggest deploying flipflops in the circuit, which are clocked at the same time to deliver their outputs together, as an approach to reduce spurious transitions.
Another approach to reducing such spurious transitions is suggested in "A Low Power 16 by 16 Multiplier Using Transition Reduction Circuitry," by C. Lemonds, et al., Intl. Workshop on L/P Design, Dig. Tech. papers, pp. 139-142, April 1994, in conjunction with multipliers including Booth encoders. As is known, a Booth encoder applies logic to the inputs of a multiplier that reduces the number of partial products required to be created in the array. Those authors propose putting latches on the outputs of the Booth encoder portion of a multiplier. The latches are then clocked in a precise sequence so as to deliver the encoded inputs to the sequential stages within the array more closely in time with the respective carry and sum output signals from the previous adder/multiplexer stage in the array with which the encoded inputs are to be combined.
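By way of illustration only, the reduction in partial products obtained by Booth encoding can be sketched in software. The sketch below assumes radix-4 (modified) Booth recoding and an even-width two's-complement multiplier; the cited reference is not stated here to use this particular form, and the function names are hypothetical.

```python
def booth_radix4_digits(multiplier, bits):
    """Recode a two's-complement multiplier into radix-4 Booth digits.

    Assumption: `bits` is even and `multiplier` fits in `bits` bits.
    Each digit lies in {-2, -1, 0, 1, 2}; there are bits // 2 digits,
    so half as many partial products as a bit-per-row scheme.
    """
    if bits % 2:
        raise ValueError("bits must be even")
    pattern = multiplier & ((1 << bits) - 1)  # two's-complement bit pattern
    extended = pattern << 1                   # implicit 0 below the LSB
    digits = []
    for i in range(0, bits, 2):
        triplet = (extended >> i) & 0b111     # bits y[i+1], y[i], y[i-1]
        low = triplet & 1
        mid = (triplet >> 1) & 1
        high = (triplet >> 2) & 1
        digits.append(low + mid - 2 * high)
    return digits


def booth_multiply(multiplicand, multiplier, bits):
    """Sum one partial product per Booth digit, weighted by powers of 4."""
    digits = booth_radix4_digits(multiplier, bits)
    return sum(d * multiplicand * 4 ** k for k, d in enumerate(digits))
```

For an 8-bit multiplier this yields four digits, and hence four partial products rather than eight.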
However, both of the aforementioned approaches present problems. For example, in the Leijten, et al., approach the numerous flipflops introduce additional delay in the form of the propagation delay of the flipflop itself, multiplied by the number of stages in which the flipflops are deployed. In addition, the flipflops take up valuable integrated circuit area. As for the Lemonds, et al., approach, the clock signal must be delivered to the multiplier circuit, requiring additional wiring into the circuit, and the clock timing must be controlled precisely to produce the desired result. In addition, the latches themselves consume power, which tends to defeat the very purpose for which they are used, although in some applications the net result can be an improvement in power dissipation. Also, the latches take up integrated circuit area.
Thus, it is desired to have a multiplier circuit employing parallel architecture that provides good performance at low power. The present invention provides just such a multiplier.
In accordance with the principles of the present invention, there is provided, according to a first embodiment, a digital multiplier for multiplying a plurality of multiplicand signals representing a multiplicand and a plurality of multiplier signals representing a multiplier. In it, a plurality of intermediate result signals are generated from the multiplicand signals and from the multiplier signals. A plurality of adder circuits for adding the intermediate result signals are provided to generate a plurality of final result signals representing the result of multiplying the multiplicand and the multiplier, wherein at least some of the adder circuits receive at the inputs thereof at least two signals representing intermediate addition results. Finally, a plurality of delay elements are placed in selected signal lines so as to delay the arrival of at least one of the signals representing intermediate addition results to the at least some of the adder circuits so as to synchronize the arrival of the signals input to the at least some of the adder circuits.
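The effect of synchronizing input arrival can be illustrated with a minimal discrete-time sketch. It assumes an idealized two-input exclusive OR gate with no internal delay and signals sampled at unit time steps; the waveform helpers and the chosen arrival times are hypothetical and are not taken from the embodiments described herein.

```python
def step(arrival_time, length):
    """Signal that is 0 before arrival_time and 1 from then on."""
    return [1 if t >= arrival_time else 0 for t in range(length)]


def delay(waveform, steps):
    """Delay element: shift a waveform later, holding its initial value."""
    return [waveform[0]] * steps + waveform[:len(waveform) - steps]


def xor_output(a, b):
    """Idealized zero-delay XOR gate applied sample by sample."""
    return [x ^ y for x, y in zip(a, b)]


def count_transitions(waveform):
    """Number of logic-state changes along a waveform."""
    return sum(1 for prev, cur in zip(waveform, waveform[1:]) if prev != cur)


early = step(1, 8)  # input that resolves at t = 1
late = step(4, 8)   # input that resolves at t = 4

# Unsynchronized inputs: the output rises and then falls (two spurious transitions).
skewed = count_transitions(xor_output(early, late))

# Delaying the early input by the skew removes both transitions.
aligned = count_transitions(xor_output(delay(early, 3), late))
```

In this sketch the final output value is the same in both cases; only the number of intermediate transitions, and hence the switching activity, differs.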
In accordance with a second embodiment of the present invention there is provided a Wallace-tree multiplier for multiplying a plurality of multiplicand signals representing a multiplicand and a plurality of multiplier signals representing a multiplier. A plurality of partial product signals are generated from the multiplicand signals and the multiplier signals. Also provided are a plurality of adder circuits for adding the partial product signals to generate result signals representing the result of multiplying the multiplicand and the multiplier, arranged in a Wallace-tree configuration, at least some of the adder circuits being a (4:2) counter circuit. The (4:2) counter circuit includes a first three-input adder circuit generating as outputs a first sum signal and a first carry-out signal and receiving as inputs three of the four inputs to the four-input adder circuit, and also includes a delay element receiving as an input the fourth of the four inputs to the four-input adder circuit and providing as an output the signal applied to its input but delayed by a predetermined time interval. Finally, the (4:2) counter includes as well a second three-input adder circuit generating as outputs a second sum signal and a second carry-out signal, receiving as inputs a carry-in signal, the first sum signal and the output signal of the delay element. The predetermined time interval is selected so as to delay the arrival of the fourth input to the four-input adder circuit to the second three-input adder circuit by a time selected so as to cause the fourth input signal to arrive at the second three-input adder circuit closer in time to the time the other two inputs of the second three-input adder circuit arrive at the second three-input adder circuit.
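The (4:2) counter arrangement just described can be sketched at the logic level as follows. This is a behavioral sketch only: the delay element affects timing and not logic value, so it appears as a pass-through in the logic function. The separate timing helper rests on assumptions not stated above, namely that all four inputs arrive together, that each three-input adder has a single uniform delay, and that the carry-in signal is produced by the first adder of a neighboring counter. All names are hypothetical.

```python
from itertools import product


def full_adder(a, b, c):
    """Three-input adder: returns (sum, carry_out)."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def counter_4_2(x1, x2, x3, x4, carry_in):
    """Logic-level (4:2) counter: returns (sum, carry, carry_out)."""
    first_sum, first_carry_out = full_adder(x1, x2, x3)
    delayed_x4 = x4  # delay element: same value, later arrival
    second_sum, second_carry_out = full_adder(carry_in, first_sum, delayed_x4)
    return second_sum, second_carry_out, first_carry_out


def second_adder_skew(adder_delay, element_delay, input_time=0.0):
    """Spread of arrival times at the second adder (assumed timing model)."""
    first_sum_time = input_time + adder_delay
    carry_in_time = input_time + adder_delay  # assumed: from a neighboring first adder
    x4_time = input_time + element_delay
    arrivals = (first_sum_time, carry_in_time, x4_time)
    return max(arrivals) - min(arrivals)


# The outputs always encode the number of ones among the five inputs.
for bits in product((0, 1), repeat=5):
    s, c, cout = counter_4_2(*bits)
    assert sum(bits) == s + 2 * (c + cout)
```

Under these assumptions, `second_adder_skew(1.0, 0.0)` gives a skew of one adder delay, while `second_adder_skew(1.0, 1.0)` gives zero, which corresponds to choosing the predetermined time interval approximately equal to the delay of the first three-input adder circuit.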
In accordance with a third embodiment of the present invention there is provided a full adder circuit receiving three inputs and providing a sum output signal and a carry output signal. The full adder circuit includes a three input exclusive OR logic element for generating the sum output signal provided at an output thereof, and a three input majority selector logic element for generating the carry output signal provided at an output thereof. The three input exclusive OR logic element and the three input majority selector are made of pass gate field effect transistor devices, arranged so as to perform the exclusive OR function and the majority selection function, respectively, and also arranged such that the same number of pass gate field effect devices are disposed between the inputs and said outputs in the three input exclusive OR logic element and in the three input majority selector logic element.
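The logic functions of the full adder circuit can be checked with a short behavioral sketch. It models only the Boolean behavior of the exclusive OR and majority selector elements, not the pass gate transistor arrangement or its balanced path lengths; the function names are hypothetical.

```python
from itertools import product


def xor3(a, b, c):
    """Three-input exclusive OR: generates the sum output."""
    return a ^ b ^ c


def majority3(a, b, c):
    """Three-input majority selector: 1 when at least two inputs are 1."""
    return (a & b) | (b & c) | (a & c)


def full_adder(a, b, c):
    """Returns (sum, carry) for three one-bit inputs."""
    return xor3(a, b, c), majority3(a, b, c)


# Exhaustive check: sum + 2 * carry equals the arithmetic sum of the inputs.
for a, b, c in product((0, 1), repeat=3):
    s, carry = full_adder(a, b, c)
    assert a + b + c == s + 2 * carry
```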
These and other features of the invention will be apparent to those skilled in the art from the following detailed description of the invention, taken together with the accompanying drawings.