Along with the development of higher-speed processors, a corresponding demand has evolved for higher-speed parallel multipliers to enhance image signal and digital signal processing. As sub-micron geometry technologies have matured and material processes have been refined, practical high-speed multiplier structures using a variety of algorithmic approaches have been realized and implemented in VLSI circuits. Higher-speed multipliers are particularly advantageous in Modularly Configured Attached Processors (MCAPs) using Multichip Modules (MCMs).
Optimality for n-bit by n-bit integer multiplication is defined by the $AT^2$ measure of complexity, where $A$ is the area of the multiplier chip and $T$ is the computation time, that is, the total propagation delay from the n-bit multiplicand and n-bit multiplier inputs to the 2n-bit product output. Any multiplier of two n-bit integers must satisfy the lower bounds $AT^2 = \Omega(n^2)$ and $A = \Omega(n)$. Fan-in constraints of VLSI logic gates impose a lower bound on computation time of $T = \Omega(\log_2 n)$, so that an $AT^2$-optimal multiplier may exist only for $T$ in the range from $\log_2 n$ to $n^{1/2}$.
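The range quoted for the computation time follows from combining the area and time bounds. The short derivation below is offered only as a sketch: it reads the stated bounds as lower bounds (written here with $\Omega$) and assumes that an $AT^2$-optimal design is one that meets $AT^2 = \Theta(n^2)$.

```latex
\begin{align*}
A T^{2} = \Theta(n^{2}),\quad A = \Omega(n)
  &\;\Longrightarrow\; T^{2} = \Theta\!\left(\frac{n^{2}}{A}\right) = O(n)
  \;\Longrightarrow\; T = O(n^{1/2}),\\
\text{bounded gate fan-in}
  &\;\Longrightarrow\; T = \Omega(\log_{2} n),\\
\text{hence}\quad \log_{2} n \;\lesssim\; T &\;\lesssim\; n^{1/2}.
\end{align*}
```

The fan-in step reflects that a product bit may depend on all 2n input bits, so a network of gates with bounded fan-in needs a depth that grows at least logarithmically in n.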
Designs that use Discrete Fourier Transforms (DFTs) to compute convolutions have realized $AT^2 = O(n^2 (\log_2 n)^3)$, with propagation delay times in the range $T = O((\log_2 n)^2)$ to $T = O(n^{1/2})$. None of these designs, however, attains the optimal $T = O(\log_2 n)$. The Wallace tree and Dadda counting algorithms achieve optimal computation time, but are impractical for VLSI design. Divide-and-conquer techniques combined with redundant operand representations have achieved optimal computation time with $AT^2 = O(n^2 (\log_2 n)^2)$. It is therefore an object of the present invention to provide a multiplier circuit having optimal computation time and amenability to VLSI design.
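By way of illustration only, the logarithmic computation time attributed above to the Wallace tree can be seen in a small word-level software model. The sketch below is not the circuit of the present invention and is not a gate-level or layout model; the function names `carry_save_add` and `wallace_multiply` are hypothetical, and the operands are assumed to be non-negative integers that fit in n bits. Each reduction stage passes groups of three partial-product rows through a carry-save (3:2) adder, so the row count shrinks by roughly a factor of 2/3 per stage.

```python
def carry_save_add(a, b, c):
    """Word-level 3:2 compressor: returns (s, cy) with a + b + c == s + cy."""
    s = a ^ b ^ c                              # bitwise sum without carries
    cy = ((a & b) | (a & c) | (b & c)) << 1    # majority bits, shifted into place
    return s, cy


def wallace_multiply(x, y, n):
    """Multiply two non-negative n-bit integers by Wallace-style reduction.

    Returns (product, stages), where stages is the number of carry-save
    reduction levels applied before the final two-operand addition.
    """
    # One partial-product row per multiplier bit.
    rows = [(x << i) if (y >> i) & 1 else 0 for i in range(n)]
    stages = 0
    while len(rows) > 2:
        next_rows = []
        full = len(rows) - len(rows) % 3       # rows covered by complete triples
        for i in range(0, full, 3):
            next_rows.extend(carry_save_add(*rows[i:i + 3]))
        next_rows.extend(rows[full:])          # leftover rows pass through unchanged
        rows = next_rows
        stages += 1
    # The final carry-propagate addition is modeled here by Python's integer sum.
    return sum(rows), stages


if __name__ == "__main__":
    for n in (8, 16, 32, 64):
        product, stages = wallace_multiply((1 << n) - 1, (1 << n) - 1, n)
        assert product == ((1 << n) - 1) ** 2
        print(n, stages)
```

In this model the stage count is 4, 6, 8 and 10 for n = 8, 16, 32 and 64, so each doubling of the operand width adds only two reduction levels. The model says nothing about wiring area or layout regularity, which is where the impracticality for VLSI noted above arises.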