This application claims priority to Ser. No. 98402452.1, filed in Europe on Oct. 6, 1998 (TI-27757EU) and Ser. No. 98402455.4, filed in Europe on Oct. 6, 1998 (TI-28433EU).
1. Field of the Invention
This invention relates generally to multiplier and multiplier/accumulator circuits and, more particularly, to improved multiplier and multiplier/accumulator circuits which implement the modified Booth's algorithm and Wallace tree techniques.
2. Background of the Invention
Binary multiplication is an important function in many digital signal processing applications. Some applications further require arithmetically combining a product with the results of previous operations (e.g., forming a sum of products). A versatile multiplier circuit must have the capability to perform these functions in either a two's complement or an unsigned magnitude notation.
Binary numbers are multiplied very much like decimal numbers. More particularly, each digit of one operand (multiplicand) is multiplied by each digit of the other operand (multiplier) to form partial products and these resulting partial products are then added, taking into account the multiplier digit position place significance.
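By way of illustration only, the partial-product procedure described above may be sketched in software for unsigned operands. The function name `shift_add_multiply` is hypothetical and does not appear in the cited art; the sketch models the arithmetic, not any particular circuit.

```python
def shift_add_multiply(multiplicand, multiplier):
    """Multiply two non-negative integers by summing shifted partial products."""
    if multiplicand < 0 or multiplier < 0:
        raise ValueError("this sketch handles unsigned operands only")
    product = 0
    position = 0
    # Visit each multiplier bit, least significant first.
    while multiplier >> position:
        bit = (multiplier >> position) & 1
        # The partial product is the multiplicand (or zero), weighted by
        # the place significance 2**position of the multiplier bit.
        partial_product = (multiplicand * bit) << position
        product += partial_product
        position += 1
    return product
```

An N-bit multiplier produces up to N partial products in this scheme, which is the count that the encoding techniques discussed below seek to reduce.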
Circuits for multiplying binary numbers require a relatively large number of circuit elements and thus take up a fair amount of chip area when fabricated on an integrated circuit. For this reason, an ongoing goal of integrated circuit designers is to find ways to implement a multiplier circuit with fewer and fewer circuit elements.
Many techniques are known in the art for reducing the time required to perform a binary multiplication. For example, different encoding methods have been devised to reduce the number of partial products which must be added to form the final product, and other methods have been devised to speed up the addition of the partial products. See, for example, "A Suggestion for a Fast Multiplier," C. S. Wallace, IEEE Trans. on Electr. Computers, 1964, and "A Signed Binary Multiplication Technique," Andrew D. Booth, Quart. Journal Mech. and Applied Math., Vol. IV, part 2, 1951. The modified Booth algorithm described in the Booth paper is in widespread use and is often used in digital multipliers in integrated circuits.
In more detail, the so-called modified Booth encoding technique encodes one of the two numbers being multiplied. This approach reduces, usually by a factor of two, the number of partial products generated by the multiplier, thereby reducing the amount of circuitry needed to combine the partial products in arriving at the final product. Unfortunately, the fact that signed binary numbers are typically represented using two's complement notation (at least when being operated on arithmetically) significantly impacts the above-described advantage of modified Booth encoding because of the need to perform so-called sign-bit extension of the partial products before they can be combined.
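By way of illustration only, the halving of the partial-product count may be sketched with the commonly taught radix-4 form of modified Booth recoding, in which overlapping groups of three multiplier bits are mapped to digits in the set {-2, -1, 0, 1, 2}. The function names and the restriction to an even bit width are assumptions made for this sketch.

```python
def booth_radix4_digits(y, n_bits):
    """Recode an n_bits two's complement multiplier into radix-4 Booth digits.

    Returns n_bits // 2 digits, least significant first, each in {-2,...,2}.
    """
    if n_bits % 2:
        raise ValueError("this sketch assumes an even bit width")
    if not -(1 << (n_bits - 1)) <= y < (1 << (n_bits - 1)):
        raise ValueError("multiplier does not fit in n_bits")
    pattern = y & ((1 << n_bits) - 1)   # two's complement bit pattern
    extended = pattern << 1             # append the implicit bit y[-1] = 0
    digits = []
    for i in range(0, n_bits, 2):
        triplet = (extended >> i) & 0b111   # bits y[i+1], y[i], y[i-1]
        b_hi = (triplet >> 2) & 1
        b_mid = (triplet >> 1) & 1
        b_lo = triplet & 1
        digits.append(-2 * b_hi + b_mid + b_lo)
    return digits


def booth_multiply(x, y, n_bits):
    """Form x * y from one partial product per Booth digit of y."""
    total = 0
    for k, digit in enumerate(booth_radix4_digits(y, n_bits)):
        # Each partial product is 0, +/-x or +/-2x, weighted by 4**k.
        total += (digit * x) << (2 * k)
    return total
```

For an 8-bit multiplier the sketch yields four digits, and hence four partial products, instead of eight. The partial products may be negative, which is the origin of the sign-bit extension problem noted above.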
U.S. Pat. No. 5,038,315 to Rao describes a way to eliminate the need to perform sign-bit extension in order to combine the partial products by representing the value represented by the sign bits of all the partial products as a two's complement number. The bits of that number (referred to as the "sign-bit-value" word), rather than the original sign bits, are then used in the partial product addition. Since (as with all two's complement numbers) all the bits of the sign-bit-value word are guaranteed to have positive significance (except for the left-most one), the digits of the partial products can then be directly added without the need for sign bit extension. Implementation of this approach requires significantly less circuit area (as much as 20 percent less) than previously known multipliers.
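By way of illustration only, the arithmetic identity underlying such an approach may be sketched as follows. This is not the circuit of the Rao patent; it merely shows that the combined weight of the sign bits can be gathered into a single two's complement word and added once. The function names, the (value, shift) representation of a partial product, and the parameters `width` and `out_bits` are assumptions made for this sketch.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sum_without_sign_extension(partials, width, out_bits):
    """Sum shifted two's complement partial products without sign extension.

    partials: iterable of (signed_value, shift) pairs, each value fitting in
    `width` bits. out_bits must be wide enough to hold the true sum.
    """
    low_mask = (1 << (width - 1)) - 1
    out_mask = (1 << out_bits) - 1
    magnitude_sum = 0   # sum of the non-sign bits, all of positive weight
    sign_value = 0      # combined (negative) weight of all the sign bits
    for value, shift in partials:
        pattern = value & ((1 << width) - 1)
        sign_bit = pattern >> (width - 1)
        magnitude_sum += (pattern & low_mask) << shift
        sign_value -= sign_bit << (shift + width - 1)
    # Encode the combined sign weight as one two's complement word.
    sign_word = sign_value & out_mask
    return to_signed(magnitude_sum + sign_word, out_bits)
```

For example, with 4-bit partial products -5, 3 and -8 at shifts of 0, 2 and 4, the sketch returns -121, the same result obtained by sign-extending each partial product to the full output width before adding.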
Attempts have also been made to speed up the summation of the partial products. In U.S. Pat. No. 4,545,028 to Ware the adder array is divided into blocks so that different blocks can perform different parts of the addition in parallel, even though all of the addition within each block is done in ripple fashion. The first block can only contain four partial products and the remaining blocks must match an arithmetic progression so that carries from one block appear when needed by the next block.
Summation can also be sped up through the use of carry look-ahead adders. The propagation of carries through a sequential series of adder stages in ripple fashion requires a period of time that grows with the number of bits in the addends. In a carry look-ahead adder, logic circuitry provides concurrent rather than sequential carry propagation. However, the bit size (or number of bits) of a carry look-ahead adder is limited because the circuit complexity, gate count and chip area rapidly increase as bit size increases.
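By way of illustration only, the look-ahead principle may be sketched using the conventional generate and propagate signals, with each carry written as a flat expression of those signals and the carry-in rather than as a function of the preceding carry. The function name `cla_add` and the default width of four bits are assumptions made for this sketch.

```python
def cla_add(a, b, n_bits=4, carry_in=0):
    """Add two unsigned n_bits numbers; returns (sum, carry_out)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n_bits)]    # generate
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n_bits)]  # propagate
    carries = [carry_in]
    for i in range(n_bits):
        # Carry into bit i+1, expanded so that it depends only on g, p and
        # carry_in: g[i] | p[i]g[i-1] | ... | p[i]...p[0]carry_in.
        carry = 0
        for j in range(i, -1, -1):
            term = g[j]
            for k in range(j + 1, i + 1):
                term &= p[k]
            carry |= term
        term = carry_in
        for k in range(i + 1):
            term &= p[k]
        carry |= term
        carries.append(carry)
    total = 0
    for i in range(n_bits):
        total |= (p[i] ^ carries[i]) << i
    return total, carries[n_bits]
```

In hardware every expanded carry term can be evaluated concurrently, but the number and width of the product terms grow with each bit position, which reflects the growth in gate count noted above.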
Circuits which multiply two numbers and sum or accumulate the resulting product with a third number are widely used in signal processing and digital signal processors (DSPs). A typical application of a multiplier/accumulator is the implementation of a finite impulse response (FIR) digital signal filter which sums N products to obtain a sample value at a predetermined time, where N is an integer. A primary objective in performing multiplications and accumulations is to accomplish the mathematical calculation as quickly as possible. However, an increase in speed typically involves an increase in the amount of circuitry and a corresponding increase in the irregularity of structure.
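By way of illustration only, the sum of N products formed by a FIR filter may be sketched as a repeated multiply/accumulate operation. The function names and the zero-initialized sample history are assumptions made for this sketch.

```python
def fir_sample(coeffs, samples):
    """Form one output sample as a sum of N products, one MAC per tap."""
    acc = 0
    for c, x in zip(coeffs, samples):
        acc += c * x   # one multiply/accumulate operation
    return acc


def fir_filter(coeffs, signal):
    """Filter a sequence, assuming the sample history starts at zero."""
    history = [0] * len(coeffs)
    out = []
    for x in signal:
        history = [x] + history[:-1]   # newest sample first
        out.append(fir_sample(coeffs, history))
    return out
```

Each output sample costs N multiply/accumulate operations, so the throughput of the filter is governed directly by the speed of the multiplier/accumulator.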
Various attempts to increase the speed of an array multiplier have been made. Stylianos Pezaris in an article entitled "A 40-ns 17-Bit by 17-Bit Array Multiplier" in IEEE Transactions on Computers, Vol. C-20, No. 4, April 1971, pp. 442-447, teaches the reduction of propagation of sum signals in an array multiplier. For a conventional multiplier, N rows of adders are required for an N-bit by N-bit multiplier to implement a multiplication in a conventional carry save scheme.
Others have skipped both sum and carry signals over alternate rows of adders in a multiplier array, as taught by Iwamura et al. in "A 16-Bit CMOS/SOS Multiplier-Accumulator" in IEEE International Conference on Circuits and Computers, Sep. 29, 1982, pp. 151-154. Iwamura et al. described a multiplier which utilizes a row skipping technique of carry and sum signals. The skipping technique is used with a conventional array multiplier rather than other methods such as Wallace tree or Booth's method because of the complicated interconnections and irregularity of structure associated with these other methods. However, by skipping carry and sum signals over the next row, the array is effectively divided into two separate arrays, each of which provides a sum and a carry accumulation. At the bottom of the array, two combining rows of adders (not shown by Iwamura et al.) are required. The combining rows reduce the four outputs (two sums and two carries) of the two separate accumulator paths to two outputs (one sum and one carry) for carry propagation in a final row. A final row of carry look-ahead adders is required to provide the output product.
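By way of illustration only, the reduction of several addends to one sum word and one carry word, followed by a single carry-propagating addition, may be sketched with conventional 3:2 carry-save adders. The function names are hypothetical, and the grouping of words into threes is one simple choice for this sketch rather than the arrangement of any cited reference.

```python
def csa(a, b, c):
    """3:2 carry-save adder on non-negative integers.

    Returns (sum_word, carry_word) with a + b + c == sum_word + carry_word;
    no carry propagates between bit positions.
    """
    sum_word = a ^ b ^ c
    carry_word = ((a & b) | (a & c) | (b & c)) << 1
    return sum_word, carry_word


def reduce_to_two(words):
    """Reduce a list of non-negative addends to at most two words."""
    words = list(words)
    while len(words) > 2:
        next_level = []
        full_groups = len(words) // 3
        for group in range(full_groups):
            s, c = csa(*words[3 * group:3 * group + 3])
            next_level += [s, c]
        next_level += words[3 * full_groups:]   # leftovers pass through
        words = next_level
    return words
```

Four addends, such as two sums and two carries, are reduced to two words in two levels of 3:2 adders; only the final addition of those two words requires carry propagation.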
U.S. Pat. No. 5,504,915 to Rarick provides a modified Wallace-Tree adder for use in a binary multiplier.
Other approaches for multiplier accumulator circuits are provided by U.S. Pat. No. 4,575,812 to Kloker et al., U.S. Pat. No. 4,876,660 to Owen et al., and U.S. Pat. No. 4,831,577 to Wei et al.
U.S. Pat. No. 4,771,379 to Ando et al. provides a digital signal processor with parallel multipliers.
Accordingly, it is a principal aspect of the present invention to provide a circuit and method for fast generation of and parallel summation of partial products with minimum power, complexity, and space in an integrated circuit.
It is another aspect of the present invention to provide an improved, high-speed multiplier accumulator architecture adapted to provide accumulation and adapted to handle either signed or unsigned values.
It is yet another aspect of the present invention to provide high-speed binary multiplication with a parallel adder architecture which can be implemented with standard IC technology.
It is also an aspect of the present invention to provide a circuit employing a plurality of multiplier accumulators for improved multiplication and arithmetical processing.
It is a further aspect of the present invention to provide an improved high speed multiplier circuit for multiplying two numbers or multiplying two numbers and arithmetically combining the result with a third number.
The present invention provides a MAC unit, having a first binary operand X, a second binary operand Y, a third binary operand, Booth recode logic for generating a plurality of partial products from said first and second operands, a Wallace tree adder for reducing the partial products and for selectively arithmetically combining the reduced partial products with said third operand, a final adder for generating a final sum, and saturation circuitry for selectively rounding or saturating said final sum.
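By way of illustration only, the behavior of such a unit at its inputs and outputs may be sketched as follows. The sketch models only the arithmetic result and not the Booth recode logic or the Wallace tree. The function names, the 40-bit accumulator width, the rounding position at bit 15, and the control flags are all assumptions made for this sketch and are not taken from the description above.

```python
def saturate(value, bits):
    """Clamp value to the range of a signed two's complement word."""
    hi = (1 << (bits - 1)) - 1
    lo = -(1 << (bits - 1))
    return max(lo, min(hi, value))


def mac(x, y, acc, *, subtract=False, rnd=False, sat=True,
        acc_bits=40, round_bit=15):
    """Combine the product x * y with a third operand acc.

    acc_bits and round_bit are assumed values chosen for illustration.
    """
    product = x * y
    result = acc - product if subtract else acc + product
    if rnd:
        # Add half of the least significant kept bit, then clear the
        # discarded low-order bits.
        result += 1 << round_bit
        result &= ~((1 << (round_bit + 1)) - 1)
    if sat:
        result = saturate(result, acc_bits)
    return result
```

With these assumed widths, `mac(3, 4, 10)` returns 22, and accumulating a large positive product onto an accumulator already at its maximum leaves the result clamped at that maximum rather than wrapping around.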
The present invention also provides a dual MAC, having first inputs associated with a first MAC for producing a first output, second inputs associated with a second MAC for producing a second output, a first accumulator for receiving said first output, and a second accumulator for receiving said second output.
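By way of illustration only, the arrangement of two MACs feeding two accumulators may be sketched as follows. The class name `DualMac`, the attribute names, and the use of a single `step` call to model both operations taking place together are assumptions made for this sketch.

```python
class DualMac:
    """Two multiply/accumulate paths, each with its own accumulator."""

    def __init__(self):
        self.acc0 = 0   # receives the output of the first MAC
        self.acc1 = 0   # receives the output of the second MAC

    def step(self, x0, y0, x1, y1):
        """Perform one multiply/accumulate on each path."""
        self.acc0 += x0 * y0
        self.acc1 += x1 * y1
        return self.acc0, self.acc1


# Usage: form two sums of products in the same number of steps as one.
dual = DualMac()
for a, b, c, d in [(1, 2, 3, 4), (5, 6, 7, 8)]:
    dual.step(a, b, c, d)
```

After the two steps shown, the first accumulator holds 1*2 + 5*6 and the second holds 3*4 + 7*8.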
These aspects and advantages of the present invention will become more apparent from the following detailed description, when taken in conjunction with the accompanying drawings.