Multiplier circuits are found in virtually every computer, cellular telephone, and piece of digital audio/video equipment. In fact, essentially any digital device used to handle speech, stereo, image, graphics, and multimedia content contains one or more multiplier circuits. The multiplier circuits are usually integrated within microprocessor, media co-processor, and digital signal processor chips. These multipliers are used to perform a wide range of functions such as address generation, discrete cosine transforms (DCT), Fast Fourier Transforms (FFT), multiply-accumulate, etc. As such, multipliers play a critical role in processing audio, graphics, video, and multimedia data.
It is of utmost importance that a multiplier circuit be designed to operate as fast as possible. This is because vast amounts of digital data must be processed within an extremely short amount of time. For example, generating a frame's worth of data for display onto a computer screen or digital camera entails processing more than a million pixels. Often, several multiplication functions must be invoked just to rasterize a single one of these final pixel values. And for real-time applications (e.g., flight simulators, speech recognition, video teleconferencing, computer games, streaming audio/video, etc.), the overall system performance depends heavily upon the speed of its multipliers.
Unfortunately, multiplication is an inherently slow operation. Adding two numbers together requires a single add operation. In contrast, multiplication requires that each of the digits of the multiplicand be multiplied by each digit of the multiplier to arrive at the partial products. The partial products must then be added together to find the final solution. For example, 123 × 456 requires the addition of three partial products, (123 × 400) = 49200, (123 × 50) = 6150, and (123 × 6) = 738, to find the final answer of 56088. As applied to binary numbers, multiplying two 32-bit numbers would necessitate that thirty-two partial products be calculated and then thirty-two add operations be performed to add together all of the partial products to find the final solution. Thus, multiplications are relatively time-consuming.
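The partial-product procedure described above can be illustrated with a minimal sketch. The function names `partial_products` and `multiply_shift_add`, and the default 32-bit `width`, are hypothetical choices made for illustration only. The sketch assumes unsigned operands and models the arithmetic in software; it is not a description of any particular hardware multiplier.

```python
def partial_products(multiplicand: int, multiplier: int, width: int = 32) -> list[int]:
    """Return one partial product per multiplier bit (unsigned operands assumed).

    Bit i of the multiplier selects either 0 or the multiplicand shifted
    left by i positions, mirroring pencil-and-paper multiplication in base 2.
    """
    products = []
    for bit in range(width):
        if (multiplier >> bit) & 1:
            products.append(multiplicand << bit)
        else:
            products.append(0)
    return products


def multiply_shift_add(multiplicand: int, multiplier: int, width: int = 32) -> int:
    """Sum the partial products one at a time, starting from zero."""
    total = 0
    for product in partial_products(multiplicand, multiplier, width):
        total += product  # one add operation per partial product
    return total


if __name__ == "__main__":
    print(len(partial_products(123, 456)))  # 32 partial products for a 32-bit multiplier
    print(multiply_shift_add(123, 456))     # 56088
```

In this sketch the number of partial products grows in direct proportion to the width of the multiplier, which is the cost that the encoding approach discussed next seeks to reduce.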
A more efficient method for multiplying together two digital numbers entails the use of a Booth encoder/selector. The concept behind Booth encoder/selectors is to subdivide the multiplier into groups of bits. These groups are then encoded and used to select the appropriate bit patterns, thereby reducing the number of partial products. An example of a prior art Booth encoder/selector is shown in FIG. 1. Although a multiplier utilizing this prior art Booth encoder/selector is faster than a conventional multiplier, it nevertheless takes a certain amount of time for the signals to be processed by the Booth encoder/selector. For instance, this prior art Booth encoder/selector design has a critical path which takes approximately the equivalent of nine NAND gate delays to complete. The critical path is defined as the logical flow through a circuit which takes the longest time to complete. The critical path is the limiting factor for how fast a circuit can complete its processing and is used as a measure of that circuit's speed.
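The grouping-and-selection idea can be sketched in software as follows. This sketch assumes the widely used radix-4 (modified) Booth recoding, in which overlapping three-bit groups of the multiplier are each encoded as a digit in the set {-2, -1, 0, +1, +2}; the text above does not state which grouping the circuit of FIG. 1 uses, so the radix is an assumption. The function names `booth_radix4_digits` and `booth_multiply` are likewise hypothetical, and the sketch models only the arithmetic, not the gate-level encoder/selector or its critical path.

```python
def booth_radix4_digits(multiplier: int, width: int = 32) -> list[int]:
    """Encode a signed, width-bit multiplier as radix-4 Booth digits.

    Each digit is computed from bits (2i+1, 2i, 2i-1) of the two's-complement
    bit pattern, with an implied 0 below bit 0. Assumption: width is even.
    """
    if width % 2 != 0:
        raise ValueError("width must be even for radix-4 grouping")
    bits = multiplier & ((1 << width) - 1)  # two's-complement bit pattern
    digits = []
    previous = 0  # implied bit to the right of bit 0
    for i in range(0, width, 2):
        low = (bits >> i) & 1
        high = (bits >> (i + 1)) & 1
        digits.append(low + previous - 2 * high)  # value in {-2, -1, 0, 1, 2}
        previous = high
    return digits


def booth_multiply(multiplicand: int, multiplier: int, width: int = 32) -> int:
    """Select a multiple of the multiplicand per digit and sum the results."""
    total = 0
    for i, digit in enumerate(booth_radix4_digits(multiplier, width)):
        # The "selector" step: choose 0, +/-M, or +/-2M, weighted by 4**i.
        total += (digit * multiplicand) << (2 * i)
    return total


if __name__ == "__main__":
    print(len(booth_radix4_digits(456)))  # 16 partial products instead of 32
    print(booth_multiply(123, 456))       # 56088
```

Under this assumed recoding, a 32-bit multiplier yields sixteen partial products rather than thirty-two, and each partial product is obtained from the multiplicand by at most a shift and a negation.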
Some designers have attempted to shorten the critical path by optimizing the encoder section. However, an optimized encoder comes at the expense of shifting some of the computational burden onto the selector. Others have attempted to optimize the selector. Again, this comes at the expense of increasing the delay associated with the other parts of the multiplier.
Thus, what is needed is a Booth encoder/selector circuit which has an optimized critical path such that the overall speed of the multiplier is improved. The present invention provides a novel solution in which the logical design of the Booth encoder/selector is such that its critical path is upwards of twice as fast as that of typical prior art Booth encoder/selectors. Consequently, multipliers using the present invention's Booth encoder/selector design can operate at a much faster speed.