This invention relates to a cell array multiplier design whose internal connections are chosen to minimize overall delay, and thereby increase the speed of the multiplier output, for given full adder sum and carry delays.
Traditionally, array multiplier performance has been increased in one of three standard ways. First, recoding techniques, such as Booth's algorithm, have been used to reduce the number of partial products to be added within the multiplier. Reducing the partial products with Booth's algorithm reduces the number of full adder (FA) columns, as shown by U.S. Pat. No. 4,168,530. Another example using Booth's algorithm is U.S. Pat. No. 4,575,812, which reduces the propagation delay by connecting each sum to an FA within the same column and two stages beyond that sum's stage. However, recoding techniques are inefficient when a multiplier must implement both signed two's complement and unsigned multiplication, as the present invention does. To provide both signed and unsigned multiplication, the present invention uses the Baugh-Wooley multiplication algorithm as its design base.
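The way a Baugh-Wooley style array handles signed operands can be illustrated with a short behavioral sketch. It assumes a commonly used modified form of the algorithm, in which the partial product bits that involve exactly one sign bit are complemented and two constant bits are added; the text above does not state which variant the invention uses, and the function name `baugh_wooley_multiply` is hypothetical.

```python
def baugh_wooley_multiply(a, b, n):
    """Multiply two n-bit two's complement integers (n >= 2) by summing
    only non-negative partial product bits, as an array of adders would.

    Assumption: this is the common "modified" Baugh-Wooley form, not
    necessarily the variant used by the invention.
    """
    mask = (1 << n) - 1
    a_bits = [((a & mask) >> i) & 1 for i in range(n)]
    b_bits = [((b & mask) >> j) & 1 for j in range(n)]

    total = 0
    for i in range(n):
        for j in range(n):
            bit = a_bits[i] & b_bits[j]
            # A term that mixes exactly one sign bit has negative weight;
            # complementing it turns the subtraction into an addition.
            if (i == n - 1) != (j == n - 1):
                bit ^= 1
            total += bit << (i + j)

    # Constant correction bits that compensate for the complemented terms.
    total += (1 << n) + (1 << (2 * n - 1))

    # Keep 2n bits and reinterpret them as a signed result.
    total &= (1 << (2 * n)) - 1
    if total >> (2 * n - 1):
        total -= 1 << (2 * n)
    return total
```

Because every term that is summed is a plain AND (or complemented AND) of two operand bits, the same array of full adders can be used regardless of the operand signs; unsigned operation would simply skip the complementing and the constant bits.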
The second area of potential performance enhancement involves variations in circuit design intended to speed up the longest signal path within the multiplier. The longest path is generally the carry path. However, designers of multiplier arrays often lack the freedom to apply circuit design techniques, since the arrays are implemented using a predetermined set of circuits, such as those used on the DCP-C CMPhilo chip. As a result, the logical operations available for design are limited to the specified chip's set, such as a CMPhilo bookset.
The third area of performance enhancement involves variations on the interconnection scheme, such as a Wallace tree or binary tree, which are used to reduce the signal path lengths. The assumption made when using a binary tree is that carry delays through the full adder are negligible. Unfortunately, this is rarely the case. As a result, the binary tree algorithm produces a less than optimal interconnection strategy. The present invention improves on this interconnection strategy.
Previous multipliers have used a very basic structure, as disclosed in U.S. Pat. No. 4,748,583, which shows sum and carry signals from each adder connected to adders within the next stage. Alternatively, VLSI circuits have been adapted to increase multiplier speed by reducing the number of stages and connecting carry signals to full adders beyond the next stage. For instance, U.S. Pat. No. 4,752,905 discusses connecting each carry signal to an FA two stages beyond that carry signal's stage, while U.S. Pat. No. 4,556,948 discloses connecting each carry signal to FAs up to six stages thereafter.
However, most custom designs are concerned with making interconnections regular or symmetrical in order to simplify the chip's layout, which in turn affects the chip density and performance. This symmetry of layout affects the choice of interconnection schemes and also determines whether more than one adder type will be used within the multiplier; for instance, whether to use a carry look-ahead (CLA) adder in some of the stages instead of a less complex ripple adder.
The present invention is not limited to the less complex ripple adders, since it uses a standard cell design system. Nor is the present invention concerned with the effect of complex interconnection schemes on chip density, since the standard cell design system itself is the more limiting factor on density. Likewise, the use of more complex CLA adders within the multiplier is not a concern, since the performance improvement these adders provide was considered to outweigh the increase in area. Because of the constraints imposed by the standard cell design system, the only area open to improvement was the interconnection of the full adders which sum the partial products, in conjunction with the use of a CLA adder for the last stage. Using a CLA adder for the last stage of a multiplier would not, by itself, significantly improve performance, since a CLA adder achieves maximum performance only when its inputs arrive simultaneously. In the inventive multiplier, the interconnection scheme causes the inputs to the last stage to arrive as nearly simultaneously as possible, so that the CLA adder can be used to full advantage.
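For readers unfamiliar with carry look-ahead addition, the following behavioral sketch shows why such an adder benefits from simultaneous inputs: every carry is a flat function of the generate and propagate signals of all lower bits, so no carry can be finalized until all of those input bits are present. The function name `cla_add` and the flattened single-level organization are illustrative assumptions; the text does not describe the structure of the CLA adder actually used.

```python
def cla_add(a, b, width, carry_in=0):
    """Add two unsigned `width`-bit values using flattened carry
    look-ahead equations. Returns (sum, carry_out)."""
    # Per-bit generate (both inputs 1) and propagate (exactly one input 1).
    g = [((a >> i) & (b >> i)) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]

    def carry_into(i):
        # c_i = OR over k < i of (g_k AND p_{k+1} ... p_{i-1})
        #       OR (carry_in AND p_0 ... p_{i-1})
        carry = 0
        for k in range(i):
            term = g[k]
            for m in range(k + 1, i):
                term &= p[m]
            carry |= term
        term = carry_in
        for m in range(i):
            term &= p[m]
        return carry | term

    total = 0
    for i in range(width):
        total |= (p[i] ^ carry_into(i)) << i
    return total, carry_into(width)
```

In hardware each `carry_into(i)` expression is evaluated in parallel rather than in a loop, which is the source of the speed advantage over a ripple adder.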
An object of the present invention is to provide a standard cell array multiplier within which a unique interconnection scheme provides a high speed multiplier.
A second object of this invention is to provide an interconnection scheme which uses irregular or asymmetrical connection layouts within the multiplier to reduce the longest signal path through the multiplier.
Another object of this invention is to provide a method for determining the optimal interconnection scheme, whereby the connections are dependent upon each adder's sum delay and carry delay.
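The idea of letting the connections depend on each adder's sum delay and carry delay can be illustrated with a small timing model. The sketch below is not the method of the invention, which is not described in this section; it is a simple greedy heuristic, swapped in for illustration, that always feeds the three earliest-arriving bits of a column into a full adder. The function name `reduce_columns`, the assumption that all partial product bits are available at time zero, and the use of full adders only (no half adders) are assumptions of the sketch.

```python
import heapq


def reduce_columns(n, sum_delay, carry_delay):
    """Model the reduction of an n x n unsigned partial product array to
    two rows. Returns (latest arrival time per column, adders used).

    Illustrative greedy heuristic only; not claimed to be optimal and
    not the interconnection method of the invention.
    """
    width = 2 * n
    # columns[k] is a min-heap of arrival times for bits of weight 2**k.
    # One extra column absorbs any carry out of the top column.
    columns = [[] for _ in range(width + 1)]
    for i in range(n):
        for j in range(n):
            heapq.heappush(columns[i + j], 0.0)  # assumed ready at t = 0

    adders = 0
    # Work from the least significant column upward so that every carry
    # into a column is known before that column is reduced.
    for k in range(width):
        col = columns[k]
        while len(col) > 2:
            # Connect the three earliest-arriving bits to one full adder.
            inputs = [heapq.heappop(col) for _ in range(3)]
            ready = max(inputs)
            heapq.heappush(col, ready + sum_delay)               # sum stays in column k
            heapq.heappush(columns[k + 1], ready + carry_delay)  # carry moves up
            adders += 1

    final = [max(col) if col else 0.0 for col in columns[:width]]
    return final, adders
```

Running the model with different ratios of `sum_delay` to `carry_delay` shows how the arrival times at the last stage shift from column to column, which is the quantity an interconnection scheme of this kind tries to even out before the final adder.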