This invention relates to functional components for digital data processing systems and more particularly to an arithmetic device for performing high-speed multiplication.
The speed at which calculations may be performed in many computing devices is of prime consideration. This need is amplified when the computing systems are used for scientific purposes, typically requiring a large number of iterations to be performed during scientific modeling studies. In order to reduce the time required to perform calculations, it is customary to employ arithmetic devices, such as adders and multipliers, which function in the parallel-operating mode, i.e., operating upon all of the bits in a word at the same time.
As the multiplication performed by these arithmetic devices is more complicated than addition, the speed of the multipliers incorporated into a computing system to a very large extent bears upon the ultimate speed of the system.
It has been customary in many earlier computing devices to multiply by generating and then accumulating one partial product for each bit of the multiplier, thereby involving a shift operation for each bit of the multiplier. These shift operations were normally carried out serially and were quite time consuming, the total delay being the sum of the delays of the individual shift operations. Improvements to this multiplication scheme have involved skipping over strings of zeros or ones in the multiplier. However, such improvements have not yielded a multiplier having the speed desired for the subject invention.
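The bit-serial scheme just described can be illustrated in software. The following is only a minimal sketch, assuming unsigned operands and a hypothetical function name and `width` parameter that do not appear in the prior art discussed; it shows one shift and one conditional accumulation per multiplier bit, which is the source of the cumulative delay.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, width: int = 8) -> int:
    """Multiply two unsigned integers using one partial product per multiplier bit.

    Illustrative sketch only: 'width' (the multiplier word length) is an
    assumed parameter, not taken from any particular prior-art device.
    """
    if multiplicand < 0 or not (0 <= multiplier < (1 << width)):
        raise ValueError("operands must be unsigned and the multiplier must fit in 'width' bits")

    product = 0
    for bit in range(width):                  # one step per multiplier bit
        if (multiplier >> bit) & 1:           # examine the current multiplier bit
            product += multiplicand << bit    # shifted multiplicand is the partial product
    return product
```

Because the loop always executes `width` steps, the time taken grows with the word length, which is the limitation that the skipping of strings of zeros or ones was intended to relieve.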
Other multiplication schemes have also been incorporated into earlier computing devices. These schemes have included: binary multiplication utilizing squaring techniques, wherein the two operands were manipulated so that the multiplier logic need only perform two squaring functions followed by a subtraction; and pipeline binary multiplication, wherein a continuous stream of operands was fed into a particular arithmetic unit where multiplication was performed on each pair of operands on successive operational cycles so that a continuous stream of products resulted. Neither of these schemes, however, provided a multiplier with the processing speed required of the subject invention.
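The squaring technique can be illustrated with the well-known quarter-square identity, a·b = ((a + b)² − (a − b)²) / 4. The text above does not state which identity the earlier devices used, so the identity and the function name in the following sketch are assumptions made for illustration only.

```python
def square_multiply(a: int, b: int) -> int:
    """Multiply two integers using two squarings followed by a subtraction.

    Assumes the quarter-square identity; the earlier devices referred to in
    the text may have manipulated the operands differently.
    """
    sum_squared = (a + b) ** 2       # first squaring function
    diff_squared = (a - b) ** 2      # second squaring function
    # The difference equals 4*a*b, so the division by four is always exact.
    return (sum_squared - diff_squared) // 4
```

In hardware the final division by four amounts to discarding the two low-order bits, so the cost of the scheme lies in the two squaring functions.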
A. D. Booth published "A Signed Binary Multiplication Technique," Quart. J. Mech. Appl. Math., Vol. 4, Part 2, 1951, which has become known as "Booth's Algorithm" and which has enabled the construction of much faster multipliers than were previously available. Booth's Algorithm provides for a uniform shift method which examines two or more bits of the multiplier at the same time to determine the correct multiple of the multiplicand to be added to the partial product. This method requires no sign correction for a two's complement number, and the decoding of the multiplier may be begun from either direction. Amdahl et al., U.S. Pat. No. 3,840,727, teach a pipeline implementation of Booth's Algorithm.
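By way of illustration, one common form of the uniform shift method examines overlapping groups of three multiplier bits and shifts two places per step (often called radix-4 or modified Booth recoding). The sketch below assumes that form, an even word length, and hypothetical function names; it is not taken from Booth's paper or from the cited patent. Each group selects a multiple of the multiplicand from the set −2, −1, 0, +1, +2.

```python
def booth_radix4_digits(multiplier: int, width: int = 8) -> list:
    """Recode a two's complement multiplier into signed digits in {-2,-1,0,1,2}.

    Digit k has weight 4**k. 'width' must be even (an assumption of this sketch).
    """
    if width % 2 != 0:
        raise ValueError("width must be even")
    if not (-(1 << (width - 1)) <= multiplier < (1 << (width - 1))):
        raise ValueError("multiplier does not fit in 'width' bits")

    bits = multiplier & ((1 << width) - 1)    # two's complement bit pattern
    digits = []
    previous = 0                              # implied zero bit to the right of bit 0
    for i in range(0, width, 2):              # uniform shift of two bits per step
        low = (bits >> i) & 1
        high = (bits >> (i + 1)) & 1
        digits.append(previous + low - 2 * high)
        previous = high                       # groups overlap by one bit
    return digits


def booth_multiply(multiplicand: int, multiplier: int, width: int = 8) -> int:
    """Accumulate one multiple of the multiplicand per recoded digit."""
    product = 0
    for k, digit in enumerate(booth_radix4_digits(multiplier, width)):
        product += (digit * multiplicand) << (2 * k)
    return product
```

For example, the 8-bit multiplier 7 recodes to the digits −1, +2, 0, 0 (least significant first), that is, −1 + 2·4 = 7. A negative multiplier is handled by the same recoding, which reflects the statement that no sign correction is required, and only half as many partial products are produced as there are multiplier bits.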
Another technique which has been utilized to greatly enhance multiplier speed when a partial-product, carry look-ahead structure is employed is "column compression." The partial-product, carry look-ahead implementation for multipliers generates a matrix of partial products which must be reduced to provide the complete product. This technique is derived from an empirical manipulation taught for matrices in the linear algebra area of mathematics.
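The reduction of a partial-product matrix by columns can be sketched in software as follows. The sketch assumes unsigned operands and uses three-input, two-output counters (full adders) applied stage by stage until no column holds more than two bits, after which a single conventional addition completes the product. This is one well-known way of compressing columns and is offered only as an illustration; the function name, the `width` parameter, and the choice of counter are assumptions and do not describe any particular prior-art circuit.

```python
def column_compress_multiply(a: int, b: int, width: int = 8) -> int:
    """Multiply two unsigned 'width'-bit integers by compressing partial-product columns."""
    if not (0 <= a < (1 << width)) or not (0 <= b < (1 << width)):
        raise ValueError("operands must be unsigned and fit in 'width' bits")

    # Build the matrix of partial-product bits, grouped by column weight.
    columns = [[] for _ in range(2 * width)]
    for i in range(width):
        for j in range(width):
            columns[i + j].append(((a >> i) & 1) & ((b >> j) & 1))

    # Compress until every column holds at most two bits.
    while any(len(column) > 2 for column in columns):
        next_columns = [[] for _ in range(len(columns) + 1)]
        for weight, column in enumerate(columns):
            k = 0
            while len(column) - k >= 3:
                x, y, z = column[k:k + 3]
                next_columns[weight].append(x ^ y ^ z)                            # sum bit
                next_columns[weight + 1].append((x & y) | (x & z) | (y & z))      # carry bit
                k += 3
            next_columns[weight].extend(column[k:])   # leftover bits pass through unchanged
        columns = next_columns

    # Two rows remain; one carry-propagating addition yields the complete product.
    row0 = sum(column[0] << weight for weight, column in enumerate(columns) if len(column) > 0)
    row1 = sum(column[1] << weight for weight, column in enumerate(columns) if len(column) > 1)
    return row0 + row1
```

Each counter replaces three bits of one weight with a sum bit of the same weight and a carry bit of the next higher weight, so the arithmetic value of the matrix is preserved at every stage while the number of bits falls.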
S. Singh and R. Waxman have taught a version of column compression in IBM Technical Disclosure Bulletin, Vol. 14, No. 1, June 1971, "Partial Product Array For High-Speed Multiply Using Adders For Multiple Additions." However, the circuit taught by Singh et al. is serial in nature and takes four clock periods to achieve a complete product.
What is desired now, however, is an even faster multiplication circuit than is currently available from the prior art.
An objective of this invention is to provide a binary multiplication circuit for use in a digital data processing system which integrates a Booth's Algorithm scheme and a column compression scheme for performing multiplication.
A second objective of this invention is to provide such a circuit having true parallel operation of the column compression multiplication structure wherein every input to the column compressor takes the same propagation delay to get to the output.
Another objective of this invention is to provide such a circuit wherein the complete multiplication product is achieved in a one-clock period operation of the column compressor.
A further objective of this invention is to provide such a circuit implemented in monolithic, large-scale integrated, current-mode logic circuitry.