Many applications need a block that adds n inputs together. The output of this block is a binary representation of the number of high inputs. Such blocks, called parallel counters (L. Dadda, Some Schemes for Parallel Multipliers, Alta Freq 34: 349-356 (1965); E. E. Swartzlander Jr., Parallel Counters, IEEE Trans. Comput. C-22: 1021-1024 (1973)), are used in circuits performing binary multiplication. Parallel counters have other applications as well, for instance majority-voting decoders and RSA encoders and decoders. It is therefore important to have an implementation of a parallel counter that achieves maximal speed. The use of parallel counters in multiplication is known (L. Dadda, On Parallel Digital Multipliers, Alta Freq 45: 574-580 (1976)).
A full adder is a special parallel counter with a three-bit input and a two-bit output. Current implementations of higher parallel counters, i.e. those with a larger number of inputs, are based on full adders (C. C. Foster and F. D. Stockton, Counting Responders in an Associative Memory, IEEE Trans. Comput. C-20: 1580-1583 (1971)). In general, the least significant bit of the output is the fastest bit to produce in such an implementation, while the other bits are usually slower.
The following notation is used for logical operations: ⊕ - Exclusive OR; v - OR; ^ - AND; ¬ - NOT.
An efficient prior art design (Foster and Stockton) of a parallel counter uses full adders. A full adder, denoted FA, is a three-bit input parallel counter shown in FIG. 1. It has three inputs X1, X2, X3, and two outputs S and C. Logical expressions for the outputs are
S = X1 ⊕ X2 ⊕ X3,
C = (X1 ^ X2) v (X1 ^ X3) v (X2 ^ X3).
A half adder, denoted HA, is a two-bit input parallel counter shown in FIG. 1. It has two inputs X1, X2 and two outputs S and C. Logical expressions for the outputs are
S = X1 ⊕ X2,
C = X1 ^ X2.
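The full adder and half adder expressions above can be checked with a short Python sketch. The function names `full_adder` and `half_adder` are illustrative only. Note that in Python `^` is XOR, `&` is AND and `|` is OR, which differs from the notation used in this text, where ^ denotes AND.

```python
from itertools import product

def half_adder(x1, x2):
    """Return (S, C) for two input bits."""
    s = x1 ^ x2   # S = X1 XOR X2 (Python's ^ is XOR, not AND)
    c = x1 & x2   # C = X1 AND X2
    return s, c

def full_adder(x1, x2, x3):
    """Return (S, C) for three input bits."""
    s = x1 ^ x2 ^ x3                       # S = X1 XOR X2 XOR X3
    c = (x1 & x2) | (x1 & x3) | (x2 & x3)  # C = majority of the three inputs
    return s, c

# Exhaustive check: the two-bit output (C, S) is the binary count of high inputs.
for bits in product((0, 1), repeat=3):
    s, c = full_adder(*bits)
    assert 2 * c + s == sum(bits)
for bits in product((0, 1), repeat=2):
    s, c = half_adder(*bits)
    assert 2 * c + s == sum(bits)
```

The exhaustive loops confirm that both blocks behave as parallel counters: the pair (C, S), read as a two-bit number, equals the number of high inputs.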
A prior art implementation of a seven-bit input parallel counter is illustrated in FIG. 2.
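As a rough illustration of how a seven-bit input parallel counter can be built from full adders, the sketch below wires four full adders in a commonly used arrangement. FIG. 2 is not reproduced here, so the exact wiring is an assumption and may differ from the figure; the name `counter7` is hypothetical.

```python
from itertools import product

def full_adder(x1, x2, x3):
    """Return (S, C) for three input bits (Python's ^ is XOR, & is AND, | is OR)."""
    return x1 ^ x2 ^ x3, (x1 & x2) | (x1 & x3) | (x2 & x3)

def counter7(x):
    """Count the high bits among seven inputs.

    Returns (S2, S1, S0) with 4*S2 + 2*S1 + S0 == sum(x).
    Assumed wiring: four full adders, not necessarily that of FIG. 2.
    """
    s_a, c_a = full_adder(x[0], x[1], x[2])  # first group of three inputs
    s_b, c_b = full_adder(x[3], x[4], x[5])  # second group of three inputs
    s0, c_c = full_adder(s_a, s_b, x[6])     # weight-1 bits give S0
    s1, s2 = full_adder(c_a, c_b, c_c)       # three weight-2 carries give S1 and S2
    return s2, s1, s0

# Exhaustive check over all 128 input combinations.
for bits in product((0, 1), repeat=7):
    s2, s1, s0 = counter7(bits)
    assert 4 * s2 + 2 * s1 + s0 == sum(bits)
```

In this arrangement S0 passes through two full adders while S1 and S2 pass through three, which matches the observation that the least significant output bit is the fastest to produce.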
Multiplication is a fundamental operation. Given two n-digit binary numbers
$A_{n-1}2^{n-1} + A_{n-2}2^{n-2} + \dots + A_1 2 + A_0$ and $B_{n-1}2^{n-1} + B_{n-2}2^{n-2} + \dots + B_1 2 + B_0$,
their product
$P_{2n-1}2^{2n-1} + P_{2n-2}2^{2n-2} + \dots + P_1 2 + P_0$
may have up to 2n digits. Wallace invented the first fast architecture for a multiplier, now called the Wallace-tree multiplier (Wallace, C. S., A Suggestion for a Fast Multiplier, IEEE Trans. Electron. Comput. EC-13: 14-17 (1964)). Dadda investigated bit behaviour in a multiplier (L. Dadda, Some Schemes for Parallel Multipliers, Alta Freq 34: 349-356 (1965)). He constructed a variety of multipliers, and most multipliers follow Dadda's scheme.
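The bound of 2n digits on the product can be made explicit with a short derivation. This is a sketch in the notation of the two n-digit operands above; the symbols A and B for the operand values are introduced here for illustration.

```latex
% Expanding the product of the two n-digit numbers term by term:
A \cdot B = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} A_i B_j \, 2^{i+j}

% The largest n-digit operands are A = B = 2^n - 1, so
A \cdot B \le (2^n - 1)^2 = 2^{2n} - 2^{n+1} + 1 < 2^{2n}

% Hence the product always fits in 2n binary digits P_{2n-1}, \dots, P_0.
```

Each term A_i B_j is a single bit of weight 2^{i+j}, which is why the partial products can be arranged in columns indexed by i + j.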
Dadda's multiplier uses the scheme shown in FIG. 3. If the inputs have 8 bits, then 64 parallel AND gates generate the array shown in FIG. 4. The AND gate sign ^ is omitted for clarity, so that Ai^Bj becomes AiBj. The rest of FIG. 4 illustrates array reduction, which involves full adders (FA) and half adders (HA). Bits from the same column are added by half adders or full adders. Some groups of bits fed into a full adder are enclosed in rectangles, and some groups of bits fed into a half adder are enclosed in ovals. The result of array reduction is just two binary numbers, which are added in the last step by one of the fast addition schemes, for instance a conditional adder or a carry-look-ahead adder.
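The three steps described above (array generation, array reduction, final addition) can be sketched in Python as follows. This is a simplified illustration, not the grouping shown in FIG. 4: it greedily applies full adders to every group of three bits in a column and a half adder to a leftover pair, and it uses Python's `+` as a stand-in for the final fast adder. All function names are hypothetical.

```python
def full_adder(x1, x2, x3):
    """Return (S, C) for three bits (Python's ^ is XOR, & is AND, | is OR)."""
    return x1 ^ x2 ^ x3, (x1 & x2) | (x1 & x3) | (x2 & x3)

def half_adder(x1, x2):
    """Return (S, C) for two bits."""
    return x1 ^ x2, x1 & x2

def partial_product_array(a, b, n):
    """Array generation: n*n AND gates; bit A_i B_j goes to column i + j."""
    cols = [[] for _ in range(2 * n - 1)]
    for i in range(n):
        for j in range(n):
            cols[i + j].append(((a >> i) & 1) & ((b >> j) & 1))
    return cols

def reduce_stage(cols):
    """One layer of adders: sums stay in column k, carries move to column k + 1."""
    nxt = [[] for _ in range(len(cols) + 1)]
    for k, col in enumerate(cols):
        i = 0
        while len(col) - i >= 3:          # full adder on each group of three bits
            s, c = full_adder(col[i], col[i + 1], col[i + 2])
            nxt[k].append(s)
            nxt[k + 1].append(c)
            i += 3
        if len(col) - i == 2:             # half adder on a leftover pair
            s, c = half_adder(col[i], col[i + 1])
            nxt[k].append(s)
            nxt[k + 1].append(c)
        elif len(col) - i == 1:           # a single leftover bit passes through
            nxt[k].append(col[i])
    return nxt

def multiply(a, b, n=8):
    """Multiply two n-bit unsigned integers via array reduction."""
    cols = partial_product_array(a, b, n)
    while max(len(col) for col in cols) > 2:   # reduce until two rows remain
        cols = reduce_stage(cols)
    row0 = sum(col[0] << k for k, col in enumerate(cols) if len(col) >= 1)
    row1 = sum(col[1] << k for k, col in enumerate(cols) if len(col) == 2)
    return row0 + row1                         # stand-in for the final fast adder

assert multiply(0b10110101, 0b01101110) == 0b10110101 * 0b01101110
```

For 8-bit inputs `partial_product_array` produces the 64 bits mentioned above, spread over 15 columns, and each call to `reduce_stage` lowers the maximum column height until only two rows remain.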
UK patent application Numbers 0019287.2 and 0101961.1, U.S. patent application Ser. No. 09/637,532, and the US patent application entitled “A parallel counter and a multiplication logic circuit” filed on Jan. 25, 2001, the contents of all of which are hereby incorporated by reference, disclose a technique for the modification or deformation of the array prior to array reduction. The array deformation has the benefit of reducing the depth of the array to a number greater than $2^{n-1}-1$ and less than or equal to $2^n-1$, where n is an integer. This reduction of the maximum depth of the array enables the efficient use of parallel counters in the array reduction step.
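To illustrate why the depth bound matters for parallel counters, the sketch below computes, for a given maximum column depth d, the integer n satisfying 2^(n-1) - 1 < d <= 2^n - 1. This reading of the exponents in the bound is an assumption, and the function name `counter_output_bits` is hypothetical. Under that reading, n is the number of output bits a parallel counter needs to count d inputs.

```python
def counter_output_bits(depth):
    """Return n such that 2**(n-1) - 1 < depth <= 2**n - 1.

    Under the assumed reading of the bound, n is the number of output
    bits of a parallel counter that takes `depth` input bits.
    """
    if depth < 1:
        raise ValueError("depth must be a positive integer")
    return depth.bit_length()

for d in (3, 4, 7, 8, 15, 16):
    n = counter_output_bits(d)
    assert 2 ** (n - 1) - 1 < d <= 2 ** n - 1
    print(d, n)
```

A column of depth 7 fits a counter with three output bits, whereas a column of depth 8 already needs four, so keeping the maximum depth at or below 2^n - 1 uses an n-output counter to its full capacity.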