A well-known technique for improving computation speed is to construct a parallel adder/subtractor with a fast carry/borrow speed-up system.
The state of the art presents various kinds of carry systems, such as the carry look-ahead (CLA) system, see "High-Speed Arithmetic in Binary Computers" by O. L. MacSorley, Proceedings of the IRE, Jan. 1961, pp. 67; the recurrence solver (RS) system, see "A Comparison of ALU Structures for VLSI Technology" by S. Ong and D. E. Atkins, IEEE 1983; and the skip carry (SC) technique, see "On Implementing Addition in VLSI Technology" (Variable Block Adder) by Vojin G. Oklobdzija and Earl R. Barnes, Journal of Parallel and Distributed Computing 5, 716-728 (1988), and "An 8.5-ns 112-b Transmission Gate Adder with a Conflict-Free Bypass Circuit" by T. Sato, M. Sakate et al., IEEE Journal of Solid-State Circuits, Vol. 27, No. 4 (1992). These and other techniques are used for binary bits or BCD digits, 1's or 2's complement, fixed or floating point, and sign and magnitude representation, with or without a carry-in, etc.
It would be too laborious to set out, for those familiar with the prior art, the differences and disadvantages of all the various types of carry systems. Relevant prior art patents on carry systems and adders are as follows:
3,805,045 to Larsen
3,990,723 to Betany et al.
3,993,891 to Beck et al.
4,118,786 to Levine et al.
4,319,335 to Rubinfeld
4,323,981 to Nakamura
4,504,924 to Cook et al.
4,584,661 to Grundland
4,607,176 to Burrows et al.
4,660,165 to Masumoto
4,858,168 to Hwang
4,870,681 to Sedlak
4,882,698 to Young
4,905,180 to Kumar
4,918,640 to Heimsch et al.
Some typical prior art carry system disadvantages are:
the CLA system, as an example, requires two AND-NOR gate delays for each CLA level, which introduces two units of delay per level, where the unit is taken to be the propagation delay through an AND-NOR gate;
the RS system, as an example, doubles the fan-out of the most loaded output signals in the critical signal path (the worst-case path) for every additional carry level;
the SC technique, as an example, requires an impractical, irregular array construction having a very high gate fan-in as the adder is expanded;
the layout design regularity is only partly uniform.
Due to the technical limits of practical gate construction (fan-in and fan-out), the structure of a carry system is partitioned into groups of bits in parallel at level one of the carry system, then into groups of groups in parallel at level two of the carry system, etc. The conventional CLA system is based on a partitioning of four (bits, groups), and the RS system is based on binary tree expansion, that is, a partitioning of two.
The prior art of adder design can be summarized as follows:
1) An adder is, in general, composed of three parts: the input part, the carry part and the sum part.
2) The carry part is based on a partitioning number R, where R=2 is a binary partitioning, R=3 is a ternary partitioning, etc.
3) The construction of a binary (partitioned) carry system part requires the implementation of four different signal functions.
4) The CLA system is based on R=4, but an example based on R=5 exists, and there are other implementations. The recurrence solver technique is basically based on R=2. Carry systems based on R=3 also exist.
5) The embodiment of a fast carry system is based on the implementation of a pair of signals, the Generate (G) and the Propagate (P) signals.
6) For R=2, two different signal expressions are normally required for each polarity, which requires four different gate constructions as shown in prior art FIGS. 1a and 1b. These are respectively the cells for True and False polarity, each including two different gates and having, as an example, the following signal expression pairs for two bit positions (i+1 and i): for True input signals, $G(i+1, i) = G(i+1) + P(i+1) \cdot G(i)$ and $P(i+1, i) = P(i+1) \cdot P(i)$, and for False input signals, $G(i+1, i) = G(i+1) \cdot P(i+1) + G(i+1) \cdot G(i)$ and $P(i+1, i) = P(i+1) + P(i)$.
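The R=2 signal expression pairs given above for True and False polarity can be checked against each other with a short truth-table sketch. It assumes that each False-polarity signal is simply the logical complement of the corresponding True-polarity signal; the function names combine_true, combine_false and cells_are_dual are illustrative only and do not come from the cited prior art.

```python
from itertools import product

def combine_true(g_hi, p_hi, g_lo, p_lo):
    """True-polarity cell for bit positions i+1 (hi) and i (lo)."""
    g = g_hi | (p_hi & g_lo)   # G(i+1,i) = G(i+1) + P(i+1).G(i)
    p = p_hi & p_lo            # P(i+1,i) = P(i+1).P(i)
    return g, p

def combine_false(ng_hi, np_hi, ng_lo, np_lo):
    """False-polarity cell. Assumption: every argument and every result
    is the complement of the corresponding True-polarity signal."""
    ng = (ng_hi & np_hi) | (ng_hi & ng_lo)   # G(i+1).P(i+1) + G(i+1).G(i)
    np_ = np_hi | np_lo                      # P(i+1) + P(i)
    return ng, np_

def cells_are_dual():
    """Compare the two cells over all 16 combinations of input values."""
    for g_hi, p_hi, g_lo, p_lo in product((0, 1), repeat=4):
        g, p = combine_true(g_hi, p_hi, g_lo, p_lo)
        ng, np_ = combine_false(1 - g_hi, 1 - p_hi, 1 - g_lo, 1 - p_lo)
        if (ng, np_) != (1 - g, 1 - p):
            return False
    return True
```

Under that assumption, cells_are_dual() returns True, which is De Morgan's theorem applied to the True-polarity pair. The two cells therefore compute the same function in opposite polarities, yet each needs its own gate constructions, which is the four-gate requirement noted in item 6.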
The general concept of fast carry systems using generate and propagate function pairs will now be described with reference to prior art FIGS. 1a and 1b.
At each sequential carry level of a carry system, pairs of carry Generate (G) and carry Propagate (P) functions are produced, which are further combined with Carry-in (Cin) input signals to produce further Carry-out (Cout) output signals:
$$\mathrm{Cout}(k+1) = G_{\beta} + P_{\beta} \cdot \mathrm{Cin}(i+1) \qquad \text{(Eq. 1a)}$$
or,
$$\mathrm{Cout}(k+1) = G_{\beta} \cdot P_{\beta} + G_{\beta} \cdot \mathrm{Cin}(i+1) \qquad \text{(Eq. 1b)}$$
where $\beta$ represents a bunch of contiguous bits numbered from k to i+1, $k \geq i+1$. For the CLA system, $\beta$ is a function of grouped groups; for the RS system, $\beta$ is a function of the numbered carry level $L \geq 1$.
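As an illustration of Eqs. 1a and 1b, the following sketch derives the Generate and Propagate of a bunch from two operands and compares the resulting carry-out with the carry of ordinary integer addition. It assumes the per-bit definitions G = A AND B and P = A OR B, and it treats the signals of Eq. 1b as the complements of those of Eq. 1a; neither assumption is stated in the text above. The names bit_gp, bunch_gp, cout_true, cout_false and carry_equations_hold are hypothetical. In the code, lo corresponds to bit i+1 and hi to bit k.

```python
def bit_gp(a, b, width):
    """Per-bit generate and propagate lists, index 0 = least significant bit.
    Assumed definitions: g = a AND b, p = a OR b."""
    g = [(a >> n) & (b >> n) & 1 for n in range(width)]
    p = [((a >> n) | (b >> n)) & 1 for n in range(width)]
    return g, p

def bunch_gp(g, p, lo, hi):
    """Generate and propagate of the bunch of bits lo..hi (inclusive),
    accumulated one bit at a time from the low end."""
    G, P = g[lo], p[lo]
    for n in range(lo + 1, hi + 1):
        G = g[n] | (p[n] & G)
        P = p[n] & P
    return G, P

def cout_true(G, P, cin):
    """Eq. 1a, True polarity."""
    return G | (P & cin)

def cout_false(nG, nP, ncin):
    """Eq. 1b, False polarity (all signals assumed complemented)."""
    return (nG & nP) | (nG & ncin)

def carry_equations_hold(width=4):
    """Check both equations against integer addition for every operand pair."""
    for a in range(1 << width):
        for b in range(1 << width):
            g, p = bit_gp(a, b, width)
            G, P = bunch_gp(g, p, 0, width - 1)
            for cin in (0, 1):
                expected = (a + b + cin) >> width   # true carry out, 0 or 1
                if cout_true(G, P, cin) != expected:
                    return False
                if cout_false(1 - G, 1 - P, 1 - cin) != 1 - expected:
                    return False
    return True
```

For a 4-bit bunch, carry_equations_hold(4) returns True, so under the stated assumptions Eq. 1b yields the complement of the carry given by Eq. 1a.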
Normally, two different logical gate combinations are required for implementing, in a binary configuration (binary tree expansion), the signals of Eqs. 1a and 1b; they are also required for implementing the typical True G and False G carry-generate functions, which are as follows:
$$G_{\beta} = G_{\beta_2} + P_{\beta_2} \cdot G_{\beta_1} \qquad \text{(Eq. 2a)}$$
$$G_{\beta} = G_{\beta_2} \cdot P_{\beta_2} + G_{\beta_2} \cdot G_{\beta_1} \qquad \text{(Eq. 2b)}$$
where $\beta_2$ and $\beta_1$ represent two respective bunches of bits, $[k \geq (j+1)]$ and $[j \geq (i+1)]$.
The typical True P and False P carry-propagate functions further require two different logical gates, as follows:
$$P_{\beta} = P_{\beta_2} \cdot P_{\beta_1} = \pi P_{\beta} = \pi P_{\beta_2} \cdot \pi P_{\beta_1} \qquad \text{(Eq. 3a)}$$
$$P_{\beta} = P_{\beta_2} + P_{\beta_1} = \Sigma P_{\beta} = \Sigma P_{\beta_2} + \Sigma P_{\beta_1} \qquad \text{(Eq. 3b)}$$
where $\pi$ represents the logical AND of terms and $\Sigma$ represents the logical OR of terms.
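The binary tree expansion implied by the True-polarity functions (Eqs. 2a and 3a) can be sketched as a pairwise reduction in which each level merges adjacent bunches. This is only an illustration under assumed per-bit definitions (G = A AND B, P = A OR B); the helper names bit_pairs, combine, tree_gp and tree_matches_addition are hypothetical and are not taken from the cited references.

```python
def bit_pairs(a, b, width):
    """Per-bit (G, P) pairs, index 0 = least significant bit.
    Assumed definitions: G = a AND b, P = a OR b."""
    return [((a >> n) & (b >> n) & 1, ((a >> n) | (b >> n)) & 1)
            for n in range(width)]

def combine(upper, lower):
    """Merge the upper bunch (beta2) with the adjacent lower bunch (beta1)."""
    g2, p2 = upper
    g1, p1 = lower
    return g2 | (p2 & g1), p2 & p1   # Eq. 2a, Eq. 3a

def tree_gp(pairs):
    """Reduce a non-empty list of (G, P) pairs two at a time (R=2).
    Returns the (G, P) of the whole bunch and the number of carry levels."""
    levels = 0
    while len(pairs) > 1:
        merged = [combine(pairs[n + 1], pairs[n])
                  for n in range(0, len(pairs) - 1, 2)]
        if len(pairs) % 2:
            merged.append(pairs[-1])   # unpaired top bunch passes through
        pairs = merged
        levels += 1
    return pairs[0], levels

def tree_matches_addition(width=6):
    """Check the tree result against integer addition for all operand pairs."""
    all_ones = (1 << width) - 1
    for a in range(1 << width):
        for b in range(1 << width):
            (G, P), _ = tree_gp(bit_pairs(a, b, width))
            if G != (a + b) >> width:          # G is the carry out for Cin = 0
                return False
            if P != int((a | b) == all_ones):  # P is the AND of all bit P terms
                return False
    return True
```

For an 8-bit bunch the reduction takes three levels (8, 4, 2, 1 bunches). The final check on P reflects the $\pi$ form of Eq. 3a, in which the bunch propagate is the logical AND of all its per-bit propagate terms.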
Four different logical INVERTING gates (and inverters) are required for the practical design of a carry system having a partitioning of 2, and eight different gates are required for a conventional CLA having a partitioning of 4.
Inversion is an inherent component of logic design; existing practical design techniques rely on it, and it requires complicated logic combinations.
Texas Instruments, in its SN74S181 component, offers essentially the same two gate circuits for generating G or Y (pin 17) and P or X (pin 15) for True and False input signals.
Other disadvantages concern design with ECL complementary-output logic technology, where output signals are not effectively used, and design with CMOS technology, where gates are not effectively designed and used.
Generally speaking, the various prior art approaches in designing adders with any technology show poor performance, more particularly with regard to layout regularity (uniformity), level counts and organization, fan-in, fan-out and speed.