The present invention generally relates to a logic circuit having carry select adders, and more particularly to a logic circuit including two-stage carry select adders capable of processing multiple bits such as 32, 64 and 80 bits in parallel at the same time.
In conventional multiple bit parallel full adders, each of the two integers to be processed is divided into a plurality of units each consisting of a predetermined number of digits. Addition is then performed between the corresponding units, and the results obtained for each unit are combined. This procedure is intended to increase the operation speed.
A carry select adder is known as a multiple bit parallel full adder. FIG. 1 shows a conventional 16-bit carry select adder (see Japanese Laid-Open Patent Application No. 61-221822 or 61-226836). Referring to this figure, the illustrated 16-bit carry select adder includes four 4-bit-length partitioned adders CSA, four sets of two-input multiplexers MPX, and four carry selectors CS. A and B are binary numbers each represented with M digits, where the binary numbers A and B are an augend and an addend, respectively, in obtaining the arithmetic sum. Each of the binary numbers A and B has an amount of information corresponding to 16 bits when M=16. Each of the 16-bit binary numbers A and B is divided into four portions (hereafter, each portion is referred to as a partitioned bit set), A3-A0, B3-B0 through A15-A12, B15-B12, which are supplied to the corresponding 4-bit partitioned adders CSA. Each bit contained in each of the partitioned bit sets assumes `0` or `1`.
C.sub.-1 is a real carry signal supplied from the digit immediately below the lowest-order digit out of 16 digits. C3.sup.0, C7.sup.0, C11.sup.0 and C15.sup.0 are carry signals which are propagated to higher-order digits when the real carry signal C.sub.-1 is `0`. C3.sup.1, C7.sup.1, C11.sup.1 and C15.sup.1 are carry signals which are propagated to higher-order digits when the real carry signal C.sub.-1 is `1`. S0-S15 form a sum output signal S (=A+B).
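The carry select principle described above can be sketched in software as follows. This is only an illustrative behavioral model, not the circuit of FIG. 1: the function names `ripple_add` and `carry_select_add` are hypothetical, and each 4-bit partitioned adder is modeled as a simple ripple adder evaluated once for an assumed carry of `0` and once for an assumed carry of `1`.

```python
def ripple_add(a_bits, b_bits, carry_in):
    """Add two little-endian bit lists; return (sum_bits, carry_out)."""
    sums, carry = [], carry_in
    for a, b in zip(a_bits, b_bits):
        sums.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sums, carry


def carry_select_add(a, b, carry_in=0, width=16, block=4):
    """Add two `width`-bit integers using `block`-bit carry select sections."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    result, carry = 0, carry_in
    for lo in range(0, width, block):
        a_blk, b_blk = a_bits[lo:lo + block], b_bits[lo:lo + block]
        # Both candidates are computed without waiting for the real carry.
        sum0, c0 = ripple_add(a_blk, b_blk, 0)
        sum1, c1 = ripple_add(a_blk, b_blk, 1)
        # Multiplexer / carry selector: choose by the real incoming carry.
        sums, carry = (sum1, c1) if carry else (sum0, c0)
        for i, s in enumerate(sums):
            result |= s << (lo + i)
    return result, carry
```

In hardware the two candidate results of every section are formed in parallel, so only the selection step depends on the real carry from the lower-order section; the sequential loop above models the function, not the timing.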
FIGS. 2A through 2C are views of a conventional Manchester type carry adder. FIG. 2A shows a 32-bit full adder, which includes positive/negative logic blocks A and B each including a full adder amounting to 4 bits. FIG. 2B shows a carry bypass circuit used in the logic block of FIG. 2A. The illustrated carry bypass circuit includes inverters and transfer gates TG, and outputs a carry signal Cj by bypassing, for every four bits, a real carry signal C.sub.IN supplied from the lower-order digit. Transfer gates may be constituted by complementary metal oxide semiconductor (CMOS) transistors each having a gate length equal to or less than 1.5 μm.
FIG. 2C illustrates a positive logic full adder out of 4-bit full adders used in the logic block of FIG. 2A. When a combination of input data Ai and Bi (Ai, Bi) is (0, 1) or (1, 0), the full adder is kept in a waiting state where it waits for the supply of the real carry signal C.sub.i-1 propagated from the lower-order bit. In the case where all the four combinations of Ai and Bi are (0, 1) or (1, 0) in the 4-bit full adder, by bypassing the real carry signal C.sub.IN through the bypass circuit of FIG. 2B, it becomes possible to shorten a critical path where the real carry signal (Cj=C.sub.IN) is propagated through all the transfer gates amounting to 4 bits. In addition, bypass circuits BP1 and BP2 (FIG. 2A) are provided for two 12-bit portions, each of which consists of three blocks each having 4 bits as a unit. Thereby, it is possible to propagate the carry signal Cj at high speeds.
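The bypass condition described above can be illustrated with a small behavioral sketch. The function name `block_carry_out` and its return convention are hypothetical and introduced only for illustration; the sketch assumes that a block bypasses its incoming carry exactly when every bit pair (Ai, Bi) in the block is (0, 1) or (1, 0).

```python
def block_carry_out(a_bits, b_bits, carry_in):
    """Return (carry_out, bypassed) for one block of little-endian bits."""
    propagate = [a ^ b for a, b in zip(a_bits, b_bits)]
    if all(propagate):
        # Every position waits on the lower-order carry, so the incoming
        # carry can be forwarded directly: Cj = C_IN.
        return carry_in, True
    carry = carry_in
    for a, b in zip(a_bits, b_bits):
        # Otherwise the carry is resolved bit by bit through the chain.
        carry = (a & b) | (carry & (a ^ b))
    return carry, False
```

For example, `block_carry_out([1, 0, 1, 0], [0, 1, 0, 1], 1)` takes the bypass path, whereas a block containing a (1, 1) or (0, 0) pair resolves its carry inside the chain.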
FIGS. 3A through 3C illustrate a conventional 32-bit two-stage carry look ahead adder, which is also known as one of the multiple bit parallel full adders. Referring to FIG. 3A, a block labeled ULB is a carry propagate/generate unit, and a block labeled BCLA is a 4-bit-length block carry look ahead unit. The illustrated adder also includes 8-bit-length carry look ahead (CLA) units 2, and a 32-bit-length sum unit 3. Each of the binary numbers A and B is represented with 32 digits such that A=A0-A31 and B=B0-B31. Pi (=P0-P31) is a carry propagate signal, and Gi (=G0-G31) is a carry generate signal. The carry propagate signal Pi is defined as Pi=Ai⊕Bi, and the carry generate signal Gi is defined as Gi=Ai·Bi. Si (=S0-S31) is a digit of the sum output signal (S=A+B).
FIG. 3B shows the structure of the 4-bit-length block carry look ahead unit BCLA with respect to a unit consisting of the zeroth bit to the third bit. As shown, the block carry look ahead unit BCLA includes logic gates such as AND gates and OR gates, and receives carry propagate signals P0-P3 for the zeroth bit (digit) to the third bit (digit), carry generate signals G0-G3 for the zeroth bit to the third bit, and the real carry signal C.sub.-1 propagated from the digit which is one digit lower than the lowest-order digit of the illustrated block. The block carry look ahead unit BCLA then generates, from these input signals, a block look ahead carry propagate signal P0*, a block look ahead carry generate signal G0*, and real carry signals C0, C1, and C2.
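The signals handled by such a block carry look ahead unit can be sketched as follows. The gate-level arrangement of FIG. 3B is not reproduced here; the sketch assumes the standard carry look ahead equations, with Pi defined as the exclusive-OR of Ai and Bi and Gi as their logical product. The function names `propagate_generate` and `block_lookahead` are hypothetical.

```python
def propagate_generate(a, b, width=4):
    """Return per-bit propagate (Ai XOR Bi) and generate (Ai AND Bi) lists."""
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(width)]
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(width)]
    return p, g


def block_lookahead(p, g, carry_in):
    """4-bit block look ahead: return (P*, G*, (C0, C1, C2))."""
    c0 = g[0] | (p[0] & carry_in)
    c1 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & carry_in)
    c2 = (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
          | (p[2] & p[1] & p[0] & carry_in))
    # Block signals summarize the whole 4-bit unit for the next stage.
    p_star = p[0] & p[1] & p[2] & p[3]
    g_star = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
              | (p[3] & p[2] & p[1] & g[0]))
    return p_star, g_star, (c0, c1, c2)
```

Under these assumptions the carry out of the block equals G* OR (P* AND carry-in), which is the quantity a higher-level look ahead unit would form from the block signals.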
FIG. 3C shows the structure of the 8-bit-length CLA unit. The illustrated CLA unit receives block look ahead carry propagate signals P0* to P7* relating to the zeroth to seventh bits, the block look ahead carry generate signals G0*-G7* relating to the zeroth to seventh bits, and the real carry signal C.sub.-1 propagated from the digit which is one digit lower than the lowest-order digit of the illustrated CLA unit. The CLA unit then generates, from these input signals, real carry signals C3, C7, C11, . . . , C27 and C31 for every four digits. The CLA unit is made up of 2 to 9-input AND gates (or NAND gates), and 2 to 9-input OR gates (or NOR gates). In an actual circuit configuration of the CLA unit, a logic gate having five or more inputs, such as a 9-input AND gate, is configured with a combination of gates having smaller numbers of inputs. For example, a 9-input AND gate is constructed with four 3-input AND gates (or three 3-input NAND gates and one 3-input NOR gate).
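The two decompositions of a 9-input AND gate mentioned above can be checked with a short sketch. The helper names are hypothetical, and the grouping of inputs into three sets of three is an assumption made for illustration.

```python
def and3(a, b, c):
    return a & b & c


def nand3(a, b, c):
    return 1 - (a & b & c)


def nor3(a, b, c):
    return 1 - (a | b | c)


def and9_with_ands(bits):
    """Three first-level 3-input ANDs feed a fourth 3-input AND."""
    first = [and3(*bits[i:i + 3]) for i in (0, 3, 6)]
    return and3(*first)


def and9_with_nand_nor(bits):
    """Three 3-input NANDs feed one 3-input NOR (De Morgan equivalent)."""
    first = [nand3(*bits[i:i + 3]) for i in (0, 3, 6)]
    return nor3(*first)
```

Either form needs two gate levels instead of one, which is one reason the gate count and propagation delay grow as the fan-in increases.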
The aforementioned carry select adder needs an extremely large number of structural elements as the number of digits to be processed at a time increases, because the number of elements in the carry selectors CS grows more than linearly with the number of digits processed. Furthermore, the adder needs more than twice as many elements as a ripple carry adder, in order to generate the two sets of signals Si(1), Si(0) and the real sum signal Si. Additionally, the conventional carry select adder causes a considerably large delay in processing time.
The aforementioned Manchester type carry adder may be constructed with a small number of structural elements, as compared with the carry select adder. However, it has the following disadvantages. As described previously, this type includes four or more stages of directly cascaded transfer gates, each of which is constituted by CMOS transistors. In this case, the signal waveform is degraded by the series resistance of the CMOS transistors as well as by the junction capacitance between the source and drain thereof. For these reasons, the processing speed is not very high. In addition, a large amount of power is consumed despite the reduced number of structural elements.
The aforementioned 32-bit two-stage carry look ahead adder needs a large number of logic gates, which leads to an increase in the time taken to propagate the carry signal. Additionally, the processing speed decreases as the fan-in number increases. Further, it takes an extremely long time to obtain the operation result for the following reason. The carry signals C0-C31 relating to all the digits are propagated through BCLA → CLA → BCLA after the carry generate signals Gi and the carry propagate signals Pi are applied to the circuit. Only thereafter is the signal processing by the 32-bit-length sum unit 3 carried out, and the sum output signal Si (=S0-S31) obtained.
An improvement on carry signal processing has been proposed in Japanese Laid-Open Patent Application No. 57-147754. The proposed improvement, illustrated in FIG. 4, is a 44-bit adder. The feature of the improvement is that the number of digits to be processed in the partitioned adders CSAi (i=1 to 8) increases towards higher-order digits, or in other words, increases with an increase of `i`. For example, the partitioned adder CSA1 consists of a single adder AD to which digits A0 and B0 are supplied, and the partitioned adder CSA8 consists of 8 adders to which corresponding digits A36, B36 through A43, B43 are supplied.
The carry signals C0.sup.1 and C0.sup.0 are output from the partitioned adder CSA1 with a delay time of 1D after the carry propagate and generate signals Pi and Gi are generated, where `D` is a unit delay time taken for a signal to pass through a transfer gate. Likewise, the carry signals are output from the partitioned adder CSAi with a delay time of iD. The real carry signal C0 is determined with a total delay time of 2D. That is, the carry signals C0.sup.1 and C0.sup.0 are calculated beforehand for the case where the real carry signal C.sub.IN, which is supplied from the digit one digit lower than the lowest-order digit, is `1`, and for the case where it is `0`. It takes a delay time of 1D to carry out this calculation. Then, a multiplexer MPX5 relating to the zeroth digit selects one of the carry signals C0.sup.1 and C0.sup.0 on the basis of the value (`1` or `0`) of the real carry signal C.sub.IN. A delay time of 1D is needed for this selection. As a result, the total delay time needed to obtain the real carry signal C0 relating to the zeroth digit is 2D. The real carry signal C0 thus obtained is supplied to a multiplexer MPX5 associated with the partitioned adder CSA2. The carry signals C2.sup.1 and C2.sup.0 are output from the partitioned adder CSA2 with a total delay time of 2D. By this time, the real carry signal C0 has already been determined as described previously. Therefore, the multiplexer MPX5 associated with the partitioned adder CSA2 selects either the carry signal C2.sup.1 or C2.sup.0 on the basis of the value of the real carry signal C0. In this way, the carry signals are determined sequentially. It takes a delay time of 10D to obtain the real carry signal C43 relating to the highest-order digit. This delay time corresponds to the total time taken to obtain the summation result.
The above-mentioned improvement offers a relatively high speed. However, as the number of digits to be processed increases, the time taken to obtain the operation result increases drastically.