1. Technical Field
The present invention relates generally to a binary adder, and more particularly to a multi-level carry look-ahead adder. The invention specifically relates to a multi-level carry look-ahead adder implemented as an array of regularly spaced rows and columns of logic cells in a datapath.
2. Description of the Related Art
A digital computer typically includes a multiplicity of binary adders. At least one binary adder, for example, is used in an integer arithmetic logic unit (ALU) for performing addition, subtraction, multiplication, and division. A floating-point processor requires at least two adders, one for processing the mantissa, and another for processing the exponents. Additional adders are typically used for computing relative addresses for memory access or branch instructions.
In many digital computer designs, the speed of the computer is limited by the time required to perform an addition or subtraction in the arithmetic logic unit. That time is in turn typically limited by the time required to generate the "carry out" of the addition or subtraction, because the "carry out" is a logical function of all of the input bits and the "carry in" to the adder or subtractor. Due to the large number of logical inputs defining the carry function, it is impractical to implement the carry function in just a few levels of gates, and instead the carry out is generated by intermediate carry signals that propagate through the adder.
Carry propagation is best understood with reference to a known adder design 20 shown in FIG. 1, which adds an augend A to an addend B and a carry-in $C_{-1}$ to obtain a sum S and a carry-out $C_{n-1}$. It is assumed that A, B, and S are n-bit binary numbers. In other words, $A = A_{n-1} \ldots A_1 A_0$, $B = B_{n-1} \ldots B_1 B_0$, and $S = S_{n-1} \ldots S_1 S_0$.
The adder 20 shown in FIG. 1 generates a number of intermediate functions. A "carry generate" function G is defined as:
$$G_i = A_i \cdot B_i$$
The carry generate function G indicates that a carry originates at the ith stage of the adder. A "carry propagate" function P is defined as:
$$P_i = A_i \oplus B_i$$
The carry propagate function P is true when the ith stage of the adder will pass the incoming carry $C_{i-1}$ to the next higher stage. Moreover, when $P_i$ is generated by an exclusive-OR function between $A_i$ and $B_i$, the carry propagate function P also indicates the "half sum" of $A_i$ and $B_i$. In this case, the carry $C_i$ and the full sum $S_i$ from the ith stage are related to the generate bit $G_i$ and propagate bit $P_i$ by:
$$C_i = G_i + P_i \cdot C_{i-1}$$
$$S_i = P_i \oplus C_{i-1}$$
As illustrated in FIG. 1, the digital logic for the adder 20 can use gates having a low fan-in and a low fan-out, and the gates can be arranged as an array of regularly spaced rows and columns of logic cells in a datapath. The datapath in the adder 20 extends from the top (the A and B inputs) to the bottom (the S outputs). The cells include a first row 21 of propagate-generate bit cells 22, 23; a second row 24 of carry bit cells 25, 26; and a third row 27 of sum bit cells 28, 29. Each propagate-generate bit cell 22, 23 in the ith column or bit position of the adder 20 includes a respective AND gate 31, 32 providing the generate bit $G_i$, and a respective exclusive-OR gate 33, 34 providing the propagate bit $P_i$. Each carry bit cell 25, 26 includes a respective AND gate 35, 36 and a respective OR gate 37, 38 which together provide the carry bit $C_i$. Each sum bit cell 28, 29 includes a respective exclusive-OR gate 40, 41 providing the sum bit $S_i$.
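The three rows of cells described above can be modeled in software. The following Python sketch is illustrative only and is not taken from the source; the helper names `to_bits`, `from_bits`, and `ripple_carry_add` are hypothetical, and bits are assumed to be stored least-significant first.

```python
def to_bits(value, n):
    """Return an n-entry list of bits; index i holds bit position i."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Inverse of to_bits: rebuild the integer from a little-endian bit list."""
    return sum(bit << i for i, bit in enumerate(bits))


def ripple_carry_add(a_bits, b_bits, carry_in=0):
    """Model the three rows of the FIG. 1 style adder (illustrative only)."""
    if len(a_bits) != len(b_bits):
        raise ValueError("operands must have the same width")

    # Row 1: propagate-generate cells (one AND and one XOR per column).
    g = [a & b for a, b in zip(a_bits, b_bits)]
    p = [a ^ b for a, b in zip(a_bits, b_bits)]

    # Row 2: carry cells, C_i = G_i + P_i * C_{i-1}, chained column to column.
    carries = []
    c = carry_in
    for g_i, p_i in zip(g, p):
        c = g_i | (p_i & c)
        carries.append(c)

    # Row 3: sum cells, S_i = P_i xor C_{i-1}.
    carry_ins = [carry_in] + carries[:-1]
    s = [p_i ^ c_prev for p_i, c_prev in zip(p, carry_ins)]

    carry_out = carries[-1] if carries else carry_in
    return s, carry_out
```

For example, `ripple_carry_add(to_bits(11, 4), to_bits(6, 4))` yields sum bits representing 1 with a carry-out of 1, that is, 17. The loop in row 2 is the software counterpart of the serial carry chain: each iteration depends on the previous one.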
One disadvantage of the adder circuit 20 is that the speed of the adder is limited by the time for a carry signal to propagate left-to-right through the chain of carry bit cells 25, 26 from the carry input $C_{-1}$ to the carry output $C_{n-1}$. In particular, the carry propagation time is a linear function of the number of columns n in the adder, and therefore the adder 20 is very slow when it has a large number n of columns or bits. A known solution to this problem is to use carry look-ahead logic to reduce the time for generating the more significant carry bits. The carry look-ahead logic has logic gates for more directly solving the carry function:
$$C_{n-1} = G_{n-1} + G_{n-2} P_{n-1} + \cdots + C_{-1} P_0 P_1 \cdots P_{n-1}$$
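The expanded carry function above can be checked against the bit-serial carry chain. The following Python sketch is illustrative only; the function names are hypothetical, and the generate and propagate bits are assumed to be supplied as lists indexed by bit position.

```python
from itertools import product


def lookahead_carry_out(g, p, carry_in=0):
    """Evaluate C_{n-1} as a flat sum of products (two logic levels)."""
    n = len(g)
    result = 0
    # One product term per generate bit: G_j * P_{j+1} * ... * P_{n-1}.
    for j in range(n):
        term = g[j]
        for k in range(j + 1, n):
            term &= p[k]
        result |= term
    # Final term: C_{-1} * P_0 * P_1 * ... * P_{n-1}.
    term = carry_in
    for k in range(n):
        term &= p[k]
    return result | term


def ripple_carry_out(g, p, carry_in=0):
    """Evaluate C_{n-1} by applying C_i = G_i + P_i * C_{i-1} bit by bit."""
    c = carry_in
    for g_i, p_i in zip(g, p):
        c = g_i | (p_i & c)
    return c


def formulas_agree(n):
    """Exhaustively compare both evaluations for every n-bit G, P, and carry-in."""
    for g in product((0, 1), repeat=n):
        for p in product((0, 1), repeat=n):
            for cin in (0, 1):
                if lookahead_carry_out(g, p, cin) != ripple_carry_out(g, p, cin):
                    return False
    return True
```

In the flat form, the widest product term has n+1 inputs and the final OR combines n+1 terms, which is why the direct form calls for multi-input gates as n grows.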
In general, the equation $C_i = G_i + P_i \cdot C_{i-1}$ is known as a "recurrence relation," and repeated application of the recurrence relation computes the carry function. Cells of logic gates which together compute the carry function are known as "recurrence solvers."
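As a worked instance of this repeated application (an illustration added here, not taken from the cited references), unrolling the recurrence for a two-bit adder (n = 2) gives:

$$
\begin{aligned}
C_0 &= G_0 + P_0 C_{-1} \\
C_1 &= G_1 + P_1 C_0 = G_1 + G_0 P_1 + C_{-1} P_0 P_1
\end{aligned}
$$

The last expression has the same form as the general carry function for $C_{n-1}$ with n = 2.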
As disclosed in Kai Huang, Computer Arithmetic, John Wiley & Sons, New York, N.Y., 1979, pp. 84-90, the carry function can be computed by "block carry generate" $G^*$ and "block carry propagate" $P^*$ functions in multi-level circuits. Shown in FIG. 3.13 on page 90 of Huang, for example, is a two-level carry look-ahead adder with a 32-bit word length arranged in an 8-by-4 configuration. The carry generation logic includes an upper level of eight four-bit block-carry look-ahead units and a lower level having an 8-bit carry look-ahead unit. Each four-bit block-carry look-ahead unit generates block carry generate and block carry propagate functions, for i=3, 7, 11, 15, 19, 23, 27, and 31:
$$G_i^* = G_i + G_{i-1} P_i + G_{i-2} P_i P_{i-1} + G_{i-3} P_i P_{i-1} P_{i-2}$$
$$P_i^* = P_i P_{i-1} P_{i-2} P_{i-3}$$
The lower-level unit generates the carry functions $C_i$ for i=3, 7, 11, 15, 19, 23, 27, and 31 according to:
$$C_i = G_i^* + G_{i-4}^* P_i^* + \cdots + C_{-1} P_i^* P_{i-4}^* \cdots P_3^*$$
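The two-level arrangement described above can be sketched in software as follows. This is a minimal functional model and not the circuit of the cited reference; the names `gp_from_ints`, `block_generate_propagate`, and `block_carries` are hypothetical, and the word length is assumed to be a multiple of four.

```python
BLOCK = 4  # bits per upper-level block-carry look-ahead unit


def gp_from_ints(a, b, n):
    """Per-bit generate and propagate lists for two n-bit integers."""
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    return g, p


def block_generate_propagate(g, p, i):
    """Upper level: G*_i and P*_i for the four-bit block whose top bit is i."""
    g_star = (g[i]
              | (g[i - 1] & p[i])
              | (g[i - 2] & p[i] & p[i - 1])
              | (g[i - 3] & p[i] & p[i - 1] & p[i - 2]))
    p_star = p[i] & p[i - 1] & p[i - 2] & p[i - 3]
    return g_star, p_star


def block_carries(g, p, carry_in=0):
    """Lower level: return {i: C_i} for i = 3, 7, ..., n-1."""
    n = len(g)
    if n % BLOCK:
        raise ValueError("width must be a multiple of the block size")
    tops = range(BLOCK - 1, n, BLOCK)
    stars = {i: block_generate_propagate(g, p, i) for i in tops}

    carries = {}
    for i in tops:
        # C_i = G*_i + G*_{i-4} P*_i + ... + C_{-1} P*_i P*_{i-4} ... P*_3
        c = 0
        prod = 1  # product of the P* terms for the blocks visited so far
        for j in range(i, -1, -BLOCK):
            g_star, p_star = stars[j]
            c |= g_star & prod
            prod &= p_star
        carries[i] = c | (carry_in & prod)
    return carries
```

For a 32-bit word, `block_carries` returns the eight block-boundary carries. The carries at the remaining bit positions would still have to be produced inside each block, which this sketch omits.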
Disadvantages of the circuit in FIG. 3.13 of Huang are the need for multi-input logic gates, and the absence of regular gate cells for the carry logic at the columns or bit positions in the data path of the adder.
General design techniques for high-speed and area-efficient very-large-scale integrated circuit (VLSI) technology have been the subject of continuing research. As observed by Ong et al., "A comparison of ALU structures for VLSI technology," Proceedings of the 6th Symposium on Computer Arithmetic, IEEE, Piscataway, N.J. (1983), pp. 10-16, there is a continuing need to reevaluate the design techniques in the context of developments in VLSI circuit technology. Furthermore, recent work in complexity of algorithms, particularly the solution of recurrence relations, suggests new candidate structures for generating the carry vector and raises questions as to their practicality in modern logic design practice. Floor plans for two-bit and four-bit look-ahead carry assimilations for 16-bit adders are shown in FIGS. 5 and 6 of Ong et al. A floor plan of a 16-bit adder suggested by recurrence solvers is shown in FIG. 9 of Ong et al., and this floor plan includes four rows of carry-logic cells.
A carry-skip scheme is disclosed in Oklobdzija et al., "Some optimal schemes for ALU implementation in VLSI technology," Proceedings of the 7th Symposium on Computer Arithmetic, IEEE, Piscataway, N.J. (1985), pp. 2-8. The carry-generate portion, which consumes a large amount of logic, is eliminated. As in a carry look-ahead adder, the bits to be added are divided into groups. A circuit is provided for detecting when a carry signal entering a group will ripple through the group. When this condition is detected, the carry is allowed to skip over the group.
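The group and skip behavior described above can be modeled as follows. This Python sketch is a functional illustration only and does not reproduce the circuit of the cited paper; the function names and the fixed group size are assumptions.

```python
def to_bits(value, n):
    """Return an n-entry list of bits; index i holds bit position i."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Rebuild the integer from a little-endian bit list."""
    return sum(bit << i for i, bit in enumerate(bits))


def carry_skip_add(a_bits, b_bits, carry_in=0, group=4):
    """Functional model of a carry-skip adder with fixed-size groups."""
    if len(a_bits) != len(b_bits):
        raise ValueError("operands must have the same width")
    n = len(a_bits)
    g = [a & b for a, b in zip(a_bits, b_bits)]
    p = [a ^ b for a, b in zip(a_bits, b_bits)]

    s = []
    group_carry_in = carry_in
    for start in range(0, n, group):
        idx = range(start, min(start + group, n))

        # Inside the group the carry ripples bit by bit.
        c = group_carry_in
        for i in idx:
            s.append(p[i] ^ c)
            c = g[i] | (p[i] & c)

        # Skip detection: every bit of the group propagates, so the carry
        # entering the group is forwarded directly to the next group.
        skip = all(p[i] for i in idx)
        group_carry_in = group_carry_in if skip else c

    return s, group_carry_in
```

When the skip condition holds, the rippled carry `c` has the same value as the forwarded carry, so the skip path does not change the result. Its benefit in hardware is timing, which a software model like this one does not capture.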
Graph representations for designing area-time efficient VLSI adders are disclosed in Han et al., "Fast area-efficient VLSI adders," Proceedings of the 1987 Symposium on Computer Architecture, IEEE, Piscataway, N.J. (1987), pp. 49-56. When a prefix graph is used as a basis for designing binary addition circuitry in VLSI, each node of the graph represents a set of logic equations. Thus, each node can be thought of as a processing element that will be expanded from being a point in the graph to occupy a fixed amount of area in the layout. For binary addition, four types of processing elements can be used: pggen, black, white, and sum. The pggen cell produces initial p and g signals (carry propagation and generation signals). The black cell combines a pair of p signals and a pair of g signals to generate a p and g signal at a lower level. Two different types of black cells are used: a positive input, negative output cell; and a negative input, positive output cell. The white cell is a simple inverter that inverts a p signal and a g signal. The sum cell generates the sum bit from a propagate bit, a generate bit, and two carry bits. Because the carries produced by the carry generation circuitry alternate between being positive and negative, there are two types of sum cells: one type takes two carries without inversion, and the other takes two carries with inversion. The carry look-ahead adder based on the hybrid prefix algorithm is densely packed by using a folding method. The folding method places two levels of the prefix graph into one level of the layout, since space is available to embed cells.
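The role of the black cell in a prefix graph can be illustrated with a short Python sketch. The sketch uses a simple doubling-distance prefix scan rather than the hybrid prefix algorithm of Han et al., and it does not model the white cells, the alternating signal polarities, or the folded layout. The names `black_cell` and `prefix_carries`, and the treatment of the carry-in as an extra node below bit 0, are assumptions made for illustration.

```python
def black_cell(hi, lo):
    """Combine two (g, p) pairs; hi covers the more significant span."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def prefix_carries(g, p, carry_in=0):
    """Return ([C_0, ..., C_{n-1}], number of black-cell levels used)."""
    # Assumption: the carry-in is modeled as a node below bit 0 that
    # generates when carry_in is 1 and never propagates.
    nodes = [(carry_in, 0)] + list(zip(g, p))

    levels = 0
    dist = 1
    while dist < len(nodes):
        # Every node at index k >= dist absorbs the node dist positions
        # below it; all cells of one level read the previous level's values.
        nodes = [black_cell(nodes[k], nodes[k - dist]) if k >= dist else nodes[k]
                 for k in range(len(nodes))]
        dist *= 2
        levels += 1

    # After the scan, the g output at index i + 1 is the carry out of bit i.
    carries = [g_k for g_k, _ in nodes[1:]]
    return carries, levels
```

Each pass of the `while` loop corresponds to one row of black cells, and the span covered by each node doubles per row, so the number of rows grows with the logarithm of the word length rather than linearly.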
As is evident from the above references, recurrence solvers have the advantage that the number of gate levels required to calculate the carry for large groups of bits grows slowly as a function of the number of bits. But the previously implemented or proposed recurrence solvers have had high fan-out, many long interconnections, or excessive levels of gates, which have resulted in a relatively slow complementary metal-oxide-semiconductor (CMOS) implementation.