1. Field of the Invention
The invention relates to hardware adder circuits generally and more specifically to adders in which the carry computation is treated as a prefix problem.
2. Description of Related Art
Modulo $2^n-1$ adders are used in a range of applications, from residue number systems (RNS) and fault-tolerant computer systems to cryptography.
Beginning with residue number systems: in RNS logic, each operand is represented by its residues with respect to a set of numbers comprising the base. No number of the base may have a common factor with any other number of the base. Separate hardware units operate in parallel, one for each number of the base, and in order to keep the differences in delay among the units as small as possible, the numbers of the base are chosen to be as close in magnitude to each other as possible. Thus, the base is most often the three integers $2^n-1$, $2^n$, and $2^n+1$, and addition is done using three adders: a modulo $2^n-1$ adder, a modulo $2^n$ adder, and a modulo $2^n+1$ adder.
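The RNS arrangement described above can be illustrated with a minimal software sketch. The function names (`rns_base`, `to_rns`, `rns_add`, `from_rns`) are hypothetical, and the reconstruction step uses the Chinese remainder theorem, which the text does not discuss; it is included here only so that the result of the three independent additions can be checked.

```python
from math import prod

def rns_base(n):
    """Return the three-moduli base (2^n - 1, 2^n, 2^n + 1); pairwise coprime for n >= 2."""
    return (2**n - 1, 2**n, 2**n + 1)

def to_rns(x, base):
    """Represent x by its residue with respect to each number of the base."""
    return tuple(x % m for m in base)

def rns_add(xr, yr, base):
    """Add channel by channel; each channel stands in for one hardware adder."""
    return tuple((x + y) % m for x, y, m in zip(xr, yr, base))

def from_rns(residues, base):
    """Recover the integer from its residues (Chinese remainder theorem)."""
    M = prod(base)
    total = 0
    for r, m in zip(residues, base):
        Mi = M // m
        # pow(Mi, -1, m) is the modular inverse of Mi modulo m (Python 3.8+).
        total += r * Mi * pow(Mi, -1, m)
    return total % M

base = rns_base(4)  # (15, 16, 17), dynamic range 15 * 16 * 17 = 4080
result = from_rns(rns_add(to_rns(1234, base), to_rns(2000, base), base), base)
```

The three channel additions share no data, which is why the delays of the three adders, rather than their sum, bound the speed of an RNS addition.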
In fault-tolerant computer systems, modulo adders are used for implementing residue, inverse residue, and product (AN) arithmetic codes. In low-cost implementations of systems for handling such codes, modulo $2^n-1$ adders are used both in encoding and to implement various arithmetic operations on the encoded operands.
An important part of designing any hardware adder is designing the circuitry that performs the carry computation and generation operation. The primary objective is speed, and that can be attained by reducing the number of inputs to the gates, reducing the maximum fan-out of the circuit, and avoiding elements that make the circuit into an asynchronous sequential circuit. A secondary objective is regularity of circuit structure, which vastly improves the testability and performance of the design and provides bounded signal propagation delays from inputs to outputs and thereby reduces design time and cost.
Ways of designing the carry circuit include traditional end-around carry schemes, carry look-ahead adders, and schemes which treat carry generation in binary addition as a prefix problem.
Where the prefix computation is done in parallel, the result is a parallel-prefix adder. The modulo $2^n-1$ adder disclosed herein is a parallel-prefix adder.
In prefix problems generally, n inputs (suppose $x_{n-1}, x_{n-2}, \ldots, x_0$) and an associative operator “o” (written $\circ$ in the formulas) are used for computing n outputs (suppose $y_{n-1}, y_{n-2}, \ldots, y_0$) according to the relation
$$y_i = x_i \circ x_{i-1} \circ \cdots \circ x_0 \quad \text{for } i = 0, \ldots, n-1.$$
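As a small illustration of the relation above, the following sketch computes all n outputs serially for an arbitrary associative operator. The helper name `prefix` is hypothetical; the inputs are passed lowest index first, and string concatenation is used in the second call only to make the operand order visible.

```python
from itertools import accumulate
import operator

def prefix(xs_low_to_high, op):
    """Return [y_0, ..., y_{n-1}] with y_i = x_i o x_{i-1} o ... o x_0."""
    # accumulate calls func(running_value, next_item); the newer input x_i must be
    # the LEFT operand of the operator, so the two arguments are swapped here.
    return list(accumulate(xs_low_to_high, lambda acc, x: op(x, acc)))

sums = prefix([1, 2, 3, 4], operator.add)
order = prefix(["a", "b", "c"], operator.add)
```

A parallel implementation computes the same outputs but regroups the applications of the operator, which is permitted because the operator is associative.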
Carry computations can be treated as prefix problems by using the following associative operator $\circ$, where g is the carry generate term and p is the carry propagate term:
$$(g_m, p_m) \circ (g_k, p_k) = (g_m + p_m \cdot g_k,\; p_m \cdot p_k)$$
Note that o is not a commutative operator, since its left argument is treated differently from its right argument.
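Both properties of the operator can be checked exhaustively, since each argument is a pair of bits. The sketch below is only an illustration; the names `o`, `is_associative`, and `commutativity_counterexample` are hypothetical, and + and · are taken to be logical OR and AND.

```python
from itertools import product

def o(left, right):
    """(gm, pm) o (gk, pk) = (gm + pm*gk, pm*pk), with + as OR and * as AND."""
    gm, pm = left
    gk, pk = right
    return (gm | (pm & gk), pm & pk)

# All four possible (g, p) bit pairs.
PAIRS = [(g, p) for g in (0, 1) for p in (0, 1)]

def is_associative():
    """Check (x o y) o z == x o (y o z) for every combination of bit pairs."""
    return all(o(o(x, y), z) == o(x, o(y, z))
               for x, y, z in product(PAIRS, repeat=3))

def commutativity_counterexample():
    """Return some (x, y) with x o y != y o x, or None if no such pair exists."""
    for x, y in product(PAIRS, repeat=2):
        if o(x, y) != o(y, x):
            return x, y
    return None
```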
Next, g and p need to be defined in terms of the inputs to the adder circuits. Let $a_{n-1}a_{n-2} \ldots a_0$ and $b_{n-1}b_{n-2} \ldots b_0$ denote two n-bit input operands. Then the carry generate term $g_i$ and the carry propagate term $p_i$ are defined for $i = 0, 1, \ldots, n-1$ as:
$$g_i = a_i \cdot b_i, \qquad p_i = a_i + b_i$$
Notice that $p_i$ could also be defined as $p_i = a_i \oplus b_i$, with $\oplus$ representing the exclusive OR operation. With these definitions of g and p, the carry bit $c_i$ for each bit position i obeys the relation $c_i = G_i$, where
$$(G_i, P_i) = \begin{cases} (g_0, p_0), & \text{if } i = 0, \\ (g_i, p_i) \circ (G_{i-1}, P_{i-1}), & \text{if } 1 \le i \le n-1. \end{cases} \qquad (1)$$
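Relation (1) can be evaluated bit by bit, as in the following sketch. It is a serial evaluation intended only to make the recurrence concrete, not a model of any particular circuit; the function name `carries` and the use of Python integers to hold the operands are assumptions.

```python
def carries(a, b, n):
    """Return [c_0, ..., c_{n-1}] for n-bit operands a and b, with c_i = G_i from relation (1)."""
    out = []
    G = P = 0
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        g, p = ai & bi, ai | bi          # g_i = a_i * b_i, p_i = a_i + b_i
        if i == 0:
            G, P = g, p                  # (G_0, P_0) = (g_0, p_0)
        else:
            G, P = g | (p & G), p & P    # (g_i, p_i) o (G_{i-1}, P_{i-1})
        out.append(G)
    return out

example = carries(0b0111, 0b0001, 4)
```

For the operands 0111 and 0001, the carries out of bit positions 0, 1, and 2 are 1 and the carry out of bit position 3 is 0, which agrees with the sum 1000.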
After the carry $c_i$ has been computed as set forth above, the sum bits $s_i$ for the result of the addition can be computed as:
$$h_i = a_i \oplus b_i, \qquad s_i = h_i \oplus c_{i-1}.$$
Notice that by definition c−1=cn−11.
The parallel prefix adders which are the subject of the present discussion can be represented as shown in FIG. 1. In both FIG. 1 and FIG. 2, parentheses are used in place of subscripts. Thus, c(i) is equivalent to ci. The adder is represented as a directed acyclic graph, where the shape of each node of the graph indicates a logic operator. The node performs the operation on its inputs that is indicated by the operator. The operators of interest are indicated at the top of the figure. Thus, a square node represents logic operator 101; a black circle represents logic operator 103; and a diamond represents logic operator 105.
Any structure that implements a prefix adder which does not receive a carry input (or equivalently, whose input carry c(in) is 0) can be represented as shown at 107. Each of the nodes at position i in row 109 receives one bit from each of the operands, a(i) and b(i), and performs the first computation step involved in a binary addition operation on the operands. The results at location i in row 109 are the output h(i), which indicates the value at that bit position resulting from the application of operands a(i) and b(i) to the logic operators at row 109; the output g(i), which indicates whether a carry of 1 is to be generated; and the output p(i), which indicates whether the carry is to be propagated. The g(i) and p(i) outputs go to prefix structure 111, which is a tree structure that does the parallel carry computation. Details of the prior-art tree structures can be found in R. Zimmermann, “Binary Adder Architectures for Cell-Based VLSI and their Synthesis”, Ph.D. Thesis, Swiss Federal Institute of Technology, Zurich, 1997, available at http://www.iis.ee.ethz. Prefix structure 111 computes the carry value c(i) for each bit position from the g(i)'s and p(i)'s produced by row 109 and outputs it to row 113. The h(i)'s computed at row 109 are also inputs to row 113. At 107 the h(i) inputs to row 113 are represented by dotted lines. Row 113 then produces as its output the result s(i) for each bit position i.
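The three-part organization shown at 107 can be mimicked in software as a minimal sketch. The level-by-level combination used below for prefix structure 111 follows the well-known Kogge-Stone arrangement, chosen here purely as an assumption for illustration; the text does not state which tree structure is meant. The function name `prefix_adder` is likewise hypothetical.

```python
def prefix_adder(a, b, n):
    """Add two n-bit operands with no carry input; return (sum mod 2^n, carry out)."""
    # Row 109: per-bit half-sum, generate, and propagate signals.
    h = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]
    p = [((a >> i) | (b >> i)) & 1 for i in range(n)]

    # Prefix structure 111 (assumed Kogge-Stone style): at each level, position i
    # combines its pair with the pair d positions below it, and d doubles per level.
    G, P = g[:], p[:]
    d = 1
    while d < n:
        G_next, P_next = G[:], P[:]
        for i in range(d, n):
            G_next[i] = G[i] | (P[i] & G[i - d])
            P_next[i] = P[i] & P[i - d]
        G, P = G_next, P_next
        d *= 2
    c = G  # c(i) = G_i

    # Row 113: s(i) = h(i) xor c(i-1); the carry into bit 0 is 0 (no carry input).
    s = [h[i] ^ (c[i - 1] if i > 0 else 0) for i in range(n)]
    return sum(bit << i for i, bit in enumerate(s)), c[n - 1]

total, carry_out = prefix_adder(0b10110, 0b01111, 5)
```

All positions within one level are independent of each other, so in hardware each level costs one operator delay and the number of levels grows with the logarithm of n.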
If a $2^n$ parallel prefix adder is to receive a carry input c(in), it can be modified as shown at 115: an extra stage of logic operators 117 is added which receives not only the c(i) outputs produced by prefix structure 111 at each bit position i, but also the carry value c(in). Further, as shown in R. Zimmermann, “Efficient VLSI Implementation of Modulo ($2^n \pm 1$) Addition and Multiplication”, Proc. 14th IEEE Symp. Computer Arithmetic, pp. 158–167, April 1999, a $2^n$ parallel prefix adder 115 can be transformed into a modulo $2^n-1$ adder by using the $G_{n-1}$ result from the prefix structure as c(in) to stage 117. Both versions of adder 115 operate in two cycles. In the first cycle, a regular addition takes place. During the second cycle, c(in) is added to the result c(n−1), c(n−2), . . . , c(1), c(0) produced by the prefix structure in the first cycle. Disadvantages of adder 115 include the two-cycle operation, the extra logic stage 117, and the fact that c(in) has a fan-out of n. It is an object of the invention disclosed herein to overcome these and other disadvantages of existing modulo $2^n-1$ adders.
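The transformation described above can be sketched as follows. Several details are assumptions made for illustration: the prefix structure is evaluated serially for brevity, stage 117 is taken to merge the late carry as c'(i) = G(i) + P(i)·c(in), which is the usual rule for a carry input but is not spelled out in the text, and the function name `mod_2n_minus_1_add` is hypothetical.

```python
def mod_2n_minus_1_add(a, b, n):
    """Add n-bit operands modulo 2^n - 1 by feeding G_{n-1} back as the carry input."""
    h = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]
    p = [((a >> i) | (b >> i)) & 1 for i in range(n)]

    # First step: regular prefix computation (serial here; a tree in hardware).
    G, P = [g[0]], [p[0]]
    for i in range(1, n):
        G.append(g[i] | (p[i] & G[i - 1]))
        P.append(p[i] & P[i - 1])

    # The carry out of the most significant position becomes the carry input.
    c_in = G[n - 1]

    # Second step, stage 117 (assumed rule): merge c_in into every carry.
    # Note that c_in is consumed at all n bit positions.
    c = [G[i] | (P[i] & c_in) for i in range(n)]

    # Sum bits: s(i) = h(i) xor c(i-1), with c_in entering at bit 0.
    s = [h[i] ^ (c[i - 1] if i > 0 else c_in) for i in range(n)]
    return sum(bit << i for i, bit in enumerate(s))

r = mod_2n_minus_1_add(9, 12, 4)  # 21 mod 15
```

In this sketch the all-ones pattern can appear as a result, for example when the operands sum to exactly $2^n-1$; it is congruent to zero modulo $2^n-1$, so zero has two representations. The loop that builds `c` also makes the fan-out of c(in) visible, since the single value `c_in` is used at every bit position.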