Two common functions used in the design of logic circuits are:
        fN = x1 OR (x2 AND (x3 OR (x4 AND . . . xN . . . )))  (1)
and
        f′N = x1 AND (x2 OR (x3 AND (x4 OR . . . xN . . . ))).  (2)
These functions are important because they account for almost all of the delay of the most frequently used operations, such as addition, subtraction and comparison of (N/2)-bit numbers. Consequently, faster implementation (often referred to as “small depth”) of these functions will result in faster adder, subtractor and comparator logic circuits on the resulting integrated circuit (IC) chip. As used herein (unless the context clearly indicates otherwise), an “addition” operation includes subtraction operations and “adders” includes subtractors.
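As an illustration only, functions (1) and (2) can be evaluated by working outward from the innermost term xN. The sketch below assumes the inputs are given as a sequence of 0/1 values ordered x1, . . . , xN; the names `f`, `f_prime` and `duality_holds` are hypothetical and do not come from the source. The last helper checks the standard observation that function (2) is the De Morgan dual of function (1).

```python
from itertools import product

def f(bits):
    """Function (1): x1 OR (x2 AND (x3 OR (x4 AND ... xN)))."""
    acc = bits[-1]                      # innermost term is xN on its own
    # Walk outward from x(N-1) to x1; odd 1-based positions use OR, even use AND.
    for pos in range(len(bits) - 1, 0, -1):
        x = bits[pos - 1]
        acc = (x | acc) if pos % 2 == 1 else (x & acc)
    return acc

def f_prime(bits):
    """Function (2): x1 AND (x2 OR (x3 AND (x4 OR ... xN)))."""
    acc = bits[-1]
    # Same walk as f, with the roles of AND and OR exchanged.
    for pos in range(len(bits) - 1, 0, -1):
        x = bits[pos - 1]
        acc = (x & acc) if pos % 2 == 1 else (x | acc)
    return acc

def duality_holds(n):
    """Check f'_N(x) == NOT f_N(NOT x) for every N-bit input (illustrative helper)."""
    return all(f_prime(bits) == 1 - f([1 - b for b in bits])
               for bits in product((0, 1), repeat=n))
```

This loop performs one two-input operation per step, so it mirrors a straightforward serial evaluation rather than any small-depth circuit.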
To illustrate the effect of these functions, an addition operation of two binary numbers, A and B, may be expressed as:

\[
\begin{array}{cccccc}
     & a[1] & a[2] & a[3] & \cdots & a[N] \\
+    & b[1] & b[2] & b[3] & \cdots & b[N] \\
\hline
s[0] & s[1] & s[2] & s[3] & \cdots & s[N]
\end{array}
\]

where a[1], a[2], a[3], a[4], . . . , a[N] are the bits of operand A, b[1], b[2], b[3], b[4], . . . , b[N] are the bits of operand B, and index 1 corresponds to the most significant bit. The resultant, S, is expressed as bits s[0], s[1], s[2], s[3], . . . , s[N], where index 0 corresponds to the most significant bit. This operation can be expressed as
        s[i] = a[i] + b[i] + c[i+1] (mod 2),
where c[i] is the i-th carry bit, i>0, and s[0]=c[1].
The value of c[i] will equal 1 when at least one of the following conditions occurs (positions i+1, i+2, . . . , N being the less significant ones, since index 1 is the most significant bit):
        1) (a[i]=1 and b[i]=1);
        2) (a[i]=1 or b[i]=1) and
           (a[i+1]=1 and b[i+1]=1);
        3) (a[i]=1 or b[i]=1) and
           (a[i+1]=1 or b[i+1]=1) and
           (a[i+2]=1 and b[i+2]=1);
        4) (a[i]=1 or b[i]=1) and
           (a[i+1]=1 or b[i+1]=1) and
           (a[i+2]=1 or b[i+2]=1) and
           (a[i+3]=1 and b[i+3]=1);
        etc.
If x[i] is substituted for (a[i]=1 and b[i]=1) and y[i] is substituted for (a[i]=1 or b[i]=1), the value of c[i] can be expressed as:
        c[i] = x[i] or
               (y[i] and x[i+1]) or
               (y[i] and y[i+1] and x[i+2]) or
               (y[i] and y[i+1] and y[i+2] and x[i+3]) or
               . . .
               (y[i] and y[i+1] and y[i+2] and . . . and y[N−1] and x[N]).
Transformation of the above leads to
        c[i] = x[i] OR (y[i] AND (x[i+1] OR (y[i+1] AND . . . AND (x[N−1] OR (y[N−1] AND x[N])) . . . ))),
which is an expression of addition in the form of function (1).
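The carry derivation above can be checked with a short sketch. It assumes operands are MSB-first sequences of 0/1 values (so Python index 0 holds the bit the text calls index 1) and that no carry enters below the least significant bit; the function names are hypothetical. The nested form is evaluated from bit N upward using c[i] = x[i] OR (y[i] AND c[i+1]), which unrolls to the nested expression.

```python
from itertools import product

def carries_nested(a, b):
    """Return the carries c[1..N] (MSB-first) from the nested OR/AND form."""
    x = [ai & bi for ai, bi in zip(a, b)]    # x[i]: both bits are 1
    y = [ai | bi for ai, bi in zip(a, b)]    # y[i]: at least one bit is 1
    c = [0] * len(a)
    acc = 0                                  # assumption: no carry below the LSB
    for i in range(len(a) - 1, -1, -1):      # from bit N up to bit 1
        acc = x[i] | (y[i] & acc)
        c[i] = acc
    return c

def add_bits(a, b):
    """Sum bits s[0..N], MSB-first, using s[i] = a[i] + b[i] + c[i+1] (mod 2)."""
    c = carries_nested(a, b) + [0]           # append c[N+1] = 0
    s = [c[0]]                               # s[0] = c[1]
    s += [(a[i] + b[i] + c[i + 1]) % 2 for i in range(len(a))]
    return s

def to_int(bits):
    """Interpret an MSB-first bit sequence as an unsigned integer."""
    value = 0
    for bit in bits:
        value = 2 * value + bit
    return value

def matches_integer_addition(n):
    """Exhaustively compare add_bits with ordinary integer addition on n-bit operands."""
    return all(to_int(add_bits(a, b)) == to_int(a) + to_int(b)
               for a in product((0, 1), repeat=n)
               for b in product((0, 1), repeat=n))
```

For example, `add_bits([1, 0, 1, 1], [0, 1, 1, 0])` adds 11 and 6 and returns `[1, 0, 0, 0, 1]`, the five bits of 17.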
In a comparison operation, A>B is true in the following cases:
        1) if a[1]>b[1];
        2) if a[1]=b[1] and a[2]>b[2];
        3) if a[1]=b[1] and a[2]=b[2] and a[3]>b[3];
        etc.
(If a greater or equal function (“≧”) is used in place of the equality function (“=”), the result will be the same, except that the steps are not mutually exclusive.)
If x[i] is substituted for a[i]>b[i] and y[i] is substituted for a[i]=b[i] or a[i]≧b[i], as the case may be, then A>B can be expressed as
        x[1] OR (y[1] AND (x[2] OR (y[2] AND . . . AND (x[N−1] OR (y[N−1] AND x[N])) . . . ))),
which is an expression of comparison also in the form of function (1).
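A minimal sketch of the comparison in the form of function (1) follows, assuming MSB-first sequences of 0/1 values of equal length. The function names and the `use_geq` flag are hypothetical; the flag selects between the two choices of y[i] mentioned above (equality or greater-or-equal).

```python
from itertools import product

def greater_than(a, b, use_geq=False):
    """Evaluate A > B via x[1] OR (y[1] AND (x[2] OR (... AND x[N])))."""
    x = [ai & (1 - bi) for ai, bi in zip(a, b)]        # x[i]: a[i] > b[i]
    if use_geq:
        y = [ai | (1 - bi) for ai, bi in zip(a, b)]    # y[i]: a[i] >= b[i]
    else:
        y = [1 - (ai ^ bi) for ai, bi in zip(a, b)]    # y[i]: a[i] == b[i]
    acc = x[-1]                                        # innermost term is x[N]
    for i in range(len(a) - 2, -1, -1):                # from bit N-1 up to bit 1
        acc = x[i] | (y[i] & acc)
    return acc

def to_int(bits):
    """Interpret an MSB-first bit sequence as an unsigned integer."""
    value = 0
    for bit in bits:
        value = 2 * value + bit
    return value

def matches_integer_comparison(n, use_geq=False):
    """Exhaustively compare greater_than with the integer test A > B."""
    return all(greater_than(a, b, use_geq) == int(to_int(a) > to_int(b))
               for a in product((0, 1), repeat=n)
               for b in product((0, 1), repeat=n))
```

Both choices of y[i] give the same result on every input, which is consistent with the parenthetical remark that only mutual exclusivity of the cases is lost.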
One well-known technique for implementing functions (1) and (2) is illustrated in FIG. 1 using N−1 binary gates 100a, 100b, 100c, . . . , 100n. The advantage of this circuit is that it uses the least possible number of gates. The disadvantage of the circuit of FIG. 1 is that its depth is also N−1, so its delay grows linearly with N.
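The gate count and depth of such a chain can be reproduced with a small sketch. It assumes the chain simply follows the nesting of function (1), one two-input gate per level; the tuple layout and the signal names (`x1`, `g1`, and so on) are hypothetical and are not taken from FIG. 1.

```python
def chain_netlist(n):
    """Build function (1) as a chain of two-input gates, innermost gate first.

    Each gate is a tuple (output, kind, outer_input, inner_signal).
    """
    gates = []
    signal = f"x{n}"                          # innermost term is xN itself
    for pos in range(n - 1, 0, -1):
        kind = "OR" if pos % 2 == 1 else "AND"
        out = f"g{pos}"
        gates.append((out, kind, f"x{pos}", signal))
        signal = out                          # feeds the next gate outward
    return gates

def depth(gates):
    """Longest input-to-output path, counted in gates (primary inputs are level 0)."""
    level = {}
    for out, _kind, p, q in gates:            # gates are listed in topological order
        level[out] = 1 + max(level.get(p, 0), level.get(q, 0))
    return max(level.values(), default=0)
```

For N = 8 this yields 7 gates and a depth of 7, matching the N−1 figures stated above.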
The least possible depth for functions (1) and (2) is (log_2 N)(1+o(1)) with not more than const*N gates. See V. M. Krapchenko, “Asymptotic Estimations of Addition Time of Parallel Adder”, Syst. Theor. Res. 19, pp 105–122 (1970) and M. Minimair, “Design, Analysis and Implementation of an Adder”, Ladner & Fisher (1994). However, this technique is mainly of theoretical interest because it becomes efficient only for very large values of N, namely where N is about 3,000 or more. Moreover, for small or moderate values of N, such as where N is less than about 3,000, this technique requires significantly more than about 1.5N gates for implementation.
An efficient technique for small and moderate values of N (less than about 3,000) is described in U.S. application Ser. No. 10/017792 for “Optimization of Adder Based Circuit Architecture” and U.S. application Ser. No. 10/017802 for “Optimization of Comparator Architecture”, both filed Dec. 12, 2001 by Sergej B. Gashkov, Alexander E. Andreev and Aiguo Lu and assigned to the same assignee as the present invention. The Gashkov et al. applications employ a “building block” 102 consisting of two-input AND and OR gates 104a, . . . , 104c in the form shown in FIG. 2. Function (1) is implemented as an unbalanced binary tree 106 composed of blocks 102a, . . . , 102n, as shown in FIG. 3. Function (2) is implemented by a similar netlist, except the AND gates are replaced with OR gates and vice versa. For an N-input function fN the Gashkov et al. implementation requires approximately 1.5N binary gates, and has a depth of about log_t N, where “t” is the “golden ratio”, (1+√5)/2, or approximately 1.618. Since log_t N = (log_2 N)/(log_2 t) and 1/(log_2 t) is about 1.44, the depth of circuits employing the Gashkov et al. technique is about 1.44 log_2 N. (A balanced tree implementation would require the same complexity, but would increase the depth to 2 log_2 N.)
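The depth figures quoted above can be compared numerically. The sketch below only evaluates the leading-order formulas stated in the text (N−1, log_t N and 2 log_2 N); it does not model the building block of FIG. 2, real gate-level depths are integers that may differ from these estimates, and the function name and dictionary keys are hypothetical.

```python
import math

GOLDEN_RATIO = (1 + math.sqrt(5)) / 2         # t, approximately 1.618

def depth_estimates(n):
    """Leading-order depth estimates for an N-input function, per the formulas in the text."""
    log2_n = math.log2(n)
    return {
        "chain": n - 1,                                        # FIG. 1 style chain
        "golden_ratio_tree": log2_n / math.log2(GOLDEN_RATIO), # log_t N
        "balanced_tree": 2 * log2_n,                           # 2 log_2 N
    }

# Constant relating log_t N to log_2 N: 1 / log_2(t), about 1.44.
DEPTH_FACTOR = 1 / math.log2(GOLDEN_RATIO)
```

For N = 64, for instance, the estimates are 63 for the chain, roughly 8.6 for the unbalanced tree and 12 for a balanced tree.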