Carry look-ahead (CLA) addition has become one of the more popular techniques for implementing fast adders in data processors. The popularity of CLA designs stems largely from their modularity, minimal area, and high speed. The recursive method of CLA addition is well known. In conventional CLA designs, fan-in limits carry look-ahead to groups of four bits; multi-level look-ahead structures are therefore used for adders with larger word sizes (e.g., 64 bits). As the word size increases, however, the delay of the conventional carry chain limits the cycle time (the minimum time required to perform the CLA addition). Thus, for high-performance data processors, implementing fast adders requires minimizing the carry-chain latency.
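The recursive, multi-level look-ahead scheme referred to above can be illustrated with a small behavioral model. The following is only a sketch of the textbook technique, not the circuit of any particular design: the function names (`lookahead4`, `reduce_gp`, `carries`, `cla_add`) are hypothetical, bits are modeled as Python integers 0 or 1, and the word width is assumed to be a power of four (4, 16, or 64) so that every level uses 4-wide groups.

```python
def lookahead4(g, p, c0):
    """One 4-wide look-ahead block (illustrative model).

    g, p: four generate/propagate signals, least significant first.
    Returns the carry into each of the four positions, plus the group
    generate G and group propagate P of the whole block.
    """
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    P = p[3] & p[2] & p[1] & p[0]
    return [c0, c1, c2, c3], G, P


def reduce_gp(g, p):
    """Upward pass: collapse a power-of-four span to a single (G, P)."""
    n = len(g)
    if n == 1:
        return g[0], p[0]
    q = n // 4
    sub = [reduce_gp(g[i * q:(i + 1) * q], p[i * q:(i + 1) * q]) for i in range(4)]
    # G and P do not depend on the carry-in, so 0 is passed as a dummy.
    _, G, P = lookahead4([s[0] for s in sub], [s[1] for s in sub], 0)
    return G, P


def carries(g, p, c0):
    """Downward pass: carry into every bit position of the span."""
    n = len(g)
    if n == 1:
        return [c0]
    q = n // 4
    sub = [reduce_gp(g[i * q:(i + 1) * q], p[i * q:(i + 1) * q]) for i in range(4)]
    cin, _, _ = lookahead4([s[0] for s in sub], [s[1] for s in sub], c0)
    out = []
    for i in range(4):
        out += carries(g[i * q:(i + 1) * q], p[i * q:(i + 1) * q], cin[i])
    return out


def cla_add(a, b, width=16, c0=0):
    """Add two unsigned integers; width must be 4, 16, or 64 in this sketch."""
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]    # bit generate
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]  # bit propagate
    c = carries(g, p, c0)
    total = sum((p[i] ^ c[i]) << i for i in range(width))  # sum bit = p XOR carry-in
    G, P = reduce_gp(g, p)
    return total, G | (P & c0)                             # (sum, carry-out)
```

With `width=64` the model uses three levels of 4-wide look-ahead. In hardware, the path up through the group generate and propagate signals and back down through the group carries corresponds to the carry chain whose delay grows with word size.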
Although several techniques have been employed to minimize the carry-chain latency, these designs generally require additional silicon area. In one such technique, a 32-bit CMOS adder is implemented in multiple-output domino logic (MODL) to compensate for the carry-chain delay. In this design the carry outputs are generated in parallel; however, the MODL design operates on 2-bit groups rather than 4-bit groups. Thus, although the MODL adder is faster than a classic CLA design, it requires more silicon area because of its non-Manchester design style. In another technique, a CLA design produces the group generate signals in parallel from the input operands but uses a conditional-sum design approach, which is again hardware intensive. Typically, previous CLA designs have not minimized the carry-chain latency without increasing the silicon area required for their implementation.