1. Technical Field
The present invention relates to binary adders and, more particularly, to adders configured to optimize performance in the design and implementation of logic for two-operand binary addition in high-performance microprocessor systems, based upon algorithms for adjusting parallel prefix graphs.
2. Description of the Related Art
Binary addition may be formulated as a parallel prefix problem. Inputs of the binary addition may include two operands, denoted as a and b, which are n-bit binary numbers. Outputs of the binary addition are two n-bit binary numbers s (sum) and c (carry). For a, b, s, and c, bit 0 is the least significant bit (LSB), and bit n−1 is the most significant bit (MSB).
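Two n-bit intermediate signals, carry generate $g_i = a_i \cdot b_i$ (logical AND) and carry propagate $p_i = a_i \oplus b_i$ (exclusive OR), are used to formulate binary addition as a parallel prefix problem. The prefix operation may be defined as follows: $G_{i:k} + P_{i:k} G_{k-1:j} = G_{i:j}$; $P_{i:k} P_{k-1:j} = P_{i:j}$; where $i \geq k > j$, $P_{i:i} = p_i$ and $G_{i:i} = g_i$. In these expressions, + denotes logical OR and juxtaposition denotes logical AND.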
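By way of illustration only, the prefix formulation may be sketched in Python as follows. The sketch assumes a carry-in of zero, a generate signal formed by AND, and a propagate signal formed by exclusive OR, so that the carry out of bit i equals $G_{i:0}$ and sum bit i equals $p_i \oplus c_{i-1}$. The function names `prefix_op` and `prefix_add` are hypothetical, and the Kogge-Stone arrangement of prefix operations used here is merely one well-known prefix graph among many; it is not asserted to be the structure of any particular adder discussed herein.

```python
from typing import List, Tuple

GP = Tuple[int, int]  # (G, P) pair for a contiguous span of bit positions


def prefix_op(hi: GP, lo: GP) -> GP:
    """Combine span i:k (hi) with the adjacent span k-1:j (lo) into span i:j."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    # G_{i:j} = G_{i:k} + P_{i:k} G_{k-1:j};  P_{i:j} = P_{i:k} P_{k-1:j}
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)


def prefix_add(a: int, b: int, n: int) -> Tuple[int, int]:
    """Add two n-bit operands; return (sum word, carry word), each n bits wide.

    Assumption of this sketch: carry-in is 0, so carry bit i is G_{i:0}.
    """
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> i) & 1 for i in range(n)]
    g = [x & y for x, y in zip(a_bits, b_bits)]  # carry generate
    p = [x ^ y for x, y in zip(a_bits, b_bits)]  # carry propagate

    # gp[i] starts as (G_{i:i}, P_{i:i}) and ends as (G_{i:0}, P_{i:0}).
    gp: List[GP] = list(zip(g, p))
    dist = 1
    while dist < n:
        nxt = list(gp)
        for i in range(dist, n):
            # Kogge-Stone step: merge with the span ending `dist` bits lower.
            nxt[i] = prefix_op(gp[i], gp[i - dist])
        gp = nxt
        dist *= 2

    c = [big_g for big_g, _ in gp]  # c[i] is the carry out of bit i
    s_bits = [p[i] ^ (c[i - 1] if i > 0 else 0) for i in range(n)]
    s = sum(bit << i for i, bit in enumerate(s_bits))
    carry = sum(bit << i for i, bit in enumerate(c))
    return s, carry
```

For example, under these assumptions `prefix_add(5, 3, 4)` returns `(8, 7)`: the sum word is 1000 and carries are produced out of bits 0, 1, and 2. A different prefix graph would change only the order in which `prefix_op` is applied, not the result.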
There are a number of solutions that address the parallel prefix problem. In many instances, these solutions do not offer the flexibility to recover from poor decisions, nor do they provide a comprehensive solution stack (to explore several optimal solutions). One drawback of known solutions is that they provide no way to modify the prefix graph to improve performance later in a tool flow, when accurate timing information becomes available. In addition, posing the problem as a dynamic program requires constraining the prefix graph structure, which significantly reduces the space of prefix graphs that such an approach can explore. For example, the dynamic programming approach cannot find a feasible solution when constraints on both the logic levels on outputs and the maximum fanout per node are specified.
An Integer Linear Program (ILP) approach that addresses gate sizing, buffering, and structured placement for a prefix structure uses an abstract model for timing, area, and power, with no mention of choosing different prefix graph logic structures to improve the quality of the solution. A hierarchical scheme that improves the sparsity of the prefix graph by rebalancing fanout and wiring is specialized to a 64-bit adder and requires designer knowledge of gate and wire delays in a given technology to converge to a good hierarchical solution. Methods that generate a continuum of hybrid prefix structures across the three dimensions of sparsity, fanout, and radix do not provide a methodology for selecting a structure based on physical and technology constraints.
In summary, none of the existing solutions provides a plug-and-play infrastructure to address the sub-optimalities introduced into a prefix graph structure by the abstract physical models employed to generate the prefix graphs. A new solution is needed to address the inaccuracies of abstract physical models, especially in deep sub-micron technologies. As a result of these inaccuracies, a synthesized design either fails to meet timing requirements in high-performance designs or consumes too much power when timing deficiencies caused by a poor choice of prefix structure are compensated later in the flow using circuit parameters such as gate sizing, threshold voltage optimization, and supply voltage scaling.