1. Field of the Invention
Embodiments of the present invention relate to techniques for performing mathematical operations in a computer system. More specifically, embodiments of the present invention relate to a technique for efficiently estimating the position of the leading zero or the leading one in the result of a floating-point ADD operation.
2. Related Art
The floating-point ADD operation is performed with hardware support in most computer systems. Because floating-point ADD operations include several separate constituent operations, the floating-point ADD operation is one of the slower mathematical operations on a computer system. Computer system designers have attempted to improve the performance of the floating-point ADD operation by parallelizing its constituent operations. To this end, some computer system designers have included a leading zero anticipator (LZA) (sometimes called a “leading zero estimator”) in floating-point circuitry to perform the floating-point ADD operation more efficiently.
LZAs predict the location of the leading (most significant) zero or the leading one bit of the result of a floating-point ADD operation in parallel with the ADD operation. More specifically, the LZA estimates the position of the left-most zero bit in the result of the addition if the result is negative or the left-most one bit in the result if the result is positive. The estimate is then used to shift the mantissa of the result when normalizing the result following the floating-point ADD operation.
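As an illustration of the quantity that the LZA estimates, the following minimal Python sketch finds the exact leading digit of a result that has already been computed. It assumes that the result is held as a two's-complement integer of a fixed width and that bit positions are numbered from 0 at the most significant bit. The function name leading_digit_index and the width parameter are hypothetical and are not taken from any particular implementation. An LZA produces an estimate of this index without waiting for the sum to be available.

```python
from typing import Optional


def leading_digit_index(result: int, width: int) -> Optional[int]:
    """Return the index (0 = most significant bit) of the left-most one
    bit for a non-negative result, or of the left-most zero bit for a
    negative result. Returns None if no such bit exists."""
    bits = result & ((1 << width) - 1)       # two's-complement view
    negative = (bits >> (width - 1)) & 1     # sign bit
    target = 0 if negative else 1            # digit to search for
    for i in range(width):
        if (bits >> (width - 1 - i)) & 1 == target:
            return i
    return None


# Positive result 0b00010110: the left-most one is at index 3.
print(leading_digit_index(0b00010110, 8))
# Negative result -22 (0b11101010): the left-most zero is at index 3.
print(leading_digit_index(-22, 8))
```

In this sketch, shifting a positive result left by the returned index moves the leading one into the most significant position, which is the shift that the normalization step needs.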
In order to estimate a location for the leading zero or the leading one, some LZAs compute a “propagate bit,” a “generate bit,” and a “kill bit” for each separate bit position i. (Note that the index i increases from left to right.) These bits can be denoted as T, G, and Z, where T is a propagate bit, G is a generate bit, and Z is a kill bit. Assuming that two terms, A and B, are to be added together (where Ai denotes the ith bit in A and Bi denotes the ith bit in B), T, G, and Z can be determined for each bit position i in A and B as follows:
Ti=Ai XOR Bi;
Gi=Ai AND Bi; and
Zi=NOT(Ai OR Bi), where “NOT” represents a logical inversion.
Moreover, T, G, and Z can be used to compute the location of the leading zero or the leading one using the expressions,
f0=NOT(T0) AND T1, and
fi=(Ti−1 AND ((Gi AND NOT(Zi+1)) OR (Zi AND NOT(Gi+1)))) OR (NOT(Ti−1) AND ((Zi AND NOT(Zi+1)) OR (Gi AND NOT(Gi+1)))), where i>0.
In the expression above for fi, if fi is equal to one (i.e., the “indicator” is “set”) for a given position and no other position of greater significance has its indicator set, then the leading digit is at either i or i+1.
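The per-bit signals and the indicator can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a description of any particular circuit: operands are treated as two's-complement values of a fixed width, index 0 is the most significant bit, and the indicator is assumed to take the symmetric form in which Gi and Zi are each compared against the neighboring signals at position i+1. Signals one position past the least significant bit are assumed to be zero. The names tgz, indicators, and first_indicator are hypothetical.

```python
def tgz(a, b, width):
    """Return per-bit propagate (T), generate (G), and kill (Z) lists,
    indexed from the most significant bit (index 0)."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    def bit(x, i):
        return (x >> (width - 1 - i)) & 1

    T = [bit(a, i) ^ bit(b, i) for i in range(width)]
    G = [bit(a, i) & bit(b, i) for i in range(width)]
    Z = [1 - (bit(a, i) | bit(b, i)) for i in range(width)]
    return T, G, Z


def indicators(a, b, width):
    """Return the list of indicator bits f[0..width-1] (width >= 2)."""
    T, G, Z = tgz(a, b, width)
    # Assumption: G and Z just past the least significant bit are 0.
    G = G + [0]
    Z = Z + [0]
    f = [(1 - T[0]) & T[1]]
    for i in range(1, width):
        after_t = (G[i] & (1 - Z[i + 1])) | (Z[i] & (1 - G[i + 1]))
        after_not_t = (Z[i] & (1 - Z[i + 1])) | (G[i] & (1 - G[i + 1]))
        f.append((T[i - 1] & after_t) | ((1 - T[i - 1]) & after_not_t))
    return f


def first_indicator(a, b, width):
    """Return the most significant position whose indicator is set."""
    return next((i for i, v in enumerate(indicators(a, b, width)) if v), None)


# 107 + (-106) = 1 = 0b00000001: the leading one is at index 7.
print(first_indicator(0b01101011, 0b10010110, 8))
# -112 + 100 = -12 = 0b11110100: the leading zero is at index 4.
print(first_indicator(0b10010000, 0b01100100, 8))
```

In the first example the first set indicator is at position 6 and the leading one is at position 7 (that is, i+1). In the second example the first set indicator is at position 4, which is also the position of the leading zero (that is, i).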
The computation of fi, along with a few other common LZA techniques, is described in “Leading Zero Anticipation and Detection—A Comparison of Methods” by Martin S. Schmookler and Kevin J. Nowka, IEEE 2001, 0-7695-1150-3/01, page 8 (hereinafter “Schmookler”). In this paper, Schmookler describes how, for positive results of a floating-point addition, the first (from left to right) occurrence of Ti−1 XOR NOT(Zi)=1 provides the index i such that the leading one is in either location i−1 or i (for leading-zero detection). Although not described in Schmookler, in a limited number of cases where the floating-point addition generates a negative result, the first occurrence of Ti−1 XOR NOT(Zi)=1 also provides the index i such that the leading zero bit is in location i or i+1 (for leading-one detection).
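The simpler test for positive results can be sketched in Python as follows. This is a minimal illustration, assuming two's-complement operands of a fixed width with index 0 at the most significant bit. Starting the scan at i=1 (the first position for which Ti−1 exists) is an assumption, and the function name first_positive_indicator is hypothetical.

```python
def first_positive_indicator(a, b, width):
    """Return the first index i (scanning left to right from i = 1)
    at which T[i-1] XOR NOT(Z[i]) equals one, or None if there is none."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    def bit(x, i):
        return (x >> (width - 1 - i)) & 1

    for i in range(1, width):
        t_prev = bit(a, i - 1) ^ bit(b, i - 1)   # propagate at i-1
        z_i = 1 - (bit(a, i) | bit(b, i))        # kill at i
        if t_prev ^ (1 - z_i):
            return i
    return None


# 107 + (-106) = 1 = 0b00000001: the leading one is at index 7.
print(first_positive_indicator(0b01101011, 0b10010110, 8))
# 96 + (-16) = 80 = 0b01010000: the leading one is at index 1.
print(first_positive_indicator(0b01100000, 0b11110000, 8))
```

In the first example the returned index is 7 and the leading one is at position 7 (that is, i). In the second example the returned index is 2 and the leading one is at position 1 (that is, i−1).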
Generally, common high-efficiency LZA implementations require a dozen gates or more in each bit position to compute the estimate for all cases (i.e., for both positive and negative results). Because the operands can include 32, 64, or more bits, the LZA can require a significant amount of integrated circuit area and can consume a significant amount of power.
Hence, what is needed is an LZA which is more efficient than the above-described LZAs.