1. Field of the Invention
This invention pertains generally to parallel leading one/zero anticipation algorithms. More particularly, the invention is an optimized system and method for parallel leading one/zero anticipation in floating-point addition which ascertains "end of run" patterns in parallel, rather than using a carry-lookahead scheme.
2. The Prior Art
In floating-point addition, the result of an operation may require a left shift during normalization, as is known in the art. Normalization is typically carried out using leading one/zero detection (LOZD) or leading one/zero anticipation (LOZA). LOZD and LOZA are described in further detail in the technical paper entitled "LEADING ONE PREDICTION -- IMPLEMENTATION, GENERALIZATION, AND APPLICATION" by Nhon Quach and Michael J. Flynn, published by Stanford University, March 1991, which is incorporated herein by reference.
FIG. 1 depicts the normalization process using an LOZD algorithm. In FIG. 1, the LOZD unit 1 receives the result of the addition from ADDER unit 2, and then performs a leading one/zero detection by counting the number of leading zeros or ones in the result. This number is then used to drive a SHIFTER unit 3 to produce the final normalized result. Because the detection of the leading one or zero cannot begin until the result has been calculated by the ADDER unit 2, normalization with LOZD is a slow process.
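By way of illustration only, the counting step performed on the adder result may be sketched in software as follows. The function names, the fixed word width, and the handling of only the leading-zero case in the shift are assumptions made for this sketch and are not taken from FIG. 1.

```python
from typing import Tuple


def leading_run(result: int, width: int) -> Tuple[int, int]:
    """Return (msb, count): the value of the most significant bit and the
    number of consecutive copies of it at the top of the word."""
    bits = format(result & ((1 << width) - 1), f"0{width}b")
    msb = bits[0]
    count = len(bits) - len(bits.lstrip(msb))
    return int(msb), count


def normalize_leading_zeros(result: int, width: int) -> int:
    """Shift leading zeros out of the word (assumed normalization rule).

    Words that start with a one, or that are all zeros, are returned as is.
    """
    mask = (1 << width) - 1
    msb, count = leading_run(result, width)
    if msb == 1 or count == width:
        return result & mask
    return (result << count) & mask
```

Because `leading_run` needs the finished sum as its input, the count cannot start before the addition completes, which is the serial dependency described above.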
FIG. 2 depicts the normalization process using an LOZA algorithm. In FIG. 2, the LOZA unit 4 calculates the number of leading zeros or ones directly from the input operands, rather than from the result calculated by ADDER unit 2. Thus the LOZA unit 4 carries out the prediction operation in parallel with the addition operation of ADDER unit 2, yielding a faster overall process than the LOZD process described in FIG. 1.
LOZA algorithms are generally based on a bit pattern detection framework. More particularly, LOZA algorithms detect bit patterns in a string generated from the operands for the floating-point addition. Where $a_i$ and $b_i$ are the $i$th bits of the input operands A and B, respectively, the LOZA algorithm generates the string according to the following formulas:
$T_i = a_i \oplus b_i$ (exclusive OR),
$Z_i = \overline{a_i \lor b_i}$ (NOR),
$G_i = a_i b_i$ (AND).
Thus for operands A=11110001 and B=00010000, the generated string is TTTGZZZT or $T^3GZ^3T$, where $T^i$ denotes a string of T's of length $i$. According to LOZA algorithms, only the following bit pattern will produce a string of leading zeros:
T*GZ* (STRING 1),
where T* denotes a string of any number of T's (including the empty string).
Likewise, only the following bit pattern will produce a string of leading ones:
T*ZG* (STRING 2).
Thus, LOZA algorithms are configured to detect the above described string patterns (STRING 1 and STRING 2) for producing leading zeros and ones, and to assert a "left shift signal" if either pattern is found. The left shift signal indicates the number of positions by which to left shift the resulting value.
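By way of illustration only, the string generation and the detection of STRING 1 and STRING 2 may be sketched in software as follows. The function names, the most-significant-bit-first ordering, and the use of regular expressions are assumptions made for this sketch; a hardware LOZA unit would evaluate the patterns with combinational logic. The sketch reports only the length of the matched run and does not model any correction that carries from lower-order bits might require.

```python
import re
from typing import Optional, Tuple


def tgz_string(a: int, b: int, width: int) -> str:
    """Encode two operands as a T/G/Z string, most significant bit first."""
    symbols = []
    for i in range(width - 1, -1, -1):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        if ai ^ bi:
            symbols.append("T")  # exclusive OR
        elif ai & bi:
            symbols.append("G")  # AND
        else:
            symbols.append("Z")  # NOR
    return "".join(symbols)


def match_leading_pattern(s: str) -> Optional[Tuple[str, int]]:
    """Match STRING 1 (T*GZ*) or STRING 2 (T*ZG*) at the start of s.

    Returns (kind, run_length), or None when neither pattern is present.
    """
    m = re.match(r"T*GZ*", s)
    if m:
        return ("zeros", m.end())
    m = re.match(r"T*ZG*", s)
    if m:
        return ("ones", m.end())
    return None
```

For the operands of the example above, `tgz_string(0b11110001, 0b00010000, 8)` yields `TTTGZZZT`, and `match_leading_pattern` reports a run of seven symbols matching STRING 1.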
It is preferred that the time for the LOZA 4 process be equal to or less than the time for the ADDER 2 process, thereby enabling the LEFT SHIFTER UNIT 3 to carry out its operation as soon as ADDER 2 completes. As such, prior art implementations of LOZA 4 units have commonly employed a scheme similar to that of the ADDER 2, namely a parallel carry lookahead (CLA) scheme. An illustrative LOZA scheme is provided in the Quach and Flynn paper noted above. Under this CLA scheme, the LOZA algorithm is carried out in log n steps.
Accordingly, there is a need for an optimized system and method for detecting leading zeros and ones which does not require a carry lookahead implementation and which improves the speed of the detection process. The present invention satisfies these needs, as well as others, and generally overcomes the deficiencies found in the background art.
The present invention is an optimized system and method for anticipating leading zeros and ones in a floating-point addition of two operands. In general, the operands of the floating-point addition are represented by a "TGZ" string according to conventional LOZA analysis, as described above, wherein:
$T_i = a_i \oplus b_i$ (exclusive OR),
$Z_i = \overline{a_i \lor b_i}$ (NOR),
$G_i = a_i b_i$ (AND).
Thus for operands A=11110001 and B=00010000, the string representing the operands (A and B) is TTTGZZZT or $T^3GZ^3T$, where $T^i$ denotes a string of T's of length $i$. The present invention operates on this string to anticipate a count of leading ones and zeros (LOZA) as described herein.
According to a first embodiment of the invention, a method for generating a count of leading zeros and ones in a floating-point addition of two operands, the two operands being represented by a "TGZ" string, comprises separating the TGZ string into a plurality of nibbles, each having at least bits identified as bit 0 and bit 1; inspecting nibble data corresponding to each nibble for an end-of-run pattern to determine if the nibble has an end-of-run; identifying the bit position of the end-of-run within each nibble having an end-of-run; identifying the most significant nibble having an end-of-run from the plurality of nibbles; and correlating the position of the most significant nibble having an end-of-run with the corresponding bit position of the end-of-run for that nibble to generate a count of leading ones and zeros.
According to another embodiment of the invention, the system comprises a plurality of nibble logic units, each configured to inspect nibble data corresponding to a respective nibble for an end-of-run pattern to determine if the nibble has an end-of-run, each nibble logic unit further configured to identify the bit position of the end-of-run for each nibble having an end-of-run; a priority encoding unit operatively coupled to each nibble logic unit, the priority encoding unit configured to identify a most significant nibble having an end-of-run from the plurality of nibbles; and a multiplexer unit operatively coupled to each nibble logic unit and to the priority encoding unit, the multiplexer unit configured to correlate the most significant nibble having an end-of-run with the corresponding bit position of the end-of-run for the most significant nibble.
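By way of illustration only, the cooperation of the nibble logic units, the priority encoding unit, and the multiplexer unit may be modeled in software as follows. The end-of-run patterns themselves are not specified in this passage, so the sketch takes a hypothetical list of per-position end-of-run flags (most significant position first) as its input. The four-bit nibble width, the function names, the counting of positions from the most significant bit of each nibble, and the formula combining nibble index and bit position are likewise assumptions made for this sketch.

```python
from typing import List, Optional, Tuple

NIBBLE_BITS = 4  # assumed nibble width


def nibble_logic(flags: List[bool]) -> Tuple[bool, int]:
    """Model of one nibble logic unit.

    Returns (has_end_of_run, position), with position counted from the
    nibble's most significant bit (assumed convention).
    """
    for pos, flag in enumerate(flags):
        if flag:
            return True, pos
    return False, 0


def priority_encode(has_eor: List[bool]) -> Optional[int]:
    """Model of the priority encoding unit: index of the most significant
    nibble that reports an end-of-run, or None if no nibble does."""
    for index, hit in enumerate(has_eor):
        if hit:
            return index
    return None


def anticipate_count(eor_flags: List[bool]) -> Optional[int]:
    """Combine nibble position and in-nibble bit position into a count."""
    nibbles = [eor_flags[i:i + NIBBLE_BITS]
               for i in range(0, len(eor_flags), NIBBLE_BITS)]
    # Each nibble is examined independently; in hardware these run in parallel.
    results = [nibble_logic(n) for n in nibbles]
    selected = priority_encode([hit for hit, _ in results])
    if selected is None:
        return None
    _, pos = results[selected]  # multiplexer: pick the selected nibble's position
    return selected * NIBBLE_BITS + pos
```

For example, if the only flag set for the string TTTGZZZT is at its final T, `anticipate_count([False] * 7 + [True])` selects the second nibble and bit position 3 within it, giving a count of 7. Since no nibble depends on the result of another, the work per nibble is independent of the operand width.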
The invention further relates to machine readable media on which are stored embodiments of the present invention. It is contemplated that any media suitable for retrieving instructions is within the scope of the present invention. By way of example, such media may take the form of magnetic, optical, or semiconductor media. The invention also relates to data structures that contain embodiments of the present invention, and to the transmission of data structures containing embodiments of the present invention.