1. Technical Field
The present invention relates in general to processors in data processing systems and in particular to normalization of floating point numbers for manipulation by such processors. Still more particularly, the present invention relates to improving normalization and associated exponent adjustment to improve overall performance of the processor in the data processing system.
2. Description of the Related Art
Processors in data processing systems are frequently required to manipulate data expressed in scientific notation or "floating point" numbers, which include a mantissa and an exponent. For a variety of reasons, including data precision, implementation considerations, and industry convention, processors frequently require a floating point number expressed in binary format to include a 1 in its leading digit or most significant bit. Thus, processors typically manipulate floating point numbers expressed in the form

$$n = (-1)^{s} \times 1.f \times 2^{\mathrm{exp}} \qquad (1)$$
where n is the number, s represents the sign of the number, f is the fractional component of the mantissa, and exp is the exponent of the radix.
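As an illustration of the form above, the following sketch splits a value into s, f, and exp and then rebuilds it. It assumes the IEEE 754 double-precision layout (1 sign bit, 11 exponent bits biased by 1023, 52 fraction bits), which the text does not name; the function names `decode_double` and `encode` are hypothetical, and only normalized values are handled.

```python
import struct

FRACTION_BITS = 52   # assumed: IEEE 754 double precision
EXPONENT_BIAS = 1023


def decode_double(x: float):
    """Split a normalized double into (s, f, exp) as used in the form above."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    s = bits >> 63
    biased = (bits >> FRACTION_BITS) & 0x7FF
    f = bits & ((1 << FRACTION_BITS) - 1)
    if biased == 0 or biased == 0x7FF:
        # Zero, denormals, infinities and NaNs have no implied leading 1.
        raise ValueError("only normalized numbers are handled in this sketch")
    return s, f, biased - EXPONENT_BIAS


def encode(s: int, f: int, exp: int) -> float:
    """Rebuild n = (-1)^s * 1.f * 2^exp from its three fields."""
    return (-1) ** s * (1 + f / 2 ** FRACTION_BITS) * 2.0 ** exp
```

For example, -6.5 is 1.625 times 2 squared with a negative sign, so it decodes to s = 1, exp = 2, and a fraction field holding 0.625.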
Processors which manipulate floating point numbers typically include a normalizer, which receives floating point numbers and adjusts the mantissa and exponent so that the mantissa has a one in its leading digit. The normalization process involves (1) finding the leading one in a floating point number's mantissa, (2) shifting the mantissa until the leading one is in the most significant bit position, and (3) adjusting the exponent to compensate for the shift of the mantissa. Normalization is most commonly performed in the context of adjusting the result of an arithmetic operation involving floating point numbers.
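The three normalization steps can be sketched in software as follows. This is a behavioral model only, not a description of any hardware normalizer; the function name `normalize`, the default mantissa width of 24 bits, and the treatment of a zero mantissa are assumptions made for illustration.

```python
def normalize(mantissa: int, exponent: int, width: int = 24):
    """Return (mantissa, exponent) with the leading one moved to bit width-1.

    `width` is a hypothetical mantissa width; the value
    mantissa * 2**exponent is preserved by the adjustment.
    """
    if mantissa.bit_length() > width:
        raise ValueError("mantissa does not fit in the given width")
    if mantissa == 0:
        # No leading one exists; this sketch returns the input unchanged.
        return 0, exponent
    # (1) Find the leading one by counting the zeros above it.
    leading_zeros = width - mantissa.bit_length()
    # (2) Shift the mantissa until the leading one is the most significant bit.
    shifted = mantissa << leading_zeros
    # (3) Adjust the exponent to compensate for the left shift.
    return shifted, exponent - leading_zeros
```

With an 8-bit width, `normalize(0b00010110, 5, width=8)` shifts left by three places and lowers the exponent from 5 to 2, leaving the represented value unchanged.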
Known normalizers require a high degree of serialized work, degrading processor performance. One known normalizer uses a leading zero anticipator (LZA) capable of predicting the "nibble," or group of four bits, in a result which will contain the leading one. The accuracy of the LZA prediction in this known normalizer is such that the error will only involve one bit position, and always in the direction of the next least significant bit within the mantissa. The LZA generates shift selects for both HEX2 (shift by groups of 16 bits) and HEX1 (shift by groups of 4 bits) shifting multiplexers. Once the HEX1 output data is available, the first four bits of the shifted mantissa may be examined using a leading zero detector (LZD) to determine the exact position of the leading one. The LZD employed by this known normalizer is relatively small in size and may be referred to as a "mini LZD." However, because the resolution of the LZA is only to a nibble, the mini LZD employed by this known normalizer requires four-input AND gates to examine the four most significant bits of the HEX1 shifting multiplexer output.
Once the position of the leading one is ascertained, shift selects for a binary (shift by 0, 1, 2, 3, or 4 bits) shifting multiplexer and the exponent adjust may be computed. The ability to shift by 4 bits is required to accommodate the possible one bit position error in the LZA prediction, which will always occur in the same direction.
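The data flow of this first known normalizer can be modeled roughly as below. This is a sketch under several assumptions: the 64-bit mantissa width is hypothetical, the LZA is not modeled (its nibble prediction is simply passed in as `predicted_nibble`, counted from the most significant nibble), and the names `mini_lzd` and `normalize_nibble_lza` are invented for illustration.

```python
WIDTH = 64                    # hypothetical mantissa width in bits
MASK = (1 << WIDTH) - 1


def mini_lzd(top_nibble: int) -> int:
    """Count leading zeros in a 4-bit value; 4 means all four bits are zero."""
    return 4 - top_nibble.bit_length()


def normalize_nibble_lza(mantissa: int, exponent: int, predicted_nibble: int):
    """Apply HEX2, HEX1 and binary shifts in series, then adjust the exponent."""
    hex2 = (predicted_nibble // 4) * 16   # shift by groups of 16 bits
    hex1 = (predicted_nibble % 4) * 4     # shift by groups of 4 bits
    after_hex2 = (mantissa << hex2) & MASK
    after_hex1 = (after_hex2 << hex1) & MASK
    # The mini LZD examines only the four most significant bits. A count of 4
    # covers the case where the prediction was off by one bit position.
    binary = mini_lzd(after_hex1 >> (WIDTH - 4))
    result = (after_hex1 << binary) & MASK
    return result, exponent - (hex2 + hex1 + binary)
```

In this model a mantissa of `1 << 40` with a correct prediction of nibble 5 is shifted by 16, 4, and 3 bits. A mantissa of `1 << 43` with a prediction of nibble 4 (one bit position too significant) leaves the top nibble all zero after HEX1, so the binary stage shifts by the full 4 bits.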
Examination of this known normalization approach from a timing perspective reveals the serialized nature of normalization. The LZA computation of shift selects may be done in parallel with the main add mechanism. However, normalization cannot be started until after the main add since the output from the main adder forms the input to the HEX2 shifting multiplexer. Once through the HEX2 shifting multiplexer, the data proceeds serially through the HEX1 shifting multiplexer and the LZD. The LZD determines the binary shift and the exponent adjust required. At this point, processing of the mantissa and exponent components diverges into separate paths. The mantissa is forwarded to the binary shifting multiplexer and then to the rounder. The exponent adjust is subtracted from the intermediate exponent to form the adjusted exponent, at which time exponent range checking and final IEEE adjustments may proceed. Thus, the output of the main adder in the known normalizer described follows a long, highly serialized path to the final floating point result.
A second known normalizer, described in U.S. Pat. No. 5,392,228, employs a different LZA/LZD combination to perform normalization. The LZA employed has an accuracy of +/- one bit position, and thus can nearly determine the position of the leading one before the add result is available. Once the main add result is available, a mask is provided from the LZA output which is ANDed with the add result to provide an accurate determination of the leading one position. This result is passed to both the shifting multiplexers and the exponent adjustment logic to compute the final floating point result. The circuits for producing the mask, ANDing the mask with the add result, and computing the leading one position may be referred to as a "big LZD." This arrangement requires significantly more area to implement than the known normalizer previously described.
Once the position of the leading one is accurately determined, processing of the mantissa and exponent components separates. The mantissa passes serially through the HEX2, HEX1, and binary shifting multiplexers. The exponent adjust is subtracted from the exponent, followed by exponent range checking and final IEEE adjustments.
Despite the increased size, the approach followed by the second known normalizer has several beneficial features. Conceptually, the approach is straightforward. The exact position of the leading one is known from the LZA/LZD combination. At this point, the second known normalizer is in a position similar to that of the first normalizer when the output of the mini LZD is available. However, the second normalizer must perform the HEX2 and HEX1 shifts as well as the binary shift, while the first normalizer need only perform the binary shift.
An additional benefit of the second normalizer's approach is that computation of the adjusted exponent may be performed in parallel with the multiplexer shifting of the mantissa. This advantage is significant since the delay through an adder is considerably longer than the delay through a multiplexer. Even the small adder required to compute the adjusted exponent may require as long as the multilevel multiplexing required to shift the mantissa. For a timing comparison of the two normalizers described, the following delays are assumed:
shifting multiplexer = 1 unit;
mini LZD = 1 unit;
big LZD = 3 units;
exponent adder = 3 units.
From the time the main adder result becomes available until the adjusted exponent is determined, the first normalizer requires processing through the HEX2 shifting multiplexer (1 unit), the HEX1 shifting multiplexer (1 unit), the mini LZD (1 unit) and the exponent adder (3 units) for a total delay of 6 units. During the same period, the second normalizer requires processing through the big LZD (3 units), and then in parallel through both: (a) the HEX2 shifting multiplexer (1 unit), the HEX1 shifting multiplexer (1 unit) and the binary shifting multiplexer (1 unit); and (b) the exponent adder (3 units). Again, the total delay is 6 units. Thus the second normalizer has approximately the same delay as the first normalizer, but requires more overall area to implement.
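The delay comparison above amounts to summing the stages that run in series and taking the maximum of the branches that run in parallel. A minimal sketch using the assumed unit delays follows; the dictionary keys and function names are hypothetical labels chosen for this example.

```python
# Assumed unit delays from the timing comparison above.
DELAY = {"mux": 1, "mini_lzd": 1, "big_lzd": 3, "exp_adder": 3}


def first_normalizer_delay(d=DELAY) -> int:
    """HEX2 mux, HEX1 mux, mini LZD and exponent adder, all in series."""
    return d["mux"] + d["mux"] + d["mini_lzd"] + d["exp_adder"]


def second_normalizer_delay(d=DELAY) -> int:
    """Big LZD, then mantissa shifting in parallel with the exponent adder."""
    mantissa_path = 3 * d["mux"]      # HEX2, HEX1 and binary multiplexers
    exponent_path = d["exp_adder"]
    return d["big_lzd"] + max(mantissa_path, exponent_path)
```

Both functions return 6 units with the assumed delays, matching the totals stated above.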
It would be advantageous to avoid the delay and size of a large LZD. It would also be desirable to avoid waiting until after the HEX2 and HEX1 shifts to begin computing an intermediate exponent, not only due to the long delay of the adder but also because this delays the start of exponent range checking and IEEE adjustments.