1. Field of the Invention
This invention relates in general to data processing systems and specifically to floating point units.
2. Description of the Related Art
Floating point units (FPUs) that execute floating point addition or floating point fused multiply-add instructions may process a full-precision intermediate result in order to produce either a normalized result or a denormalized result in accordance with the IEEE 754 binary floating point standard. Normalization includes removing all non-significant bits (leading zeros) from the full-precision intermediate mantissa by left shifting and adjusting the exponent by subtracting the number of leading zeros removed. Denormalization may be required when the exponent of the normalized result is less than the minimum allowed exponent value Emin and underflow is disabled. Denormalization may include prepending non-significant bits (leading zeros) to the full-precision intermediate mantissa by right shifting and adjusting the exponent by adding the number of leading zeros prepended to the mantissa until the exponent equals Emin. Thus, for both normalization and denormalization, the exponent is adjusted by subtracting the normalization/denormalization shift count, which may be positive (indicating a left shift) or negative (indicating a right shift).
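The relationship between the two shifts and the single signed shift count may be illustrated with a minimal sketch. It is not taken from any particular FPU: the function names, the integer representation of the mantissa, and the `width` and `emin` parameters are assumptions made for illustration, and bits shifted out on the right are simply dropped (rounding and sticky-bit handling are omitted).

```python
def normalize(mantissa, exponent, width):
    """Left shift out the leading zeros and subtract their count from the exponent."""
    lz = width - mantissa.bit_length()  # leading zeros in a `width`-bit mantissa
    return mantissa << lz, exponent - lz, lz


def denormalize(mantissa, exponent, emin):
    """Right shift until the exponent reaches emin; shifted-out bits are dropped."""
    right = max(0, emin - exponent)
    return mantissa >> right, exponent + right, right


def process(mantissa, exponent, width, emin):
    """Normalize, then denormalize if needed; also report the net signed shift count."""
    m, e, left = normalize(mantissa, exponent, width)
    m, e, right = denormalize(m, e, emin)
    # Positive net count: left shift overall. Negative: right shift overall.
    return m, e, left - right


# Hypothetical 8-bit mantissa with Emin = -2
print(process(0b00010110, 5, 8, -2))   # normalized result
print(process(0b00010110, 0, 8, -2))   # denormalized, net left shift
print(process(0b10110000, -4, 8, -2))  # denormalized, net right shift
```

In each case the output exponent equals the input exponent minus the net shift count, which is the adjustment described above.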
This normalization/denormalization processing may be performed by a normalizer. The full-precision intermediate result in FPUs is typically produced by a sign-magnitude carry propagate adder. In some FPUs, the adder and normalizer are contained in separate pipeline stages called the addition stage and the normalize stage, respectively. Furthermore, in some FPUs, the shift count for the normalizer is calculated in parallel with the adder in the addition stage with the use of a leading zero anticipator (LZA).
There are two main methods used in FPUs for normalizing or denormalizing a full-precision intermediate result. The first method is the “brute-force” method, in which the LZA and normalizer are the full width of the full-precision intermediate mantissa result. In this context, the “width” of the normalizer refers to the maximum shift that it can accommodate. The brute-force design can be fully pipelined, but for high-speed designs the full-width normalizer may require two or more pipeline stages. To support denormalized results, the shift count must be limited or clamped prior to commencing the shift; otherwise, the design cannot be fully pipelined without stalling. The major advantage of the brute-force method is that the design can be fully pipelined without the need for any pipeline stalls. The major disadvantages of the brute-force method are the high area requirements of the full-width LZA and full-width normalizer and the increased delay through the normalization stage and LZA. The increased delay is equivalent to increased latency in a highly pipelined, high-speed design.
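Clamping of the shift count in a single-pass, full-width design may be sketched as follows. The sketch rests on assumptions that are not stated in the text above: the LZA count is taken to be exact, the intermediate exponent is taken to be no smaller than Emin (so that only left shifts occur), and all names are hypothetical.

```python
def clamped_shift_count(lza_count, exponent, emin):
    """Limit the left shift so the adjusted exponent never drops below emin."""
    return min(lza_count, exponent - emin)


def brute_force_normalize(mantissa, exponent, lza_count, width, emin):
    """Single pass through a full-width left shifter using the clamped count."""
    count = clamped_shift_count(lza_count, exponent, emin)
    if count < 0:
        # Assumption of this sketch: the exponent starts at or above emin.
        raise ValueError("exponent below emin is outside the scope of this sketch")
    mask = (1 << width) - 1
    return (mantissa << count) & mask, exponent - count


# Hypothetical 8-bit mantissa with three leading zeros and Emin = -2
print(brute_force_normalize(0b00010110, 5, 3, 8, -2))  # full shift, normalized
print(brute_force_normalize(0b00010110, 0, 3, 8, -2))  # clamped shift, denormalized
```

Because the count is fixed before the shift begins, one pass through the shifter yields either result and no stall is needed.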
The second method is the “iterative” method, in which a reduced-width normalizer and LZA are used and the mantissa is fed through the normalizer for a variable number of iterations, depending on the position of the leading significant bit and on whether a normalized or denormalized result is required. This requires that the pipeline be stalled during the iterations. For normalization, the maximum number of iterations is given by the full width of the un-normalized full-precision intermediate mantissa divided by the width of the normalizer. As with the brute-force method, the “width” of the normalizer refers to the maximum shift that it can accommodate. The LZA determines the shift count for the first pass through the normalizer. A leading zero detector (LZD) in parallel with the normalizer is used to determine the shift count for subsequent iterations if they are required. To produce a denormalized result, an additional pass through the normalizer can be used or the shift count can be clamped. The major advantages of the iterative method are the reduced area requirements and the reduced delay through the normalizer stage and LZA. The delay through the LZA is reduced since the LZA operation is serial and therefore a reduction in width results in a reduction in delay. The major disadvantages are the need to iterate a variable number of times through the normalizer and the need to stall preceding pipeline stages if more than one pass through the normalizer is required.
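The iteration count of the iterative method may be illustrated with a minimal sketch of normalization only. The names are hypothetical, a single leading-zero computation stands in for both the LZA (first pass) and the LZD (later passes), and rounding the quotient up when the widths do not divide evenly is an assumption of the sketch.

```python
import math


def max_iterations(full_width, normalizer_width):
    """Upper bound on passes: full mantissa width divided by normalizer width."""
    return math.ceil(full_width / normalizer_width)


def iterative_normalize(mantissa, exponent, full_width, normalizer_width):
    """Normalize with a shifter that moves at most `normalizer_width` bits per pass."""
    mask = (1 << full_width) - 1
    passes = 0
    # A zero mantissa has no leading significant bit, so no pass is made.
    while mantissa != 0 and mantissa.bit_length() < full_width:
        leading_zeros = full_width - mantissa.bit_length()
        step = min(leading_zeros, normalizer_width)  # limited by the normalizer width
        mantissa = (mantissa << step) & mask
        exponent -= step
        passes += 1  # each pass beyond the first would stall the pipeline
    return mantissa, exponent, passes


# Hypothetical 16-bit mantissa with a normalizer that shifts at most 4 bits
print(max_iterations(16, 4))
print(iterative_normalize(0x0001, 0, 16, 4))  # worst case: 15 leading zeros
print(iterative_normalize(0x0100, 0, 16, 4))  # 7 leading zeros
```

The worst case, a single significant bit in the lowest position, reaches the bound returned by `max_iterations`, while mantissas with fewer leading zeros finish in fewer passes.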
Some designs have achieved denormalized results with the use of separate denormalization units. This requires even more hardware, complicates instruction issuing and scheduling, and has a detrimental impact on performance.
What is needed is an improved floating point unit.
The use of the same reference symbols in different drawings indicates identical items unless otherwise noted.