1. Technical Field
The present invention relates generally to an improved data processing system and method. In particular, the present invention provides a system and method for handling denormal floating point operands in a binary floating point unit which executes “fused-multiply-add/subtract” instructions when the result must be normalized.
2. Description of Related Art
To increase the speed and efficiency of real-number computations, floating point execution units in typical computers represent real numbers in a binary floating point format. In this format, a real number has three parts: a sign, a mantissa, and an exponent. The sign is a binary value that identifies whether the number is positive or negative. The mantissa is the numeric value which is multiplied by a base or radix raised to the power of the exponent, e.g., the mantissa of 145,000 is 145 (145×10^3). The mantissa is represented as a one bit binary integer and a binary fraction. The one bit binary integer is often not represented but is instead an implied value. The exponent is a binary integer that represents the power of 2 by which the mantissa is multiplied.
In most cases, the floating point execution unit represents real numbers in normalized form. This means that, except for zero, the mantissa is always made up of an integer bit of 1 and a fraction, i.e. it has the form 1.fff...ff. For example, the normalized mantissa of the single precision representation for the ordinary decimal number 178.125 is represented by the floating point execution unit as 01100100010000000000000 (with the “1.” implied). For values less than 1, leading zeros are eliminated. For each leading zero that is eliminated, the exponent is decremented by one, resulting in an exponent with a negative value.
The floating point execution unit represents exponents in a biased form. This means that a constant is added to the actual exponent so that the biased exponent is always a positive number or zero, even when the actual exponent is negative. The value of the biasing constant depends on the number of bits available for representing exponents in the floating point format being used, which depends upon which precision is used. The biasing constant is chosen so that the smallest normalized number can be reciprocated without overflow. In the above example, the biased single precision exponent for the decimal number 178.125 is represented as 10000110. Thus, in scientific notation, the number 178.125 is the combination of the normalized mantissa and the biased exponent, i.e. 1.0110010001×2^10000110, with the biased exponent written in binary.
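The 178.125 example can be checked with a short sketch. It assumes the IEEE 754 single precision layout (1 sign bit, 8 exponent bits with a bias of 127, 23 fraction bits); the function name decode_single is hypothetical and is used only for illustration.

```python
import struct

def decode_single(value: float):
    """Split a value, stored as IEEE 754 single precision, into its fields."""
    # Pack as a big-endian 32-bit float, then reinterpret as an unsigned integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                      # 1 bit
    biased_exponent = (bits >> 23) & 0xFF  # 8 bits, bias of 127 assumed
    fraction = bits & 0x7FFFFF             # 23 bits, the leading "1." is implied
    return sign, biased_exponent, fraction

sign, biased_exponent, fraction = decode_single(178.125)
print(sign, format(biased_exponent, "08b"), format(fraction, "023b"))
print("actual exponent:", biased_exponent - 127)
```

Subtracting the bias from the stored exponent recovers the actual exponent, which is the step a reader of the encoded value performs to interpret it.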
Non-zero, finite numbers may be either normalized or denormalized numbers. The normalized finite numbers comprise all the non-zero finite values that can be encoded in a normalized real number format. When real numbers become very close to zero, the normalized number format can no longer be used to represent the numbers. This is because the range of the exponent is not large enough to compensate for shifting the binary point to the right to eliminate leading zeros.
When the biased exponent is zero, smaller numbers can only be represented by making the integer bit and other leading bits of the mantissa zero. The numbers in this range are called denormalized (or tiny) numbers. The use of leading zeros with denormalized numbers allows smaller numbers to be represented. However, this denormalization causes a loss of precision, i.e. the number of significant bits in the binary fraction is reduced by the leading zeros.
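The loss of precision in the denormalized range can be illustrated as follows. This is a minimal sketch that assumes Python floats are IEEE 754 double precision values and that the platform does not flush denormals to zero; the helper is_denormal is a hypothetical name.

```python
import math
import sys

smallest_normal = sys.float_info.min        # 2**-1022, biased exponent of 1
smallest_denormal = math.ldexp(1.0, -1074)  # only the last fraction bit is set

def is_denormal(x: float) -> bool:
    """True for non-zero values below the smallest normalized magnitude."""
    return x != 0.0 and abs(x) < smallest_normal

# Three units of the smallest denormal have only two significant bits.
three_units = 3 * smallest_denormal
# Halving would need 1.5 units, which cannot be represented, so it is rounded.
round_trip = (three_units / 2) * 2

print(is_denormal(smallest_denormal), is_denormal(1.0))
print(round_trip == three_units)  # the same round trip is exact for 3.0
```

The same halve-then-double sequence is exact for a normalized value such as 3.0, because its full fraction is still available.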
When performing normalized floating point computations, a floating point execution unit typically operates on normalized numbers and produces normalized numbers as a result. Denormalized numbers represent an underflow condition. In order to address various unusual conditions which may arise and which affect the accuracy of the results, status and control bits are defined. Status bits are defined for each of several unusual conditions called exceptions, which include underflow, overflow, and division by zero exceptions. Corresponding control bits are provided to either enable or disable traps to handle these exceptions. If an exception occurs and the trap is enabled, an interrupt is taken to a program which analyzes the specific exception and takes appropriate action. When the trap is not enabled, the status bit is set and execution continues. The status bit remains set for possible examination at a later time.
If the underflow trap is not enabled (UE=0) and an underflow exception occurs, then the result is denormalized. If the trap is enabled (UE=1), however, then IEEE standard 754-1985, which defines real number operations and the representation of real numbers, requires that the final result be normalized, even though its value is below the range of normalized numbers. Since its corresponding biased exponent would be smaller than zero, it is adjusted by adding a constant value to the exponent, thus making it greater than zero. The program or procedure which subsequently handles the exception can then subtract the constant to determine its real value.
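The exponent adjustment for an enabled underflow trap can be sketched as follows. The text above does not give the constant; the value 1536 used here is the double precision bias adjust specified by IEEE 754-1985 and is stated as an assumption of this sketch. The function names are hypothetical, and exact rational arithmetic stands in for the hardware's wider internal exponent.

```python
import sys
from fractions import Fraction

BIAS_ADJUST = 1536  # assumed: IEEE 754-1985 bias adjust for double precision

def deliver_trapped_underflow(exact: Fraction) -> float:
    """Scale a too-small exact result up so it can be delivered normalized."""
    return float(exact * Fraction(2) ** BIAS_ADJUST)

def recover_in_handler(delivered: float) -> Fraction:
    """Undo the adjustment, as the trap handler would, to get the real value."""
    return Fraction(delivered) / Fraction(2) ** BIAS_ADJUST

# 1.5 x 2^-1099 lies below even the smallest denormal double (2^-1074).
tiny = Fraction(3, 2 ** 1100)
delivered = deliver_trapped_underflow(tiny)

print(delivered >= sys.float_info.min)       # delivered value is normalized
print(recover_in_handler(delivered) == tiny)
```

Because the adjustment only changes the exponent, the mantissa of the delivered value keeps its full precision.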
Denormal operands may be encountered when the floating point execution unit executes “fused-multiply-add” and “fused-multiply-subtract” instructions. In a fused-multiply-add or -subtract instruction, three operands are provided, operand A, operand B, and operand C. With known fused-multiply-add or fused-multiply-subtract instructions in a floating point pipelined dataflow, such as one found in a PowerPC™ microprocessor, operands A and C are multiplied and operand B is added to or subtracted from the unrounded product AC. When B is smaller than AC, as determined from their exponents, the mantissa of B is shifted right with respect to the AC mantissa. When B is greater than AC, then the mantissa of B is shifted left with respect to AC. However, the mantissa of B only needs to be shifted far enough to the left to avoid any overlap between the AC mantissa and the B mantissa and its guard bit, which ensures proper rounding of the result. For double precision instructions, the mantissa and its guard bit consist of 54 bits. Therefore, when the B exponent exceeds the AC exponent by more than 56, i.e. expB>expAC+56, then the B mantissa is shifted left only 56 bits with respect to AC, providing a space of just 2 bit positions between them.
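The alignment rule for double precision can be sketched as a small function. The sign convention (positive for a left shift of B relative to AC, negative for a right shift) and the function name are assumptions made for illustration; any limit on the right shift is not described above and is not modelled.

```python
MAX_LEFT_SHIFT = 56  # double precision: 54-bit mantissa plus guard bit, with 2 spare positions

def b_alignment_shift(exp_b: int, exp_ac: int) -> int:
    """Shift of the B mantissa relative to the AC mantissa.

    Positive means left, negative means right. A left shift is capped at
    MAX_LEFT_SHIFT, since beyond that point the B and AC mantissas no longer
    overlap and further shifting does not affect rounding.
    """
    return min(exp_b - exp_ac, MAX_LEFT_SHIFT)

print(b_alignment_shift(exp_b=10, exp_ac=40))   # B smaller: shifted right
print(b_alignment_shift(exp_b=40, exp_ac=10))   # B larger: shifted left
print(b_alignment_shift(exp_b=500, exp_ac=10))  # expB > expAC + 56: capped
```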
However, a problem may arise when the underflow exception is enabled (UE=1), since the final result must be normalized. The problem occurs when the three operands are all very small, such that B is denormal, and AC is much smaller than the smallest denormal number. Such a case is very rare, partly because denormal values seldom occur, and also because the denormal value must have been either an input value to the program, or else produced when underflow exceptions were not enabled (UE=0).
In such a case, B is much greater than AC, and is therefore placed to its left. However, since it is denormal, it has leading zeros which must be removed. If the final result is normalized by shifting B and AC together to the left to remove the leading zeros, part of AC may be shifted into the lower order bits of the result. To get the proper result, B may need to be placed more than 56 bits left of AC to be properly aligned. For the extreme case where B consists of 52 leading zeros in its integer bit and fraction, with just a 1 in its least significant bit position (this value is the minimum denormal number), then B may need to be shifted left by up to 52 bit positions. This would make the output bus of the alignment shifter considerably wider, i.e. by 52 bits.
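The number of leading zeros in a denormal double precision operand, and hence the left shift needed to normalize it, can be computed with a short sketch. It assumes the IEEE 754 double precision layout (11 exponent bits, 52 fraction bits); the function name is hypothetical.

```python
import math
import struct

def denormal_normalizing_shift(value: float) -> int:
    """Leading zeros in the 53-bit mantissa (integer bit plus 52 fraction bits)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    biased_exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    if biased_exponent != 0 or fraction == 0:
        raise ValueError("value is not a denormal number")
    # The integer bit of a denormal is zero, so it counts as a leading zero.
    return 53 - fraction.bit_length()

minimum_denormal = math.ldexp(1.0, -1074)  # a 1 only in the least significant bit
largest_power_of_two_denormal = math.ldexp(1.0, -1023)

print(denormal_normalizing_shift(minimum_denormal))
print(denormal_normalizing_shift(largest_power_of_two_denormal))
```

The minimum denormal number gives the worst case of 52 positions, which is the extra width the alignment shifter output bus would otherwise need.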
In known PowerPC™ implementations, a wider alignment shifter is avoided by first normalizing B when it is denormal if the underflow exception is enabled. The normal floating point processing is halted immediately while B is sent through the pipeline by itself and normalized prior to performing the multiply-add operation described above. The exponent is adjusted to compensate for normalizing the mantissa, but the corresponding exponent value is outside of the range that can be represented with the 11-bit field. Therefore, a wider exponent field, typically 13 bits, is used during execution to accommodate various intermediate results that are outside of the architected range.
Thus, in known systems, the processing of the pipeline must be halted temporarily to handle a denormal B operand. With the current trend to increase the frequency of processors beyond what the improvements in processor and circuit technology would provide, it becomes more difficult to halt the pipeline. By the time the special condition is detected, a subsequent instruction is already on its way to the floating point unit.
Therefore, it would be beneficial to have an improved system and method for handling denormal floating point operands when the result must be normalized. More specifically, it would be beneficial to have a system and method that eliminates the need for either prenormalizing the input B operand or providing a wider alignment shifter and bus.