1. Field of the Invention
The present invention relates to the field of data processing. In particular, the invention relates to an apparatus for performing a fused multiply add floating point operation.
2. Description of the Prior Art
Processors for performing arithmetic operations on floating point numbers are known. In floating point representation, numbers are represented using a significand 1.F, an exponent E and a sign bit S. The sign bit S represents whether the floating point number is positive or negative, the significand 1.F represents the significant digits of the floating point number, and the exponent E represents the position of the radix point (also known as a binary point) relative to the significand. By varying the value of the exponent, the radix point can “float” left and right within the significand. This means that for a predetermined number of bits, a floating point representation can represent a wider range of numbers than a fixed point representation (in which the radix point has a fixed location within the significand). However, the extra range is achieved at the expense of reduced precision since some of the bits are used to store the exponent. Sometimes, a floating point arithmetic operation generates a result with more significant bits than the number of bits used for the significand. If this happens then the result is rounded to a value that can be represented using the available number of significant bits.
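As a small illustration of the rounding described above, the following Python sketch assumes that Python's built-in float is a binary floating point format with a 53-bit significand. This holds for the double precision format on common platforms, but it is an assumption here, not something the text above specifies. An integer that needs one more significant bit than the significand provides cannot be stored exactly, so converting it to a float rounds it to a representable value.

```python
import sys

# Assumption: float is a binary format; mant_dig is its significand width
# (53 on platforms that use IEEE 754 double precision).
SIG_BITS = sys.float_info.mant_dig

exact = 2**SIG_BITS + 1          # needs SIG_BITS + 1 significant bits
rounded = float(exact)           # the conversion rounds to a representable value

print(exact.bit_length())        # significant bits needed by the exact value
print(int(rounded) == exact)     # False: the lowest-order bit was lost
print(int(rounded) == 2**SIG_BITS)
```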
FIG. 1 of the accompanying drawings shows how floating point numbers are stored within a register or memory. In a single precision representation, 32 bits are used to store the floating point number. One bit is used as the sign bit S, eight bits are used to store the exponent E, and 23 bits are used to store the fractional portion F of the significand 1.F. The 23 bits of the fractional portion F, together with an implied bit having a value of one, make up a 24-bit significand 1.F. The radix point is initially assumed to be placed between the implied bit and the 23 stored bits of the significand. The stored exponent E is biased by a fixed value 127 such that in the represented floating point number the radix point is shifted left from its initial position by 127-E places if E-127 is negative (e.g. if E-127=−2 then a significand of 1.01 represents 0.0101), or right from its initial position by E-127 places if E-127 is positive (e.g. if E-127=2 then a significand of 1.01 represents 101). The bias is used to make it simpler to compare exponents of two floating point values as then both negative and positive shifts of the radix point can be represented by a positive value of the stored exponent E. As shown in FIG. 1, the stored representation S[31], E[30:23], F[22:0] represents a number with the value (−1)^S * 1.F[22:0] * 2^(E-127). A single-precision floating point number in this form is considered to be “normal”. If a calculated floating point value is not normal (for example, it has been generated with the radix point at a position other than between the left-most two bits of the significand), then it is normalized by shifting the significand left or right and adjusting the exponent accordingly until the number is of the form (−1)^S * 1.F[22:0] * 2^(E-127). Exception handling routines are provided to handle numbers that cannot be represented as a normal floating point value.
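The single precision layout described above can be sketched in Python as follows. The helper names decode_single and value_of_normal are hypothetical and chosen only for this illustration. The sketch handles normal numbers only (stored exponent between 1 and 254) and does not model the exception handling mentioned above.

```python
import struct

def decode_single(x):
    """Split the 32-bit single precision encoding of x into (S, E, F)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    s = bits >> 31               # S[31]
    e = (bits >> 23) & 0xFF      # E[30:23], biased by 127
    f = bits & 0x7FFFFF          # F[22:0], fractional portion of 1.F
    return s, e, f

def value_of_normal(s, e, f):
    """Evaluate (-1)^S * 1.F * 2^(E-127); assumes a normal number."""
    return (-1) ** s * (1 + f / 2**23) * 2.0 ** (e - 127)

# 1.01 (binary) with E-127 = 2 represents 101 (binary), i.e. 5.0
print(decode_single(5.0))        # (0, 129, 2097152)
# 1.01 (binary) with E-127 = -2 represents 0.0101 (binary), i.e. 0.3125
print(decode_single(0.3125))     # (0, 125, 2097152)
```

Both example values share the same fractional portion (binary 01 followed by zeros) and differ only in the stored exponent, which is the sense in which the radix point "floats".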
A double precision format is also provided in which the floating point number is represented using 64 stored bits. The 64 stored bits include one sign bit, an 11-bit exponent and the 52-bit fractional portion F of a 53-bit significand 1.F. In double precision format the exponent E is biased by a value of 1023. Thus, in the double precision format a stored representation S[63], E[62:52], F[51:0] represents a floating point value (−1)^S * 1.F[51:0] * 2^(E-1023).
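Going in the opposite direction, the following sketch assembles a double precision value from its three stored fields. The function name encode_double is hypothetical, and the sketch assumes that the platform's 64-bit float uses the layout described above.

```python
import struct

def encode_double(s, e, f):
    """Pack S (1 bit), E (11 bits), F (52 bits) and reinterpret as a double."""
    assert s in (0, 1) and 0 <= e < 2**11 and 0 <= f < 2**52
    bits = (s << 63) | (e << 52) | f     # S[63], E[62:52], F[51:0]
    return struct.unpack(">d", struct.pack(">Q", bits))[0]

print(encode_double(0, 1023, 0))         # 1.0  (1.0 * 2^0)
print(encode_double(1, 1024, 1 << 51))   # -3.0 (-(1.1 binary) * 2^1)
```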
Hereafter the present invention shall be explained with reference to the double precision floating point format. However, it will be appreciated that the invention could also be applied to the single precision format (or any other floating point format) and that the bit values shown in subsequent Figures could be replaced by values appropriate to the floating point format being used.
One commonly used floating point operation is a multiply add operation A+B*C, in which two operands are multiplied together and the product of those two operands is added to a third operand. The multiply add operation is also known as a multiply accumulate operation. It is possible to implement a multiply add operation using independent multiply and add units operating in succession. This approach, known as a split chained multiply add, performs floating point rounding on two occasions (once during the multiply operation and once during the add operation). Each rounding step results in a loss of precision and so the split chained multiply add can produce inaccurate results (especially when calculating quantities such as reciprocals, which often cannot be represented exactly). The split chained multiply add operation is also relatively slow, and so most floating point units instead provide a specialized fused multiply add unit that performs the multiply add operation as an atomic operation.
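The effect of rounding twice can be seen in a short Python sketch. It assumes Python floats are double precision values, and it emulates the single final rounding with exact rational arithmetic from the standard fractions module. This is a software stand-in for illustration, not a description of any hardware unit, and both function names are hypothetical.

```python
from fractions import Fraction

def split_multiply_add(a, b, c):
    product = b * c          # first rounding (multiply)
    return a + product       # second rounding (add)

def single_rounding_multiply_add(a, b, c):
    # Exact rational value of A+B*C, rounded to a float only once.
    exact = Fraction(a) + Fraction(b) * Fraction(c)
    return float(exact)

# B*C is exactly 1 - 2^-60, which rounds to 1.0 in double precision.
a, b, c = -1.0, 1.0 + 2.0**-30, 1.0 - 2.0**-30
print(split_multiply_add(a, b, c))             # 0.0: the small term is lost
print(single_rounding_multiply_add(a, b, c))   # equals -2**-60
```

With these operands the split chained result loses every significant bit of the true answer, while the single rounding returns it exactly.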
FIG. 2 of the accompanying drawings shows an example of a fused multiply add unit, such as the one proposed by Montoye et al. in U.S. Pat. No. 4,969,118. The fused multiply add unit receives three operands A, B, and C and processes the operands to output the multiply accumulate result A+B*C. Addition of the operand A and the product B*C requires alignment of the significands of the operand A and the product B*C such that their exponents are the same. To align the significands, the fused multiply add unit carries out shifting of the operand A in parallel with computation of the product B*C. Since some of the add processing is performed at the same time as the multiply processing, the fused multiply add unit can perform a multiply add operation more quickly than a split chained multiply add unit. The fused multiply add unit is also more accurate than the split chained multiply add because it performs rounding only on the final result and not on the intermediate product B*C. The present invention seeks to make further improvements to the fused multiply add unit.
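The alignment step described above can be modelled in software with integer significands. The following Python sketch is a simplified, sequential model for finite double precision operands only. It is not the circuit of FIG. 2, it does not perform the shift in parallel with the multiplication, and the names decompose and fma_model are hypothetical.

```python
import math
from fractions import Fraction

def decompose(x):
    """Return integers (m, e) with x == m * 2**e; x must be finite."""
    frac, exp = math.frexp(x)                # x == frac * 2**exp, 0.5 <= |frac| < 1
    return int(frac * (1 << 53)), exp - 53   # frac * 2**53 is an exact integer

def fma_model(a, b, c):
    ma, ea = decompose(a)
    mb, eb = decompose(b)
    mc, ec = decompose(c)
    mp, ep = mb * mc, eb + ec                # exact product B*C, up to 106 bits
    e = min(ea, ep)                          # common exponent for both addends
    total = (ma << (ea - e)) + (mp << (ep - e))   # align significands, then add
    return float(Fraction(total) * Fraction(2) ** e)   # round only the final result

print(fma_model(1.0, 2.0, 3.0))              # 7.0
print(fma_model(-1.0, 1.0 + 2.0**-30, 1.0 - 2.0**-30) == -(2.0**-60))  # True
```

Python's arbitrary-precision integers stand in for the wide datapath here: no bits of the product are discarded before the addition, so the only rounding is the final conversion to a float.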