1. Field of the Invention
The present invention relates to a data processing apparatus and method for performing floating point multiplication, and in particular to a data processing apparatus and method for multiplying first and second n-bit significands of first and second floating point operands to produce an n-bit result.
2. Description of the Prior Art
A floating point number in a defined normal range can be expressed as follows:

±1.x × 2^y

where:
    x = fraction
    1.x = significand (also known as the mantissa)
    y = exponent
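As a concrete illustration of this representation, the fields of an IEEE 754 single-precision number can be unpacked as follows (a minimal Python sketch; the function name decompose_single is illustrative and not part of any apparatus described here):

```python
import struct

def decompose_single(value):
    """Split an IEEE 754 single-precision value into its sign, unbiased
    exponent y, and significand 1.x (valid for normal-range numbers)."""
    bits = struct.unpack('>I', struct.pack('>f', value))[0]
    sign = bits >> 31
    biased_exp = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF            # the 23-bit x field
    y = biased_exp - 127                  # remove the single-precision bias
    significand = 1 + fraction / 2**23    # prepend the implied '1' to form 1.x
    return sign, y, significand

# 6.5 = +1.625 * 2^2, i.e. sign 0, exponent 2, significand 1.625
sign, y, significand = decompose_single(6.5)
```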
Floating-point multiplication consists of several steps:
    1. Evaluating the input operands for special cases (in particular NaNs (Not-a-Number cases), infinities, zeros, and in some implementations subnormals). If a special case is detected, other processing may be required in place of the sequence below.
    2. Adding the exponents. The product exponent is the sum of the multiplicand and multiplier exponents. The product exponent is checked for out of range conditions. If the exponent is out of range, a result is forced, and the sequence of steps below is not necessary.
    3. The fractions are converted to significands. If the input operand was normal (as opposed to NaN, infinity, zero or subnormal), a leading ‘1’ is prepended to the fraction to make the significand. If the input operand is subnormal, a ‘0’ is prepended instead. Note that in alternative systems the subnormal operand may instead be normalized in an operand space larger than the input precision. For example, single-precision numbers have 8 bits of exponent, but an internal format may have 9 or more bits of exponent, allowing single-precision subnormal operands to be normalized in such a system.
    4. The n-bit significands are multiplied to produce a redundant set of 2n-bit vectors representing the 2n-bit product. This is typically done in an array of small adders and compressors.
    5. The two 2n-bit vectors are summed to form a non-redundant final product of 2n bits in length.
    6. This final product is evaluated for rounding. The final result may only be n bits; the lower bits contribute only to the rounding computation. If the computed product has the most significant bit set, it is said to have ‘overflowed’ the significand. In this case, as illustrated in FIG. 1, the upper n bits representing the product begin with the most significant bit, whilst the lower n bits are used in the rounding computation.
If the most significant bit of the product is not set, the resulting product (represented by bits 2n−2 to n−1) is considered ‘normal’ and the n−1 least significant bits (bits 0 to n−2) contribute to rounding.
    7. The n bits of the final product are selected. If the computed product has overflowed, bits [2n−1:n] are selected, whilst if the computed product is normal, bits [2n−2:n−1] are selected. The rounding bits corresponding to the normal or overflowed product are evaluated and a decision is made as to whether it is necessary to increment the final product.
    8. If the final n-bit product is to be incremented, a ‘1’ is added to the final product at the least significant point (i.e. bit 0 of the final product).
    9. The rounded final product is evaluated for overflow. This condition occurs when the final product was composed of all ones and the rounding increment caused the final product to generate a carry into bit n (i.e. the bit position immediately to the left of the most significant bit (bit n−1) of the final product), effectively overflowing the n bits of the result and requiring a single bit shift right and an increment of the exponent.
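Steps 4 to 9 above can be sketched in software as follows. This is an illustrative Python model only, using round-to-nearest-even as the example rounding mode; it operates on integer significands with the implied leading bit included (i.e. values in [2^(n−1), 2^n) for normal operands):

```python
def multiply_significands(a, b, n):
    """Illustrative model of steps 4-9: multiply two n-bit significands,
    select the n result bits, round to nearest even, and handle any
    overflow caused by the rounding increment."""
    product = a * b                            # steps 4/5: the 2n-bit product
    overflowed = (product >> (2 * n - 1)) & 1  # step 6: MSB set => overflowed
    if overflowed:
        result = product >> n                  # step 7: keep bits [2n-1:n]
        guard = (product >> (n - 1)) & 1       # first discarded bit
        sticky = product & ((1 << (n - 1)) - 1) != 0
    else:
        result = product >> (n - 1)            # step 7: keep bits [2n-2:n-1]
        guard = (product >> (n - 2)) & 1
        sticky = product & ((1 << (n - 2)) - 1) != 0
    if guard and (sticky or (result & 1)):     # round to nearest even
        result += 1                            # step 8: increment at bit 0
    exp_adjust = overflowed
    if result >> n:                            # step 9: carry into bit n
        result >>= 1
        exp_adjust += 1
    return result, exp_adjust
```

For example, with n = 4, multiplying 1.001 by 1.110 (significands 9 and 14) gives 1.111110, which rounds up to 10.000 and so triggers the step-9 shift and exponent increment.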
The above series of steps is inherently serial, but can be parallelised at several points. For example, it would be desirable to seek to perform the rounding evaluation and any necessary rounding increment without having to first wait for the final product to be produced.
U.S. Pat. No. 6,366,942-B1 describes a technique for rounding floating point results in a digital processing system. The apparatus accepts two floating point numbers as operands in order to perform addition, and includes a rounding adder circuit which can accept the operands and a rounding increment bit at various bit positions. The circuit uses full adders at the required bit positions to accommodate a bit from each operand and the rounding bit. Since the proper position in which the rounding bit should be injected into the addition may be unknown at the start, respective low and high increment bit addition circuits are provided to compute results for both the low and the high increment rounding bit conditions. The final result is selected based upon the most significant bit of the low increment rounding bit result. The low and high increment bit addition circuits can share a high order bit addition circuit for those high order bits where a rounding increment is not required, with this single high order bit addition circuit including half adders coupled in sequence, with one half adder per high order bit position of the first and second operands.
Hence, it can be seen that U.S. Pat. No. 6,366,942-B1 teaches a technique which enables the rounding process to be performed before the final product is produced, but in order to do this requires the use of full adders (i.e. adders that take three input bits and produce at their output a carry and a sum bit) at any bit positions where a rounding bit is to be injected.
Full adders typically take twice as long to generate output carry and sum bits as do half adders. As there is a general desire to perform data processing operations more and more quickly, this tends to lead to a reduction in the clock period (also referred to herein as the cycle time) within the data processing apparatus. As the cycle time reduces, the delays incurred through the use of the full adders described above are likely to become unacceptable.
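The timing difference referred to above can be seen from the gate-level structure of the two adder cells. In this illustrative Python sketch (a behavioural model, not any described circuit) the full adder is built from two half adders, so its carry path passes through roughly twice the logic depth of a single half adder:

```python
def half_adder(a, b):
    """Two inputs, one gate level: sum = a XOR b, carry = a AND b."""
    return a ^ b, a & b

def full_adder(a, b, cin):
    """Three inputs: composed of two half adders plus an OR gate, so the
    carry output passes through roughly twice the logic depth of a
    half adder's carry output."""
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, cin)
    return s2, c1 | c2
```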
In addition to it being desirable to seek to perform the rounding evaluation and any necessary rounding increment without having to first wait for the final product to be produced, it would also be desirable in such a system to provide an efficient technique for detecting results that are in the underflow range so as to enable a predetermined result to be returned.
The underflow range is divided into two sub-ranges. The first sub-range is immediately below the minimum normal range and is referred to as the “subnormal” range. In this range either a subnormal result or a predetermined result value is returned. As an example, the predetermined result value may, dependent on the rounding mode employed, be a minimum normal positive result, a minimum normal negative result or a signed zero value. When the returned result is a signed zero rather than a subnormal result, the mode of operation of the processor may be referred to as a “flush-to-zero” or “abrupt underflow” mode. In such a mode, all values which are below the minimum normal threshold for the target precision and which are not zero result in a signed zero value being returned, and optionally a flag signalling this event is set.
The cost of implementing the processing of subnormal operands and the returning of subnormal results is not insignificant. Many processors utilize software support to handle subnormal operands and to process underflow or potential underflow conditions. Also, many applications do not require, or cannot make use of, the extended range the subnormal range provides. In graphics applications, single-precision floating point values and computations are more than sufficient, and the range of data involved is much reduced from even the normal range. In typical graphics processing the “flush-to-zero” mode is entirely adequate, making hardware handling of subnormals an unnecessary cost. Hence, for this reason it is often the case that the flush-to-zero mode is adopted.
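A flush-to-zero substitution of the kind described can be sketched as follows (illustrative Python operating on a single-precision bit pattern; not a description of any particular hardware):

```python
def flush_to_zero(bits):
    """Flush-to-zero (abrupt underflow) check on a 32-bit single-precision
    pattern: any non-zero value below the minimum normal threshold is
    replaced by a zero of the same sign."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0 and fraction != 0:   # subnormal: magnitude below 0x00800000
        return bits & 0x80000000          # keep only the sign bit
    return bits
```

For example, the minimum positive subnormal pattern 0x00000001 flushes to +0, its negative counterpart 0x80000001 flushes to −0 (0x80000000), while the minimum normal pattern 0x00800000 passes through unchanged.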
In the second sub-range, also referred to as the “catastrophic underflow” range, the computed value is not zero and is greater than the negative minimum subnormal value and less than the positive minimum subnormal value, and a signed zero is returned for all modes.
FIG. 6 illustrates these ranges in pictorial form for positive values (this figure can be mirrored for the negative part of the axis). For the single-precision and double-precision formats (as defined in the “IEEE Standard for Binary Floating-Point Arithmetic”, ANSI-IEEE Std 754-1985, The Institute of Electrical and Electronic Engineers, Inc., New York, N.Y. 10017, hereafter referred to as the IEEE 754 standard), the values of the floating point numbers at the thresholds are as shown in Table 1 below:
TABLE 1

Threshold            Single-precision             Double-precision
Minimum subnormal    0x00000001 (1.4 × 10^−45)    0x00000000_00000001 (4.9 × 10^−324)
Minimum normal       0x00800000 (1.2 × 10^−38)    0x00100000_00000000 (2.2 × 10^−308)
Maximum normal       0x7f7fffff (3.4 × 10^38)     0x7fefffff_ffffffff (1.8 × 10^308)
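The single-precision hexadecimal patterns in Table 1 can be checked by reinterpreting the bit patterns as floating point values, for example with Python's standard struct module (an illustrative sketch; single_from_bits is not part of any described apparatus):

```python
import struct

def single_from_bits(bits):
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack('>f', struct.pack('>I', bits))[0]

min_subnormal = single_from_bits(0x00000001)   # 2^-149, approx 1.4e-45
min_normal    = single_from_bits(0x00800000)   # 2^-126, approx 1.2e-38
max_normal    = single_from_bits(0x7f7fffff)   # (2 - 2^-23) * 2^127, approx 3.4e38
```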
The IEEE 754 standard allows detection of an underflow case in the result to be done either before or after rounding, and many processors opt to make the determination before rounding. The decision as to whether a result value is in the subnormal range, and hence whether to return a predetermined result value such as a signed zero, is often made before rounding as well. If an approach is taken where the rounding takes place without waiting for the final result to be produced, detection of a value in the subnormal or catastrophic underflow range becomes more complicated, particularly near the subnormal/normal boundary, since the unrounded result value is not computed.
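The boundary complication can be illustrated numerically: a value lying just below the minimum normal threshold before rounding may round up onto the minimum normal value, so a pre-rounding subnormal-range test and a post-rounding test can disagree. An illustrative Python sketch, rounding a double-precision value to single precision:

```python
import struct

def round_to_single(x):
    """Round a Python float (double precision) to the nearest
    single-precision value."""
    return struct.unpack('>f', struct.pack('>f', x))[0]

min_normal = 2.0 ** -126
x = min_normal * (1 - 2.0 ** -30)  # below the normal range before rounding
rounded = round_to_single(x)       # lands exactly on the minimum normal value
```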
Hence it would be desirable to provide a data processing apparatus and method which, when performing floating point multiplication, enables the rounding evaluation and any necessary rounding increment to be performed without having to first wait for the final product to be produced, whilst also providing an efficient technique for detecting results that are in the underflow range, in situations where such determination would be difficult and/or prohibitively costly earlier in the flow of steps.