The present invention relates to floating-point processors, and more particularly to floating-point processors having improved accuracy for multiply-add (Madd) operations.
In digital processing systems, numerical data is typically expressed using integer or floating-point representation. Floating-point representation is preferred in many applications because of its ability to express a wide range of values and its ease of manipulation for some specified operations. A floating-point representation typically includes three components: a sign bit (sign), a mantissa (mant) that is sometimes referred to as a significand, and an exponent (exp). The represented floating-point number can be expressed as $(-1)^{\text{sign}} \cdot \text{mant} \cdot 2^{\text{exp}}$. Floating-point representations are also defined by “IEEE Standard for Binary Floating-Point Arithmetic,” which is referred to herein as the IEEE-754 standard (or simply the IEEE standard) and incorporated herein by reference in its entirety for all purposes.
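The three components can be made concrete with a minimal sketch that unpacks a value stored in the IEEE-754 single-precision format. The function name `decode_float32` and the choice of single precision are assumptions made for illustration only; the text above does not prescribe a particular format or routine.

```python
import struct

def decode_float32(x):
    """Split x, stored as IEEE-754 single precision, into (sign, mant, exp)
    so that x == (-1)**sign * mant * 2**exp."""
    # Reinterpret the 32-bit pattern of the single-precision value as an integer.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    biased_exp = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if biased_exp == 0xFF:
        raise ValueError("infinity and NaN are not handled in this sketch")
    if biased_exp == 0:
        # Zero or denormalized number: no implicit leading 1, fixed exponent.
        mant = fraction / 2**23
        exp = -126
    else:
        # Normalized number: implicit leading 1, exponent bias of 127.
        mant = 1 + fraction / 2**23
        exp = biased_exp - 127
    return sign, mant, exp

sign, mant, exp = decode_float32(-6.5)
print(sign, mant, exp)                     # 1 1.625 2
print((-1) ** sign * mant * 2.0 ** exp)    # -6.5
```

Here −6.5 decodes to sign = 1, mant = 1.625, and exp = 2, which reproduces the value through the expression given above.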
Many operations can be performed on floating-point numbers, including arithmetic operations such as addition, subtraction, and multiplication. For arithmetic operations, the IEEE standard provides guidelines to be followed to generate a unique answer for each floating-point operation. In particular, the IEEE standard describes the processing to be performed on the result from a particular operation (e.g., multiply, add), the precision of the resultant output, and the data format to be used. For example, the IEEE standard defines several rounding modes available for the results from add and multiply operations, and the bit position at which the rounding is to be performed. The requirements ensure identical results from different implementations of IEEE-compliant floating-point processors.
Many applications perform multiplication on two operands and addition (or subtraction) of the resultant product with a third operand. This multiply-add (or Madd) operation is common, for example, in digital signal processing where it is often used for computing filter functions, convolution, correlation, matrix transformations, and other functions. The Madd operation is also commonly used in geometric computation for (3-D) graphics applications.
Conventionally, a Madd operation can be achieved by sequentially performing a multiply (MUL) operation followed by an add (ADD) operation. Performing the operations sequentially results in a long processing delay. Improved performance can often be obtained by performing the Madd operation using a specially designed unit that also supports conventional floating-point multiplication and addition.
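The conventional two-step sequence can be modeled in a few lines. The sketch below assumes single-precision operands and round-to-nearest, and the helper names `round_f32`, `madd_sequential`, and `madd_single_rounding` are hypothetical. The single-rounding variant is included only as a point of comparison; it is not a description of any particular Madd unit. The operand values are chosen so that the double-precision arithmetic used internally is exact.

```python
import struct

def round_f32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def madd_sequential(a, b, c):
    """MUL followed by ADD, each delivering its own rounded single-precision result."""
    product = round_f32(a * b)       # result of the MUL operation, rounded
    return round_f32(product + c)    # result of the ADD operation, rounded again

def madd_single_rounding(a, b, c):
    """Hypothetical comparison: keep the product unrounded and round only once."""
    return round_f32(a * b + c)

a = b = 1 + 2.0 ** -12               # exactly representable in single precision
c = -(1 + 2.0 ** -11)                # exactly representable in single precision

print(madd_sequential(a, b, c))       # 0.0
print(madd_single_rounding(a, b, c))  # 2**-24, about 5.96e-08
```

The exact product is 1 + 2^−11 + 2^−24. Rounding it to single precision discards the 2^−24 term, so the sequential form returns zero while the exact answer is 2^−24.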
For Madd operations, post-processing is typically performed on the intermediate result from the multiply portion. To obtain a final Madd output that fulfills the IEEE rounding requirements, the post-processing includes possible denormalization and rounding of the intermediate result in accordance with one of the rounding modes defined by the IEEE standard. Denormalization is performed on a denormalized number (i.e., a non-zero number whose magnitude is smaller than that of the smallest representable normalized number, and which therefore lies between $-a_{\min}$ and $+a_{\min}$) to place the denormalized number in a proper format such that rounding can be performed at the bit location specified by the IEEE standard. The post-processing (or more specifically, the denormalization and rounding) to generate an IEEE-compliant Madd result typically leads to reduced accuracy (since some bits may be discarded during the denormalization and rounding), increased hardware complexity, and increased processing time. To reduce hardware complexity and improve processing time, some Madd architectures provide an additional operating mode in which numbers (e.g., intermediate results) smaller in magnitude than the smallest representable normalized number are set or flushed to zero, or to some other value such as $a_{\min}$. However, the flush-to-zero mode suffers from a higher loss in accuracy since the mantissa is replaced with zero or some other predefined minimum value.
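The accuracy cost of the two treatments can be illustrated with exact rational arithmetic. The sketch below assumes the single-precision format, in which the smallest positive normalized value is 2^−126 and denormalized values are spaced 2^−149 apart, and assumes round-to-nearest-even. The names `AMIN`, `DENORM_STEP`, `denormalize_and_round`, and `flush_to_zero` are hypothetical and chosen for this example only.

```python
from fractions import Fraction

AMIN = Fraction(1, 2**126)         # smallest positive normalized single-precision value
DENORM_STEP = Fraction(1, 2**149)  # spacing between single-precision denormalized values

def denormalize_and_round(x):
    """Round a value with |x| < AMIN to the nearest multiple of DENORM_STEP.
    Python's round() on a Fraction rounds ties to even."""
    if abs(x) >= AMIN:
        return x  # normalized range; rounding there is not modeled in this sketch
    return round(x / DENORM_STEP) * DENORM_STEP

def flush_to_zero(x):
    """Replace any value with |x| < AMIN by zero."""
    return Fraction(0) if abs(x) < AMIN else x

# Two normalized operands whose exact product falls below AMIN.
a = Fraction(7, 4) / 2**74          # 1.75 * 2^-74
b = Fraction(1, 2**74)              # 1.0  * 2^-74
product = a * b                     # 1.75 * 2^-148 = 3.5 * DENORM_STEP

for name, approx in [("denormalize and round", denormalize_and_round(product)),
                     ("flush to zero", flush_to_zero(product))]:
    rel_err = abs(approx - product) / product
    print(f"{name}: relative error = {rel_err}")
```

For this intermediate product, denormalization and rounding leave a relative error of 1/7, while flushing to zero discards the value entirely (relative error of 1).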
Accordingly, for Madd operations, techniques that increase the accuracy of the output result, simplify the post-processing of the intermediate result, and reduce the overall processing time are highly desirable.