High-speed computation with floating point numbers is a critical design factor in many systems such as computers, signal processors and process controllers. Floating point representations of numbers are normally used in these systems because of their large dynamic range. Advanced technology makes it possible to fabricate an integrated circuit which can multiply floating point numbers using highly parallel techniques to improve speed.
Parallel array multipliers generate all partial products simultaneously and then add the partial products in an array of adders. The array of adders reduces the partial products to two numbers, often referred to as the sum stream and the carry stream. The sum and carry streams are then combined in a final adder to produce the product. The final addition requires about the same period of time as the addition of the partial products, because a carry generated at a low order bit may propagate to a much higher order bit, forming what is called a carry chain. Thus, a pipeline register is often inserted between the adder array and the final adder.
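The reduction to a sum stream and a carry stream can be illustrated in software. The following is a minimal sketch, assuming unsigned integer operands and a simple sequential ordering of 3:2 carry-save additions rather than any particular hardware array; the function names (partial_products, carry_save_add, reduce_to_sum_and_carry, multiply) are hypothetical and chosen only for illustration.

```python
def partial_products(a, b, n):
    """Return the n partial products of a * b, one per bit of the n-bit multiplier b."""
    return [(a << i) if (b >> i) & 1 else 0 for i in range(n)]


def carry_save_add(x, y, z):
    """Bitwise 3:2 compression: returns (s, c) with x + y + z == s + c.

    No carry travels along the word; each carry bit is simply
    written one position to the left in the carry stream.
    """
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def reduce_to_sum_and_carry(terms):
    """Reduce a list of addends to a sum stream and a carry stream."""
    terms = list(terms)
    while len(terms) > 2:
        s, c = carry_save_add(terms[0], terms[1], terms[2])
        terms = terms[3:] + [s, c]
    while len(terms) < 2:
        terms.append(0)
    return terms[0], terms[1]


def multiply(a, b, n):
    """Multiply a by the n-bit value b using carry-save reduction."""
    s, c = reduce_to_sum_and_carry(partial_products(a, b, n))
    # Final adder: the only step in which a carry may ripple across the word.
    return s + c
```

For example, multiply(13, 11, 4) returns 143. Only the last line of multiply performs a carry-propagating addition, which corresponds to the final adder described above.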
Some multipliers use a signed digit redundant number representation to take advantage of a tree approach to add the partial products in a parallel fashion while maintaining an iterative structure that increases circuit density and ease of layout. Signed digit representation uses two bits at each bit position to represent a 1, 0, or -1. Signed digit adders avoid long carry chains and the delays associated therewith. The signed digit adder array reduces the partial products to a single signed digit number. However, since the signed digit representation is not a common format, the result must be converted to a conventional representation such as a binary magnitude representation. The conversion circuit is very similar to the final adder in the parallel array approach. Signed digit addition is explained in greater detail in Takagi, et al., "High Speed VLSI Multiplication Algorithm with a Redundant Binary Addition Tree," IEEE Transactions on Computers, Vol. C-34, No. 9, September, 1985.
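Signed digit addition and the subsequent conversion can be sketched as follows. This is an illustrative model only, assuming each digit is held as a Python integer -1, 0 or 1 (least significant digit first) rather than as the two-bit hardware encoding, and using one commonly described carry-free addition rule; it is not asserted to be the exact rule of the cited reference. The names sd_add and sd_to_int are hypothetical.

```python
def sd_add(x, y):
    """Add two signed digit numbers (lists of -1/0/1, least significant first).

    Each result digit depends only on digit positions i and i-1 and on
    the sign information of position i-2, so no long carry chain forms.
    """
    n = max(len(x), len(y))
    x = x + [0] * (n - len(x))
    y = y + [0] * (n - len(y))
    interim = []
    transfer = [0]  # transfer[i] is the transfer digit entering position i
    for i in range(n):
        p = x[i] + y[i]
        # If both lower digits are non-negative, the incoming transfer is 0 or 1;
        # otherwise it is 0 or -1. The interim digit is chosen to absorb it.
        lower_nonneg = i == 0 or (x[i - 1] >= 0 and y[i - 1] >= 0)
        if p == 2:
            t, w = 1, 0
        elif p == 1:
            t, w = (1, -1) if lower_nonneg else (0, 1)
        elif p == 0:
            t, w = 0, 0
        elif p == -1:
            t, w = (0, -1) if lower_nonneg else (-1, 1)
        else:  # p == -2
            t, w = -1, 0
        interim.append(w)
        transfer.append(t)
    interim.append(0)
    return [interim[i] + transfer[i] for i in range(n + 1)]


def sd_to_int(digits):
    """Convert a signed digit number to a conventional integer.

    The positive and negative digits form two binary words; their
    difference requires a carry (borrow) propagating operation, which
    is why the conversion resembles a final adder.
    """
    pos = sum(1 << i for i, d in enumerate(digits) if d == 1)
    neg = sum(1 << i for i, d in enumerate(digits) if d == -1)
    return pos - neg
```

For example, sd_add([1, 1, 1], [1, 0, 0]) adds 7 and 1 and returns [0, 0, 0, 1], which sd_to_int converts to 8. The addition itself is local to neighboring digit positions; only sd_to_int involves a full-width subtraction.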
Further, in floating point multiplication, the product must be "normalized" such that the most significant bit is a "1". If the operand mantissas are N bits long, the resulting product mantissa is at most 2N bits in length. To fit the original N-bit floating point format, the product is normalized and rounded. If the original mantissas are normalized, the normalization shift will be at most one bit. The rounding, however, may result in a carry propagating through the entire N-bit number.
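The normalization and rounding of a 2N-bit product can be sketched as follows. This is a minimal illustration under stated assumptions: the operand mantissas are normalized N-bit integers (most significant bit set), rounding is round-half-up on the first discarded bit (the rounding mode is an assumption, as none is specified above), and the function name normalize_and_round and its return convention are hypothetical.

```python
def normalize_and_round(product, n):
    """Normalize and round the 2n-bit product of two normalized n-bit mantissas.

    Returns (mantissa, exponent_adjust), where mantissa is an n-bit value
    with its most significant bit set and exponent_adjust is the amount
    to add to the sum of the operand exponents.
    """
    assert n >= 2
    assert (1 << (2 * n - 2)) <= product < (1 << (2 * n))

    # Normalization: the leading 1 is at bit 2n-1 or bit 2n-2,
    # so the shift differs by at most one bit position.
    if product >> (2 * n - 1):
        shift, exp_adjust = n, 1
    else:
        shift, exp_adjust = n - 1, 0

    mantissa = product >> shift
    round_bit = (product >> (shift - 1)) & 1  # first discarded bit

    # Rounding: this increment may carry through all n bits.
    mantissa += round_bit
    if mantissa >> n:  # carry out of the top bit, e.g. 1111 + 1 = 10000
        mantissa >>= 1
        exp_adjust += 1
    return mantissa, exp_adjust
```

With N = 4, the mantissas 9 and 14 give the product 126 (binary 1111110). The kept bits are 1111 and the round bit is 1, so the increment carries through every bit and normalize_and_round(126, 4) returns (8, 1). This is the full-length rounding carry described above.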
Thus, the multiplier must convert, normalize and round the final product. Typically, the conversion is performed first, since in signed digit representation it requires a long period of time to determine, prior to conversion, whether the leading bit of the equivalent magnitude number is a "0" or a "1". Next, normalization is performed, since the normalization shift determines the bit position at which rounding occurs. Since both the conversion and the rounding may entail long carry chains, performing these steps in sequence significantly reduces the speed of the multiplication.
Therefore, a need has arisen for a method and apparatus for converting, normalizing and rounding a sum of partial products at a high speed.