Many digital data processors, including most DSPs and multimedia processors, use binary fixed-point arithmetic, in which operations are performed on integers, fractions, or mixed numbers in unsigned or two's complement binary format. DSP and multimedia applications often require that the processor be configurable to perform both saturating arithmetic and wrap-around arithmetic on numbers in a given binary format. In saturating arithmetic, a computation result that is too large in magnitude to be represented in a specified number format produces an overflow condition and is saturated to the most positive or most negative number representable in that format. In wrap-around arithmetic, results that overflow are wrapped around, such that any high-order bits that cannot fit into the specified number representation are simply discarded.
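The difference between the two overflow behaviors can be illustrated with a minimal sketch. The function names `wrap_add` and `sat_add`, and the 8-bit default width, are assumptions chosen only for illustration and do not appear in the description above.

```python
def wrap_add(a, b, bits=8):
    """Two's complement addition that discards bits beyond `bits` (wrap-around)."""
    mask = (1 << bits) - 1
    s = (a + b) & mask                     # keep only the low `bits` bits
    # Reinterpret the retained bit pattern as a signed value.
    return s - (1 << bits) if s & (1 << (bits - 1)) else s


def sat_add(a, b, bits=8):
    """Addition that clamps to the most positive or most negative value."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, a + b))


print(wrap_add(100, 100))   # 200 does not fit in 8 bits and wraps to -56
print(sat_add(100, 100))    # the same sum saturates to 127
```

With wrap-around arithmetic a large positive sum can appear as a negative number, whereas saturation keeps the result at the nearest representable extreme.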
FIG. 1 illustrates three example binary number formats that are commonly used in digital processors.
The first format shown is a mixed-number format, which includes one sign bit, g guard bits, and f fraction bits. The guard bits are additional integer bits that are used to reduce the likelihood of overflow in intermediate calculations. The binary point is between the guard bits and fraction bits. A typical 40-bit mixed-number format for representing an operand includes one sign bit, eight guard bits and 31 fraction bits.
The second format is a fractional format, which includes one sign bit and f fraction bits, but no guard bits. The binary point in this particular format is between the sign bit and the fraction bits. A typical 32-bit fraction format for an operand includes one sign bit and 31 fraction bits.
The third format is a sign-extended fractional format, which includes g extend bits, one sign bit, and f fraction bits. The binary point is between the sign bit and the fraction bits, and the extend bits are identical to the sign bit. This format thus allows a saturated result in fractional format to be sign-extended so that the result has the same number of bits as the mixed-number format.
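As a rough illustration of how the three formats relate, the following sketch interprets bit patterns under each format. The helper names are hypothetical, and the small widths g=3 and f=4 used in the usage lines are chosen only to keep the bit patterns readable.

```python
def to_signed(word, bits):
    """Interpret a `bits`-wide bit pattern as a two's complement integer."""
    return word - (1 << bits) if word & (1 << (bits - 1)) else word


def mixed_value(word, g, f):
    """Value of a mixed-number word: 1 sign bit, g guard bits, f fraction bits."""
    return to_signed(word, 1 + g + f) / (1 << f)


def fraction_value(word, f):
    """Value of a fractional word: 1 sign bit, f fraction bits."""
    return to_signed(word, 1 + f) / (1 << f)


def sign_extend_fraction(word, g, f):
    """Widen a (1+f)-bit fraction to 1+g+f bits by copying the sign bit
    into the g extend bits."""
    sign = (word >> f) & 1
    extend = (((1 << g) - 1) << (1 + f)) if sign else 0
    return extend | word


frac = 0b11000                                   # -0.5 as a 5-bit fraction
wide = sign_extend_fraction(frac, g=3, f=4)      # 8-bit sign-extended form
print(fraction_value(frac, 4), format(wide, "08b"), mixed_value(wide, 3, 4))
```

Because the extend bits copy the sign bit, a sign-extended fractional word read as a mixed-number word has the same numeric value, which is what allows both to share one datapath width.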
In a variety of applications, it is useful to perform operations on operands in a mixed-number format, fractional format or sign-extended fractional format, and produce saturated results that are in a mixed-number format, fractional format or sign-extended fractional format.
It is also useful to have a single adder or other arithmetic unit that can take inputs and produce results in the mixed-number format, fractional format or sign-extended fractional format.
There are a number of techniques known in the art for performing overflow detection and saturation with two's complement addition. For example, when input and result operands all use the same format, overflow is often detected by examining the sign bits of the input and result operands. If the input operands have the same sign and the sign of the result is different, then overflow has occurred and the result should be saturated; otherwise overflow is guaranteed not to have occurred. Another method for detecting this same condition is to examine the carries into and out of the sign bit. If the carry into the sign bit differs from the carry out of the sign bit, then overflow has occurred and the result should be saturated; otherwise overflow is guaranteed not to have occurred. Although these techniques work well when the input and result operands use the same format, they generally cannot be used when the input and result operands have different formats.
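The two same-format detection methods described above can be sketched as follows. This is an illustrative software model, not a description of any particular circuit; the function name and the 8-bit default width are assumptions.

```python
def add_with_overflow(a, b, bits=8):
    """Add two `bits`-wide two's complement words.

    Returns (sum_word, overflow_by_signs, overflow_by_carries).
    """
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    total = a + b
    s = total & mask

    # Method 1: the inputs share a sign and the result's sign differs.
    ovf_signs = bool(~(a ^ b) & (a ^ s) & sign)

    # Method 2: the carry into the sign bit differs from the carry out of it.
    carry_in = ((a & (sign - 1)) + (b & (sign - 1))) >> (bits - 1)
    carry_out = total >> bits
    ovf_carries = carry_in != carry_out

    return s, ovf_signs, ovf_carries


print(add_with_overflow(100, 100))    # both methods report overflow
print(add_with_overflow(100, -100))   # neither method reports overflow
```

Both flags always agree, since they are two views of the same condition. Neither flag says anything about whether the sum fits a narrower result format, which is the limitation noted above.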
A straightforward mechanism for performing two's complement saturating addition when the input operands are in mixed-number format and the result operands are in fractional format or sign-extended fractional format involves first producing a result in the mixed-number format. That result and the sign bits of the input operands are then examined to determine if overflow has occurred and if the final result should be saturated. This can be accomplished by having one circuit that detects if overflow occurs in the mixed-number format and a second circuit that detects if the mixed-number result cannot be exactly represented in the fractional format. The first circuit can detect overflow by examining the sign bits of the input operands and the sign bit of the result, as described above. The second circuit can detect that the result does not fit the fractional format by comparing the sign bit of the result with the guard bits of the result: if the sign bit differs from any of these guard bits, then overflow has occurred. Although this approach correctly detects overflow, it has the disadvantage that the guard bits of the result must be computed before it can be determined if overflow has occurred.
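The conventional two-check approach can be modeled in software roughly as shown below. The function name and the default widths are hypothetical. The rule used to pick the saturation direction (input sign when the mixed-number sum itself overflows, result sign otherwise) is an assumption made for this sketch, since the text above does not spell it out.

```python
def sat_add_mixed_to_fraction(a, b, g=3, f=4):
    """Add two mixed-number words and return a sign-extended fractional word.

    Models the conventional approach: the full mixed-number sum, including
    its guard bits, is computed before overflow can be decided.
    """
    bits = 1 + g + f
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    s = (a + b) & mask                       # full mixed-number sum

    sa, sb, ss = a >> (bits - 1), b >> (bits - 1), s >> (bits - 1)

    # First check: overflow of the mixed-number format itself.
    ovf_mixed = (sa == sb) and (ss != sa)

    # Second check: any guard bit of the result differs from its sign bit.
    guards = (s >> f) & ((1 << g) - 1)
    ovf_fraction = guards != (((1 << g) - 1) if ss else 0)

    if ovf_mixed or ovf_fraction:
        # Assumed direction rule; see the note above.
        negative = sa if ovf_mixed else ss
        most_negative = mask & ~((1 << f) - 1)   # extend and sign bits set
        most_positive = (1 << f) - 1             # all fraction bits set
        return most_negative if negative else most_positive
    return s


print(format(sat_add_mixed_to_fraction(0b00001000, 0b00001100), "08b"))
```

The second check reads the guard bits of `s`, so in hardware it cannot start until the adder has produced those bits. That dependency is the drawback identified above.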
FIG. 2 shows a number of examples of two's complement addition for a case in which the input operands are in mixed-number format, and the final result is in sign-extended fractional format, with g=3 and f=4. The left side of each equation gives the two's complement value and the right side gives the decimal value.
In the first example, two positive numbers, 0.50 and 0.75, are added together. Since their sum, 1.25, cannot be exactly represented as a sign-extended fractional number, positive overflow occurs and the sum is saturated to the most positive number in the specified output format, which in this case is 0.9375.
In the second example, two negative numbers, −0.25 and −2.00, are added together. Since their sum, −2.25, cannot be exactly represented as a sign-extended fractional number, negative overflow occurs and the sum is saturated to the most negative number in the specified format, which in this case is −1.00.
In the third example, a positive number, 1.50, and a negative number, −2.00, are added together. Since their sum, −0.50, can be represented as a sign-extended fractional number, overflow does not occur and the final sum is not saturated.
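The three examples can be checked numerically with a short value-level sketch. The helper names are hypothetical, and the printed layout is only an approximation of the kind of rows described for FIG. 2, not a reproduction of the figure.

```python
G, F = 3, 4                 # guard/extend bits and fraction bits, as in FIG. 2
BITS = 1 + G + F


def encode(x):
    """Encode a decimal value as a BITS-wide two's complement word."""
    return int(round(x * (1 << F))) & ((1 << BITS) - 1)


def decode(word):
    """Decode a BITS-wide two's complement word back to a decimal value."""
    if word & (1 << (BITS - 1)):
        word -= 1 << BITS
    return word / (1 << F)


def saturating_sum(x, y):
    """Add two mixed-number values and clamp to the fractional range."""
    exact = decode(encode(x)) + decode(encode(y))
    lo, hi = -1.0, 1.0 - 2.0 ** -F          # -1.00 and 0.9375 when F = 4
    return min(hi, max(lo, exact))


for x, y in [(0.50, 0.75), (-0.25, -2.00), (1.50, -2.00)]:
    r = saturating_sum(x, y)
    print(f"{encode(x):0{BITS}b} + {encode(y):0{BITS}b} -> "
          f"{encode(r):0{BITS}b}    {x} + {y} -> {r}")
```

The three sums come out as 0.9375, −1.0 and −0.5, matching the saturated and unsaturated results described above.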
As indicated above, conventional techniques for performing two's complement addition operations of the type shown in FIG. 2 are problematic in that those techniques either require the input operands to be in the same format as the result, or require computation of guard bits before overflow detection can begin.
Accordingly, a need exists for an improved arithmetic unit that is capable of performing addition or other operations in a digital data processor without the drawbacks of the above-described conventional techniques.