A digital signal processor (DSP) is a special-purpose CPU used for digital processing and analysis of signals from analog sources, such as sound. The analog signals are converted into digital data and analyzed using various algorithms, such as Fast Fourier Transforms. DSPs are designed to perform certain operations particularly fast, such as multiplication, multiplying and accumulating, and shifting and accumulating, because the math-intensive applications for which DSPs are used rely heavily on such operations. For this reason, a DSP will typically include special hardware circuits to perform multiplication, accumulation and shifting operations.
One popular form of DSP architecture is known as a Multiply-Accumulate or MAC processor. The MAC processor implements an architecture that takes advantage of the fact that the most common data processing operations involve multiplying two values, then adding the resulting product to another value and accumulating the result. These basic operations are efficiently carried out utilizing specially configured, high-speed multipliers and accumulators, hence the "Multiply-Accumulate" nomenclature. To increase their processing power, MAC processors have been designed to perform different processes concurrently. To this end, DSP architectures with plural MAC structures have been developed. For example, a dual MAC processor is capable of performing two independent MAC operations concurrently.
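The multiply-accumulate pattern described above can be illustrated with a minimal software sketch. The function names (mac, dot_product, dual_mac_dot_product) are hypothetical, and the dual MAC is modeled simply as two independent accumulators that are combined at the end. This is an illustration of the data flow only, not a description of any particular hardware; overflow handling is deliberately left out, and both input sequences are assumed to have the same length.

```python
def mac(acc, x, y):
    """One multiply-accumulate step: add the product x * y to acc."""
    return acc + x * y


def dot_product(xs, ys):
    """Single-MAC model: one MAC step per pair of samples, in sequence."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = mac(acc, x, y)
    return acc


def dual_mac_dot_product(xs, ys):
    """Dual-MAC model (assumption): two independent accumulators.

    Each loop iteration stands for two MAC operations that a dual MAC
    processor could perform concurrently, one on an even-indexed pair
    and one on an odd-indexed pair.
    """
    acc0 = 0
    acc1 = 0
    for i in range(0, len(xs) - 1, 2):
        acc0 = mac(acc0, xs[i], ys[i])
        acc1 = mac(acc1, xs[i + 1], ys[i + 1])
    if len(xs) % 2:
        # Odd number of samples: the last pair goes to the first unit.
        acc0 = mac(acc0, xs[-1], ys[-1])
    return acc0 + acc1
```

For example, dot_product([1, 2, 3], [4, 5, 6]) and dual_mac_dot_product([1, 2, 3], [4, 5, 6]) both accumulate 4 + 10 + 18; the dual version simply needs about half as many loop iterations.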
An addition operation in a processor, such as a digital signal processor, involves either adding or subtracting two or more numbers. These numbers may be represented in radix-2 (binary), radix-4, or any other radix. Subsequent to or in parallel with the addition operation, the result of the addition operation (here referred to as a sum) is evaluated to determine whether an overflow has occurred. If an overflow has occurred, the sum is saturated. Saturating means setting the sum to the largest-magnitude quantity, positive or negative, capable of being represented by the processor. If an overflow occurs in a negative sense, the sum is set to the largest-magnitude negative number. If an overflow occurs in a positive sense, the sum is set to the largest positive number.
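Saturation as described above can be sketched in a few lines. The 16-bit two's-complement word width and the names INT16_MAX, INT16_MIN, saturate and saturating_add are assumptions chosen for illustration; the text does not fix a word width.

```python
# Assumed word width: 16-bit two's complement (illustrative only).
INT16_MAX = 2**15 - 1   # largest positive representable value, 32767
INT16_MIN = -(2**15)    # largest-magnitude negative value, -32768


def saturate(value, lo=INT16_MIN, hi=INT16_MAX):
    """Clamp value to the representable range [lo, hi]."""
    if value > hi:
        return hi   # overflow in the positive sense
    if value < lo:
        return lo   # overflow in the negative sense
    return value


def saturating_add(a, b):
    """Add two numbers, then saturate the sum if it overflowed."""
    return saturate(a + b)
```

With these assumptions, saturating_add(30000, 10000) yields 32767 rather than 40000, saturating_add(-30000, -10000) yields -32768, and sums that fit in the range are returned unchanged.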
Bit-exact standards have been written for processor architectures that contain a single Multiply-Accumulate (MAC) unit. Such single MAC processors typically have one two-input adder, and saturate a sum following each addition operation. Multiple operands can be added in a sequential fashion in such a single MAC processor.
Faster addition can be accomplished in processors containing multiple (more than one) MAC units by simultaneously adding together multiple operands in a multiple-input adder. However, the resulting sum generated on a multiple MAC processor can be different from the sum generated on a single MAC processor. The difference results from the fact that the intermediate sums are saturated during sequential addition on a single MAC processor. Bit-exact standards that have been developed for single MAC processors cannot exploit the multiple-input adders in a multiple MAC processor unless a technique is developed that can be used to add together multiple operands on multiple MAC processors while saturating intermediate results.
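The discrepancy between the two approaches can be shown with a small numerical sketch. It assumes a 16-bit two's-complement range and uses hypothetical function names; the multiple-input adder is modeled, as an assumption, as one exact addition of all operands followed by a single saturation of the final sum.

```python
# Assumed word width: 16-bit two's complement (illustrative only).
INT16_MAX = 2**15 - 1
INT16_MIN = -(2**15)


def saturate(value):
    """Clamp value to the assumed 16-bit representable range."""
    return max(INT16_MIN, min(INT16_MAX, value))


def sequential_saturating_sum(operands):
    """Single-MAC model: two-input additions, saturating after each one."""
    acc = 0
    for x in operands:
        acc = saturate(acc + x)
    return acc


def multi_input_sum(operands):
    """Multiple-input adder model (assumption): add everything at once,
    then saturate only the final sum."""
    return saturate(sum(operands))


operands = [30000, 10000, -20000]
single_mac_result = sequential_saturating_sum(operands)
multi_mac_result = multi_input_sum(operands)
```

Here the sequential version saturates the intermediate sum 40000 to 32767 before adding -20000, giving 12767, whereas the multiple-input version never sees an out-of-range value and gives 20000. The two results are not bit exact, which is the problem the paragraph above describes.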
One way to accomplish this is disclosed in U.S. patent application Ser. No. 08/927,558, filed Sep. 8, 1997, now U.S. Pat. No. 5,889,689, and entitled "Hierarchal Carry Select, Three-Input Saturation", the disclosure of which is hereby incorporated by reference. This technique works for three-operand addition with intermediate saturation, but cannot easily be extended to multiple-operand addition. Moreover, this technique introduces additional delay into the critical path of the circuit.