Digital signal processors (DSPs) often implement Fast Fourier Transforms (FFTs). An FFT must calculate sums and differences, and each result is scaled down by a divider circuit that divides by 2 or 4. This division can often be optimized by using shift operations: a divide by 2 is a right shift by one bit, and a divide by 4 is a right shift by two bits.
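As a minimal illustration of the shift-based scaling described above, the following Python sketch computes a scaled sum and difference. The function name `scaled_butterfly` and its parameters are hypothetical, and the sketch assumes signed integer operands with an arithmetic right shift, which rounds toward negative infinity rather than toward zero.

```python
def scaled_butterfly(a, b, shift=1):
    """Return the sum and difference of a and b, each scaled by 2**-shift.

    Assumption: a and b are signed integers. Python's >> on integers is an
    arithmetic shift, so negative results round toward negative infinity.
    """
    if shift not in (1, 2):
        raise ValueError("shift must be 1 (divide by 2) or 2 (divide by 4)")
    return (a + b) >> shift, (a - b) >> shift


print(scaled_butterfly(12, 4, 1))  # (8, 4)
print(scaled_butterfly(12, 4, 2))  # (4, 2)
print(scaled_butterfly(3, 8, 1))   # (5, -3): -5 >> 1 floors to -3
```

The last call shows that simply discarding the shifted-out bits introduces a bias in the result, which is why the approaches discussed next include a round-off step.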
Because FFTs are heavily used in some DSP applications, it would be advantageous to optimize the divide by 2 or 4 of these sums and differences.
One solution to this problem is to use two sets of adders together with a rounding logic network to perform the addition and the round-off. The disadvantage of this method is that it requires more adder cells than a standard arithmetic logic unit (ALU). The additional adder cells hold the less significant bits until the final round-off addition.
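The arithmetic behind such an add-then-round scheme can be sketched as follows. This is only a behavioral model, not the circuit itself. The name `add_round_shift` is hypothetical, and the round-half-up rule is an assumption, since the rounding mode is not specified here.

```python
def add_round_shift(a, b, shift):
    """Add a and b, round off, then scale the result by 2**-shift.

    Assumption: round-half-up, modeled by adding half of one output LSB
    before the low-order bits are dropped.
    """
    full = a + b                   # first addition: full-precision sum
    half = 1 << (shift - 1)        # half of one output LSB
    return (full + half) >> shift  # second addition (round-off), then drop low bits


print(add_round_shift(3, 4, 1))  # 4: 3.5 rounds up
print(add_round_shift(3, 2, 2))  # 1: 1.25 rounds down
print(add_round_shift(3, 3, 2))  # 2: 1.5 rounds up
```

The model performs two additions per result. The low-order bits of `full` must be kept until the second addition completes, which corresponds to the extra adder cells mentioned above.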
In another approach, used in a high-speed floating-point design, the two operands of the sum or difference are added, and the result is rounded before the intermediate output value is sent to a shifter that normalizes the final result. One problem with this approach is that the adder array must have an equal number of bit cells for each bit and must guard against early overflow. This requires more gates to implement and also slows the carry chain.
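One reading of "early overflow" is that the intermediate sum can exceed the adder width before the scaling shift brings it back into range. The following sketch illustrates that reading with plain two's-complement integers rather than floating-point mantissas. The helper names `wrap_signed` and `add_then_halve` and the 16-bit operand width are assumptions chosen for illustration.

```python
def wrap_signed(value, bits):
    """Wrap value into a signed two's-complement range of the given width."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):  # sign bit set: interpret as negative
        value -= 1 << bits
    return value


def add_then_halve(a, b, adder_bits):
    """Add in an adder of adder_bits width, then divide by 2 with a shift."""
    total = wrap_signed(a + b, adder_bits)  # intermediate sum may overflow here
    return total >> 1


# 16-bit operands: the true scaled result 20000 fits in 16 bits, but the
# intermediate sum 40000 does not.
print(add_then_halve(30000, 10000, 16))  # -12768: overflow before the shift
print(add_then_halve(30000, 10000, 17))  # 20000: one guard bit avoids it
```

In this model a single extra bit is enough for a divide by 2, but it lengthens the carry chain by one cell, which is consistent with the gate and timing costs noted above.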