Fourier transforms (“FT”) decompose linear signals or waveforms into sinusoids identified by frequency. The sum of the various frequency sinusoids equals the original waveform. One type of FT is the Discrete Fourier Transform (“DFT”). DFTs decompose original signals into discrete sample values. The samples are identified by discrete frequency and amplitude values. Digital devices require discrete sample values and thus implement DFTs. One type of DFT is the Fast Fourier Transform (“FFT”). FFTs reduce DFT computation time by employing a “divide and conquer approach.” The “divide and conquer” approach divides data points into a plurality of FFT subsets. Two examples of FFT “divide and conquer” approaches are decimation in time (“DIT”) and decimation in frequency (“DIF”).
When implementing an FFT “divide and conquer” approach, the processor performs the FFT calculation on each FFT subset into which the data points have been separated. Because each FFT subset contains fewer data points than the original data set, the total number of calculations performed is reduced. For example, because the number of calculations in a direct DFT grows with the square of the number of data points, halving the data points means that each FFT subset requires only one quarter of the calculations of the full DFT implementation.
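By way of illustration only, the saving described above can be checked with simple arithmetic. The sketch below assumes a cost model in which a direct DFT of n points requires n * n complex multiplications; the function name direct_dft_multiplies is hypothetical and is not taken from any cited implementation.

```python
def direct_dft_multiplies(n: int) -> int:
    """Complex multiplications in a direct n-point DFT (assumed cost model: n * n)."""
    return n * n


n = 1024
full = direct_dft_multiplies(n)        # one DFT over all of the data points
half = direct_dft_multiplies(n // 2)   # one DFT over half of the data points
print(half / full)                     # 0.25: each half costs one quarter
print(2 * half / full)                 # 0.5: both halves together cost one half
```

The count ignores the additional work needed to combine the two half-size results, so it indicates only the order of the saving at a single level of division.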
The internal architecture of Digital Signal Processors (“DSPs”) often determines which “divide and conquer” approach is used. For example, with DSPs such as the Motorola DSP56001 and DSP96002, the DIT approach is used because the DIT approach is computed in fewer instruction cycles than the DIF approach. While the present invention is described with regard to the DIT approach, it should be understood that the present invention is equally compatible with other “divide and conquer” approaches, such as the DIF approach.
One type of FFT is the radix-2 FFT. The radix-2 FFT recursively divides data points. A binary shift right method is used to divide the data points at each output stage of the radix-2 FFT butterfly. By halving the output stage data points, the radix-2 FFT reduces the original number of data points by a factor of two. Consequently, the computational complexity of the radix-2 FFT is reduced to the order of N log2 N, where N is the number of data points and log2 N is the base two log of N. By reducing the number of data points, the output is accordingly reduced, thereby preventing overflow errors. Overflow occurs when a computed value exceeds the range that the allotted storage can represent. To prevent overflow in an eight (8) point radix-2 butterfly FFT implementation, typically either the inputs are limited to one eighth of the available dynamic range, or an output division method is utilized whereby a division by two is performed at each stage of the output.
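By way of illustration only, the recursive halving performed by a radix-2 FFT may be sketched in floating point as follows. The sketch assumes the DIT approach and an input length that is a power of two; the function name fft_radix2_dit is hypothetical, and no overflow scaling is modeled.

```python
import cmath


def fft_radix2_dit(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2_dit(x[0::2])   # half-size FFT of the even-indexed samples
    odd = fft_radix2_dit(x[1::2])    # half-size FFT of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)    # rotation factor for output k
        out[k] = even[k] + w * odd[k]            # sum output of the butterfly
        out[k + n // 2] = even[k] - w * odd[k]   # difference output of the butterfly
    return out
```

For an eight point input the recursion passes through three levels (8, 4 and 2 points), which corresponds to the three butterfly stages of an 8 point radix-2 FFT.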
Another type of FFT is the radix-4 FFT. While the radix-2 FFT reduces the original number of data points by a factor of 2, the radix-4 FFT reduces the number of data points by a factor of 4. The radix-2 FFT calculates a separate FFT for each data point subset. The radix-4 FFT divides the original data set into four subsets, and then calculates a separate FFT for each of the four subsets. Another FFT algorithm is the split-radix FFT.
The split-radix FFT is a hybrid combining features from the radix-2 and the radix-4 FFT approaches. Similar to the radix-4 FFT, the butterfly of the split-radix has four inputs. Similar to the radix-2 FFT implementation, the butterfly of the split-radix divides the output into two subsets. One subset calculates the radix-2 FFT, while the other subset calculates the radix-4 FFT. By mixing the two radixes, the split-radix FFT enables an implementation which minimizes the multiplication calculations.
FFT implementations are commonly performed using a butterfly block system. While the butterfly block system differs slightly for the DIF and DIT approaches, the present invention will be described with reference to a DIT approach. However, as mentioned above, the present invention is equally compatible with the DIF butterfly block system. The FFT, with its recursive nature, is calculated by iteratively applying a conventional butterfly system block. Each iteration is called a stage.
The structure of the prior art radix-2 FFT butterfly block system can be seen in FIG. 1. The radix-2 FFT butterfly outputs, C and D, can be up to twice the dynamic range of the inputs, A and B. The C and D outputs are defined by:

D = A − e^(−jθ)*B  (eq. 1)
C = A + e^(−jθ)*B  (eq. 2)

where θ is the rotation angle.
The inputs, A and B, of the radix-2 FFT butterfly system block can be two complex or real numbers. With fixed-point radix-2 FFT implementations, the inputs, A and B, will be integers that utilize a fixed storage size such as 16 or 32 bits per integer. The outputs, C and D, are two corresponding complex or real numbers which can use the same fixed-point representation. With fixed-point complex number implementation, the outputs are:

Real(D) = Real(A) − ((Real(B)*Real(Ep) − Imag(B)*Imag(Ep)) >> N);  (eq. 3)
Imag(D) = Imag(A) − ((Real(B)*Imag(Ep) + Imag(B)*Real(Ep)) >> N);  (eq. 4)
Real(C) = Real(A) + ((Real(B)*Real(Ep) − Imag(B)*Imag(Ep)) >> N);  (eq. 5)
Imag(C) = Imag(A) + ((Real(B)*Imag(Ep) + Imag(B)*Real(Ep)) >> N);  (eq. 6)

wherein A and B represent inputs; C and D represent outputs; “Imag” represents the imaginary part of the complex number; “Real” represents the real part of the complex number; “>>N” represents the binary operation “shift right by N bits,” equivalent to multiplication by 2^(−N); and Ep represents the fixed-point representation of the rotation factor e^(−jθ) derived from the formula:

Ep = Round(e^(−jθ) * 2^N)  (eq. 7)

wherein the Round function rounds the solution to the nearest integer.

FFT implementations utilize floating-point, integer, and fixed-point processors. A floating-point number has both a mantissa and an exponent and accordingly requires a subset of bits for each. Floating-point processors typically maintain precision to the size of the mantissa and therefore have greater dynamic range than other types of processors. Floating-point processors can represent such a large range of values that overflow errors are avoided.
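By way of illustration only, the fixed-point butterfly of eq. 3 through eq. 7 may be sketched as follows. The function names twiddle_fixed and butterfly_fixed, the representation of a complex number as an integer (real, imag) pair, and the parameter names theta and n_bits are assumptions made for this sketch. Python integers do not overflow, so the sketch shows the arithmetic only and not the storage limits of a fixed-point processor.

```python
import cmath


def twiddle_fixed(theta: float, n_bits: int):
    """eq. 7: Ep = Round(e^(-j*theta) * 2^N), returned as an integer (real, imag) pair.

    Python's round() sends exact halves to the nearest even integer, which may
    differ from the rounding rule of a particular processor.
    """
    w = cmath.exp(-1j * theta)
    scale = 1 << n_bits
    return round(w.real * scale), round(w.imag * scale)


def butterfly_fixed(a, b, ep, n_bits: int):
    """Radix-2 butterfly on integer (real, imag) pairs, following eq. 3 to eq. 6."""
    # Complex product B * Ep, shifted right by N bits to remove the 2^N scale.
    # In Python, >> on a negative integer rounds toward minus infinity.
    prod_re = (b[0] * ep[0] - b[1] * ep[1]) >> n_bits
    prod_im = (b[0] * ep[1] + b[1] * ep[0]) >> n_bits
    c = (a[0] + prod_re, a[1] + prod_im)   # eq. 5 and eq. 6
    d = (a[0] - prod_re, a[1] - prod_im)   # eq. 3 and eq. 4
    return c, d


ep = twiddle_fixed(0.0, 15)                             # rotation factor of 1
print(butterfly_fixed((1000, 0), (1000, 0), ep, 15))    # ((2000, 0), (0, 0))
```

In the usage line the C output is twice as large as either input, which is the growth by a factor of two that the butterfly can produce.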
Unlike floating-point processors, however, fixed-point and integer processors do not have the same overflow capacity, and fixed-point processors sometimes encounter overflow problems. Even with the overflow problems, some applications require fixed-point and integer processors, which are advantageous in terms of size, speed and cost for certain applications. The data handled by integer or fixed-point processors often must be scaled to avoid overflow problems. However, scaling takes time, results in further reductions of precision and/or dynamic range, and detrimentally affects a system's signal-to-noise ratio.
To understand overflow errors one can look at the radix-N butterfly processor. In a radix-N butterfly processor, every output is the sum of the N inputs, each multiplied by the applicable rotation factor (e^(−jθ)). The rotation factor, e^(−jθ), has a unity amplitude. Potentially, therefore, the outputs can be up to N times larger than the inputs. If the processor reserves the same amount of memory for both input and output, capacity overflow can occur. Thus, to avoid these overflow errors, the radix-N processor is scaled by dividing the output at each butterfly stage by N. In a radix-2 FFT, for example, the outputs at each butterfly stage are divided by two. In an 8 point radix-2 FFT, in stage 1, four butterflies will be performed on the 8 inputs, thereby creating 8 outputs. Then in stage 2, the radix-2 FFT groups the 8 outputs into two sets of four. In stage 2, two butterflies will be performed on each set of four. This staged structure implements the N log2 N computation, and it also creates a potential overflow problem, because the outputs can grow at every stage. Another prior art FFT performs the butterflies before each stage of butterflies. In this case, without any additional scaling, the size of the outputs can be up to twice as large as the inputs. To prevent overflow, prior art designs performed a division by two.
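By way of illustration only, the growth of the outputs from stage to stage, and the effect of dividing every butterfly output by two, may be sketched as follows. The sketch assumes 16-bit signed storage, rotation factors held with 14 fractional bits, and an accumulator with at least one extra bit so that the range check can be made after the division; the names fixed_fft, fits_int16 and scale_each_stage are hypothetical. Python integers do not wrap, so an overflow is flagged and not reproduced.

```python
import cmath

INT16_MIN, INT16_MAX = -32768, 32767
Q = 14  # assumed number of fractional bits in the fixed-point rotation factors


def fits_int16(v: int) -> bool:
    return INT16_MIN <= v <= INT16_MAX


def fixed_fft(re, im, scale_each_stage: bool):
    """Radix-2 DIT FFT on integer lists; returns (re, im, overflowed)."""
    n = len(re)                                   # must be a power of two
    bits = n.bit_length() - 1
    # Reorder the inputs by bit-reversed index, as the DIT approach requires.
    order = [int(format(i, f"0{bits}b")[::-1], 2) for i in range(n)]
    re = [re[i] for i in order]
    im = [im[i] for i in order]
    overflowed = False
    size = 2
    while size <= n:                              # one pass of this loop per stage
        half = size // 2
        for start in range(0, n, size):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / size)
                wr, wi = round(w.real * (1 << Q)), round(w.imag * (1 << Q))
                i, j = start + k, start + k + half
                t_re = (re[j] * wr - im[j] * wi) >> Q
                t_im = (re[j] * wi + im[j] * wr) >> Q
                outs = [re[i] + t_re, im[i] + t_im, re[i] - t_re, im[i] - t_im]
                if scale_each_stage:
                    outs = [v >> 1 for v in outs]  # divide every output by two
                overflowed = overflowed or not all(fits_int16(v) for v in outs)
                re[i], im[i], re[j], im[j] = outs
        size *= 2
    return re, im, overflowed


re, im, bad = fixed_fft([16000] * 8, [0] * 8, scale_each_stage=False)
print(re[0], bad)   # 128000 True: eight times the input, beyond 16 bits
re, im, bad = fixed_fft([16000] * 8, [0] * 8, scale_each_stage=True)
print(re[0], bad)   # 16000 False: the result is held in range, divided by 8 overall
```

With a constant input, the largest output grows by a factor of two at each of the three stages when no scaling is applied. With a division by two at each stage the stored values stay within 16 bits, at the cost of an overall division by eight.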
The overflow prevention methods are found in FFT communication systems that use Discrete Multi-Tone (“DMT”) modulation, a form of multi-carrier modulation. DMT modems typically use an FFT system structure and are well known in the art. See, for example, John A. C. Bingham, Multicarrier Modulation for Data Transmission: An Idea Whose Time Has Come, IEEE COMMUNICATIONS, May 1990, at 5–14, which is herein incorporated by reference in its entirety, and J. Cioffi, A Multicarrier Primer, T1E1.4/91-157, ANSI T1E1.4, which is incorporated herein by reference in its entirety. As described in the ANSI standards documents, the line code for Asymmetric Digital Subscriber Line (“ADSL”) modems is DMT, as discussed in Network and Customer Installation Interfaces - Asymmetric Digital Subscriber Line (ADSL) Metallic Interface, ANSI T1.413 (1998), which is incorporated herein by reference in its entirety. Moreover, methods for fixed-point calculation of FFTs are also well known in the art, as discussed in P. Duhamel and M. Vetterli, Fast Fourier transform: A tutorial review and a state of the art, IEEE SIGNAL PROCESSING, vol. 19, no. 4, April 1990, at 259–299, and Guy R. L. Sohie, Implementation of Fast Fourier Transforms on Motorola's DSP56000/DSP56001 and DSP96002 Digital Signal Processors, Motorola Inc. (1991), which are incorporated herein by reference in their entirety.
In sum, to prevent overflows in the radix-N fixed-point FFT calculation, prior art processors constrain the range of either the inputs or the outputs before storing them in memory. As discussed, in a radix-2 FFT implementation it is common to divide each butterfly output by two. With each butterfly output divided by two, there is a loss of one bit of dynamic range. Alternatively, if the outputs are not constrained, the inputs can be reduced. This other approach limits the input range before each butterfly stage to 1/N of the available range by using guard, or unused, input bits. In either case, prior art methods to avoid overflow errors have the disadvantage that they limit the processor's dynamic range.
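By way of illustration only, the input-limiting alternative may be expressed as a number of guard bits. The sketch below assumes a signed word, a power-of-two number of data points, and a limit of 1/N of full scale applied to the inputs; the function name max_safe_input and its parameters are hypothetical.

```python
def max_safe_input(n_points: int, word_bits: int = 16) -> int:
    """Largest input magnitude when inputs are limited to 1/n_points of full scale."""
    guard_bits = n_points.bit_length() - 1     # log2(n_points) for a power of two
    full_scale = (1 << (word_bits - 1)) - 1    # 32767 for a 16-bit signed word
    return full_scale >> guard_bits


print(max_safe_input(8))   # 4095, about one eighth of 32767
```

For an 8 point FFT in 16-bit storage, three of the bits are reserved as guard bits, which is the reduction in dynamic range referred to above.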