A common task in signal processing is to determine the frequency spectrum of a received signal, given a sampling of the signal in the time domain. In Digital Signal Processing, discrete samples of the signal are taken in time. A Discrete Fourier Transform is applied to the samples, and the frequency spectrum is calculated. The frequency spectrum is also discrete, each value indicating the contribution of a different frequency bin to the signal.
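As an illustration of the transform described above, the following is a minimal sketch of a direct Discrete Fourier Transform, assuming the usual definition X[k] = sum over n of x[n] * exp(-2*pi*i*k*n/N). The function name naive_dft and the test signal are illustrative only and do not come from the text.

```python
import cmath
import math

def naive_dft(samples):
    """Direct DFT: one output value per frequency bin.

    The two nested loops each run N times, which is where the
    N^2 cost of the direct transform comes from.
    """
    n_samples = len(samples)
    spectrum = []
    for k in range(n_samples):              # one pass per frequency bin
        total = 0j
        for n, x in enumerate(samples):     # every time sample contributes
            total += x * cmath.exp(-2j * cmath.pi * k * n / n_samples)
        spectrum.append(total)
    return spectrum

# Eight time samples of a cosine that completes exactly one cycle.
signal = [math.cos(2 * math.pi * n / 8) for n in range(8)]
magnitudes = [abs(value) for value in naive_dft(signal)]
```

For this real-valued test signal the energy appears in bin 1 and in its mirror bin 7, while the remaining bins are zero up to rounding error.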
For N time domain samples, the execution time of the Discrete Fourier Transform for N frequency domain samples is on the order of N^2. The determination of the frequency spectrum can be vastly accelerated using a Fast Fourier Transform (FFT). The execution time of the FFT is proportional to N log2(N). The proportionality factors are of the same order of magnitude, so even for a sample size as low as 1000 samples the FFT is about 100 times faster than the Discrete Fourier Transform. For this reason, the FFT is a critical tool in determining the frequency spectrum of most practical signals.
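The factor of roughly 100 quoted above can be checked with a short calculation. The sketch below assumes, as the text does, that the two proportionality factors are equal, so only the ratio of N^2 to N log2(N) matters. The function names are illustrative.

```python
import math

def dft_cost(n_samples):
    """Relative cost of the direct transform, on the order of N^2."""
    return n_samples * n_samples

def fft_cost(n_samples):
    """Relative cost of the FFT, proportional to N log2(N)."""
    return n_samples * math.log2(n_samples)

def speedup(n_samples):
    """Ratio of the two costs, assuming equal proportionality factors."""
    return dft_cost(n_samples) / fft_cost(n_samples)

ratio_1000 = speedup(1000)   # slightly above 100
ratio_1024 = speedup(1024)   # 1024 / 10 = 102.4
```

The ratio simplifies to N / log2(N), so the advantage of the FFT keeps growing as the sample size increases.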
The FFT achieves this speed advantage by effectively breaking the analysis down into a series of smaller analyses, and reconstructing the results. The FFT is faster because n analyses of two samples are faster than one analysis of 2n samples. The FFT is performed in two steps: decomposition and synthesis. (There is a third, intermediate step of frequency determination, but nothing is done at this step other than realizing that each time sample now represents a frequency sample. No calculations are carried out.) In the decomposition step, additional samples having a value of 0 are added to the signal as necessary in order that the resulting signal has N = 2^k samples, with k being a whole number. The signal of N samples is then decimated repeatedly, leaving N signals of one sample each. As part of the decimation, the samples are reordered in a particular manner.
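The decomposition step can be sketched as follows. The text does not name the reordering; this sketch assumes the bit-reversed ordering that results from repeatedly splitting a signal into its even-indexed and odd-indexed samples in a conventional radix-2 FFT. The function names are illustrative.

```python
def pad_to_power_of_two(samples):
    """Append zero-valued samples until the length is N = 2^k."""
    n = 1
    while n < len(samples):
        n *= 2
    return list(samples) + [0] * (n - len(samples))

def reverse_bits(index, bits):
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def bit_reversed_order(samples):
    """Reorder samples as repeated even/odd decimation would.

    Assumes len(samples) is already a power of two.
    """
    n = len(samples)
    bits = n.bit_length() - 1            # k, where N = 2^k
    reordered = [0] * n
    for index, value in enumerate(samples):
        reordered[reverse_bits(index, bits)] = value
    return reordered

padded = pad_to_power_of_two([1, 2, 3, 4, 5])    # length grows from 5 to 8
order = bit_reversed_order(list(range(8)))       # [0, 4, 2, 6, 1, 5, 3, 7]
```

Each entry of the reordered list can be regarded as one of the N single-sample signals left at the end of the decimation.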
In the synthesis step, the N signals of one sample (which now represent frequencies) are repeatedly combined in pairs until a single signal of N frequency samples is obtained. The resulting single signal is the frequency spectrum of the original signal. The combination of frequency signals occurs in stages. At each stage, the number of signals is halved and the number of samples in each signal is doubled, so a signal having N samples will require log2(N) stages during the synthesis step. During each stage, larger size signals (called blocks) are generated in turn. Within each block, new frequency samples are generated in turn from two frequency samples, one from each of two lower size signals, using a butterfly calculation. All frequency samples within a block are generated from frequency samples from the same two lower size blocks. After log2(N) stages, the N signals of one frequency sample will have been combined into a single signal of N frequency samples.
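The staged combination described above can be sketched as follows. This is a minimal sketch that assumes a conventional radix-2 decimation-in-time butterfly, in which each butterfly produces a pair of outputs, and an input that is already in bit-reversed order with a power-of-two length. The function name synthesize and the variable names are illustrative.

```python
import cmath

def synthesize(ordered):
    """Combine N one-sample signals into one N-sample spectrum.

    Returns the spectrum and the number of stages performed.
    """
    data = [complex(value) for value in ordered]
    n = len(data)
    block = 2          # size of the blocks produced by the current stage
    stages = 0
    while block <= n:
        half = block // 2
        # Each block is built from two lower size blocks of `half` samples.
        for start in range(0, n, block):
            for j in range(half):
                twiddle = cmath.exp(-2j * cmath.pi * j / block)
                upper = data[start + j]
                lower = data[start + j + half] * twiddle
                # Butterfly: two inputs, one from each lower size block.
                data[start + j] = upper + lower
                data[start + j + half] = upper - lower
        block *= 2
        stages += 1
    return data, stages

# The time samples 0..7, already placed in bit-reversed order.
spectrum, stage_count = synthesize([0, 4, 2, 6, 1, 5, 3, 7])
```

For the eight samples used here the loop runs three stages, matching log2(8), and the first spectrum value equals the sum of the samples, 28.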
Unfortunately, for large data sizes even the FFT may be too slow, especially for real-time applications. With the rapid increase in processor speed, multiprocessor platforms using parallel processing to achieve sufficient speed for real-time applications can in theory be replaced with less expensive single processor platforms. However, single processor platforms are still limited by the speed of peripherals, such as a Direct Memory Access (DMA) unit. The slowness arises not from the number of calculations, but from communication bottlenecks within the platform. The large number of samples cannot all be stored in internal memory, such as Internal Dynamic Random Access Memory (IDRAM), but rather must be stored in external memory, such as Synchronous Dynamic Random Access Memory (SDRAM).
A processor performing the FFT must import a limited amount of data across a bus into internal memory, perform part of the FFT to produce new data, and export the new data to external memory. Typically, the processor performs one butterfly calculation at a time to generate one new frequency sample. This requires one import and one export of data per butterfly calculation, or (N log2(N))/2 imports and exports per synthesis. Despite the speed of the processor, the memory access time will constrain the speed of the FFT for large sample sizes in which much data must be exchanged between external and internal memory.
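To give a sense of scale for the transfer count stated above, the following sketch evaluates (N log2(N))/2 for a given sample size. It assumes that N is a power of two and, as in the text, that each butterfly calculation costs one import and one export. The function name and the chosen sample size are illustrative.

```python
def butterfly_transfers(n_samples):
    """Number of imports (and of exports) per synthesis.

    Assumes n_samples is a power of two, so there are log2(N) stages
    with N/2 butterfly calculations in each stage.
    """
    stages = n_samples.bit_length() - 1      # log2(N) for a power of two
    butterflies_per_stage = n_samples // 2
    return stages * butterflies_per_stage

transfers_1k = butterfly_transfers(1024)       # 10 * 512 = 5120
transfers_1m = butterfly_transfers(2 ** 20)    # 20 * 524288 = 10485760
```

At about one million samples the synthesis already involves more than ten million imports and as many exports, which illustrates why the memory access time rather than the arithmetic tends to dominate.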