1. Field
This disclosure relates generally to techniques for performing discrete Fourier transforms and, more specifically, to techniques for performing discrete Fourier transforms on radix-2 platforms.
2. Related Art
An electrical signal may be represented in the time-domain (as a variable that changes with time) or may be represented in the frequency-domain (as energy at specific frequencies). In the time-domain, a sampled digital signal includes a series of data points that correspond to an original physical parameter, e.g., light, sound, temperature, and velocity. In the frequency-domain, a sampled digital signal is represented as discrete frequency components, e.g., sinusoidal waves. A sampled digital signal may be transformed from the time-domain to the frequency-domain using a discrete Fourier transform (DFT). Conversely, a sampled digital signal may be transformed from the frequency-domain to the time-domain using an inverse DFT (IDFT).
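For reference, the transform pair described above may be written as follows. This is a sketch under one common convention (negative exponent in the forward transform, 1/N scaling in the inverse); the symbols x[n], X[k], and N are assumptions introduced here for illustration and do not appear in the original text.

```latex
% Forward DFT: N time-domain samples x[n] -> N frequency-domain components X[k]
X[k] = \sum_{n=0}^{N-1} x[n] \, e^{-j 2\pi k n / N}, \qquad k = 0, 1, \dots, N-1

% Inverse DFT (IDFT): frequency-domain components X[k] -> time-domain samples x[n]
x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k] \, e^{+j 2\pi k n / N}, \qquad n = 0, 1, \dots, N-1
```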
As is well known, a DFT is a digital signal processing transformation that is employed in various applications. DFTs and IDFTs facilitate signal processing in the frequency-domain, which can provide efficient computation of convolutions (which is, for example, useful in linear filtering) and signal correlation analysis. As the direct computation of an N-point DFT requires on the order of N² complex multiplications, the direct computation of a DFT is typically not employed in real-time applications. Various fast Fourier transform (FFT) algorithms have been created to perform real-time tasks, such as digital filtering, audio processing, and spectral analysis for speech recognition. In general, FFT algorithms reduce the computational burden such that DFT approaches may be effectively employed for real-time signal processing.
The computational burden associated with a DFT is a measure of the number of calculations required by a DFT algorithm. A typical DFT algorithm starts with a number of input data points and computes a number of output data points. The DFT function is a sum of products, i.e., multiplications to form product terms followed by addition of the product terms to accumulate a sum of products (multiply-accumulate (MAC) operations). The direct computation of a DFT requires a number of MAC operations that grows with the square of the number of input data points (i.e., the size of the DFT). Moreover, multiplications by twiddle factors (the complex exponential coefficients of the transform) tend to greatly increase computational workload. To reduce the computational burden imposed by the computationally intensive DFT, researchers have developed various FFT algorithms in which the number of required mathematical operations is reduced. In one class of FFT algorithms, the computational burden is reduced based on a divide-and-conquer approach. In this class of FFT algorithms, input data are divided into subsets for which the DFT is computed to form partial DFTs. Using this approach, either a decimation-in-frequency (DIF) or a decimation-in-time (DIT) approach is employed to divide (decimate) larger calculation tasks into smaller calculation subtasks.
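The sum-of-products structure of a direct DFT can be sketched as follows. This is a minimal illustration, not an implementation taken from the disclosure; the function name `direct_dft` and the MAC counter are hypothetical, and the forward-transform sign convention (negative exponent) is an assumption.

```python
import cmath


def direct_dft(x):
    """Directly compute an N-point DFT as a sum of products.

    Returns (X, mac_count), where mac_count is the number of complex
    multiply-accumulate operations performed. Hypothetical helper for
    illustration only.
    """
    n = len(x)
    out = []
    mac_count = 0
    for k in range(n):          # one output data point per frequency index k
        acc = 0j
        for t in range(n):      # one MAC per input data point
            twiddle = cmath.exp(-2j * cmath.pi * k * t / n)
            acc += x[t] * twiddle
            mac_count += 1
        out.append(acc)
    return out, mac_count


if __name__ == "__main__":
    for size in (8, 16, 32):
        _, macs = direct_dft([1.0] * size)
        print(size, macs)       # MAC count grows as size squared
```

Under these assumptions, doubling the number of input data points quadruples the number of MAC operations, which is the growth that FFT algorithms are intended to avoid.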
For example, an N-point DFT can be divided into N/2 2-point partial DFTs. The basic 2-point partial DFT is calculated in a computational element known as a radix-2 DIT butterfly or a radix-2 DIF butterfly. A radix-2 butterfly has two inputs and two outputs, and computes a 2-point DFT. As one example, an 8-point DFT may be computed using twelve radix-2 butterflies (or butterfly computing elements). As is well known, butterfly computing elements are generally arranged in stages. That is, input data are fed to inputs of butterfly computing elements in one stage, which provides results to inputs of a next stage of butterfly computing elements. To compute an 8-point DFT on a radix-2 platform, four radix-2 butterflies operate in parallel in a first stage to compute four 2-point partial DFTs (eight outputs in total). The eight outputs of the first stage are combined in two additional stages to provide a complete 8-point DFT output. Specifically, a second stage of four radix-2 butterflies and a third stage of four radix-2 butterflies comprise a two-stage combination phase in which eight radix-2 butterflies (responsive to the eight first-stage outputs) form a final 8-point DFT output.
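The butterfly element and the butterfly counts quoted above can be sketched as follows. The names `radix2_dit_butterfly` and `radix2_butterfly_count` are hypothetical, and the DIT form (twiddle factor applied to the second input before the add and subtract) is assumed for illustration.

```python
def radix2_dit_butterfly(a, b, w=1 + 0j):
    """Radix-2 DIT butterfly: two inputs, two outputs.

    With twiddle factor w == 1 this is exactly a 2-point DFT:
    (a + b, a - b). Hypothetical helper for illustration only.
    """
    t = w * b
    return a + t, a - t


def radix2_butterfly_count(n):
    """Number of radix-2 butterflies in an n-point FFT, n a power of two.

    There are log2(n) stages, each containing n/2 butterflies.
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    stages = n.bit_length() - 1
    return (n // 2) * stages


if __name__ == "__main__":
    print(radix2_dit_butterfly(3, 1))    # 2-point DFT of the pair (3, 1)
    print(radix2_butterfly_count(8))     # three stages of four butterflies
    print(radix2_butterfly_count(16))    # four stages of eight butterflies
```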
As another example, a 16-point DFT may be computed using thirty-two radix-2 butterflies. In this case, there are four stages of butterfly calculations. That is, eight radix-2 butterflies operate in parallel in a first stage, where eight 2-point partial DFTs are calculated. Outputs of the first stage are combined in three additional combination stages to form the 16-point DFT output. An output of a second stage of eight radix-2 butterflies is coupled to a third stage of eight radix-2 butterflies. An output of the third stage of eight radix-2 butterflies is coupled to a fourth stage of eight radix-2 butterflies, which provides a final 16-point DFT. As is apparent from the above description, in a butterfly implementation of a DFT, butterfly calculations in different stages cannot be performed in parallel. That is, subsequent stages of butterflies cannot begin calculations until earlier stages of butterflies have completed prior calculations.
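The stage-by-stage dependency described above can be sketched with an iterative radix-2 DIT FFT. This is an illustrative sketch only: the function names are hypothetical, the bit-reversed input ordering is one common arrangement rather than one stated in the text, and each stage writes to a separate buffer so that the dependence on the previous stage's completed output is explicit.

```python
import cmath


def bit_reverse_permute(x):
    """Reorder x so that index i moves to its bit-reversed index."""
    n = len(x)
    bits = n.bit_length() - 1
    out = [0j] * n
    for i in range(n):
        r, v = 0, i
        for _ in range(bits):
            r = (r << 1) | (v & 1)
            v >>= 1
        out[r] = x[i]
    return out


def fft_radix2_dit(x):
    """Iterative radix-2 DIT FFT.

    Returns (X, butterflies_per_stage). Hypothetical helper for
    illustration only; len(x) must be a power of two.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    data = bit_reverse_permute([complex(v) for v in x])
    butterflies_per_stage = []
    span = 1                              # distance between butterfly inputs
    while span < n:
        nxt = list(data)                  # this stage's output buffer
        count = 0
        # Butterflies within one stage are independent of one another,
        # but every one of them reads the previous stage's output.
        for start in range(0, n, 2 * span):
            for j in range(span):
                w = cmath.exp(-2j * cmath.pi * j / (2 * span))
                a = data[start + j]
                t = w * data[start + j + span]
                nxt[start + j] = a + t
                nxt[start + j + span] = a - t
                count += 1
        data = nxt                        # next stage starts only now
        butterflies_per_stage.append(count)
        span *= 2
    return data, butterflies_per_stage


if __name__ == "__main__":
    _, per_stage = fft_radix2_dit(range(16))
    print(per_stage, sum(per_stage))      # butterflies per stage and in total
```

In this sketch the inner loops of a single stage could run in parallel, whereas the outer `while` loop is inherently sequential, which mirrors the constraint noted in the paragraph above.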