Mathematical transforms, such as Fourier transforms, are an important tool for scientific study, signal processing systems, and communication systems. For example, a communication technique now commonly used and referred to as orthogonal frequency division multiplexing (OFDM) employs a discrete Fourier transform (DFT) operation as one of its key steps. DFT converts a finite sequence of numbers into a finite set of sinusoidal components, and OFDM utilizes DFT in demodulation to separate a plurality of orthogonal signals transmitted on different sub-carriers. OFDM is utilized in a variety of communication systems such as wireless local area networks (e.g., the IEEE Standard 802.11a/g), wireless metropolitan area networks (e.g., IEEE Standard 802.16, also known as “WiMAX”), power line communication (PLC) systems, home networking systems (e.g., the Multimedia over Coax Alliance (MoCA) specification), optical communication systems, digital audio broadcast systems (e.g., EUREKA 147, Digital Radio Mondiale, HD Radio, T-DMB and ISDB-TSB), cellular communication systems, digital television broadcast systems (e.g., DVB-T, DVB-H, T-DMB and ISDB-T), etc.
A class of algorithms for efficiently calculating a DFT is referred to as fast Fourier transform (FFT) algorithms. FFT algorithms may be implemented in either hardware or software. For real-time and/or high throughput implementations, an FFT is often implemented using hardware. Different types of FFT algorithms lead to different hardware architectures. One broad class of architecture that may be used to implement an FFT algorithm is a pipelined architecture.
FIG. 1 is a block diagram of a prior art pipelined FFT calculator 100 for implementing a particular FFT algorithm referred to as a radix-2² decimation in frequency (DIF) algorithm. In particular, the FFT calculator 100 is for calculating a 1024-point FFT (i.e., 1024 inputs are converted to 1024 outputs). The FFT calculator 100 includes ten butterfly calculation units 104, 108, 112, 116, 120, 124, 128, 132, 136 and 140. The FFT calculator 100 also includes four multiplier units 144, 148, 152 and 156. The FFT calculator 100 will be further described with reference to FIG. 2.
FIG. 2 is a flow diagram of a 4-point radix-2² DIF algorithm 170. In general, an N-point radix-2² DIF algorithm comprises log₂N stages (hereinafter referred to as FFT stages). Thus, in FIG. 2, the 4-point radix-2² DIF algorithm includes two FFT stages. Each FFT stage includes a plurality of operations that are referred to as butterfly operations. A butterfly operation includes adding two input values to generate a first output value, and subtracting the same two input values to generate a second output value. Thus, the first stage of the algorithm 170 includes a first butterfly operation involving inputs x(n) and x(n+2) and a second butterfly operation involving inputs x(n+1) and x(n+3).
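As a minimal illustration of the butterfly operation described above, the following Python sketch applies the two first-stage butterflies of the 4-point flow to a list of four values. The function names and the ordering of the returned list (both sums first, then both differences) are assumptions made for illustration and are not taken from FIG. 2.

```python
def butterfly(a, b):
    """Return the sum and the difference of two input values."""
    return a + b, a - b


def first_stage_4pt(x):
    """Apply the first-stage butterflies to x(n), x(n+1), x(n+2), x(n+3).

    The pairing follows the text: (x(n), x(n+2)) and (x(n+1), x(n+3)).
    The output ordering (sums, then differences) is an assumption.
    """
    sum0, diff0 = butterfly(x[0], x[2])
    sum1, diff1 = butterfly(x[1], x[3])
    return [sum0, sum1, diff0, diff1]
```

For example, the inputs 1, 2, 3, 4 yield the sums 4 and 6 and the differences −2 and −2.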
In an FFT stage, each output of a butterfly operation is multiplied by a complex-valued parameter, but in some cases the parameter may merely be a real value of one or a value of −j. Thus, when implementing a radix-2² DIF algorithm in hardware, it may be possible to omit a complex-value multiplier in some of the FFT stages. For example, in the first FFT stage of the algorithm 170, three of the outputs are multiplied by real-valued one and the remaining output is multiplied by −j. Thus, hardware for implementing the first FFT stage may omit a complex-value multiplier. Rather, the multiply by −j operation may be implemented using logic to swap the real and imaginary components of an output and to change the sign of the new imaginary component.
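The swap-and-negate logic can be checked with a short Python sketch. The function name is hypothetical; the arithmetic follows from (a + jb)(−j) = b − ja, so no multiplier is involved.

```python
def multiply_by_neg_j(z):
    """Multiply z by -j without a multiplier.

    The real and imaginary components are swapped, and the sign of the
    new imaginary component is changed: (a + jb) * (-j) = b - ja.
    """
    return complex(z.imag, -z.real)
```

For example, applying the function to 3 + 4j gives 4 − 3j, which matches the product (3 + 4j)(−j).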
On the other hand, in the second FFT stage of the algorithm 170, three of the outputs are multiplied by complex-valued parameters W1, W2 and W3 (often referred to as “twiddle factors”). Thus, a complex multiplier is needed for the second FFT stage.
An FFT stage may potentially include a butterfly calculation (BF) stage and a multiplier stage. A BF stage calculates butterfly calculations for an FFT stage. If an FFT stage only requires a multiply by −j, such an FFT stage may omit a multiplier stage and rather include logic to implement the multiply by −j such as the logic described above. Thus, in the algorithm 170, the first FFT stage includes a BF stage and omits a multiplier stage, whereas the second FFT stage includes both a BF stage and a multiplier stage.
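The following Python sketch illustrates how two BF stages and a multiply by −j combine, under the assumption of a standalone 4-point transform in which every twiddle factor of the second FFT stage equals one (so no complex multiplier is needed). The function names are hypothetical, and the output ordering X(0), X(2), X(1), X(3) is the bit-reversed order commonly produced by DIF algorithms, assumed here rather than taken from FIG. 2.

```python
import cmath


def four_point_block(x):
    """Two butterfly stages with a multiply by -j between them."""
    # First BF stage: butterflies on (x0, x2) and (x1, x3).
    a0, b0 = x[0] + x[2], x[0] - x[2]
    a1, b1 = x[1] + x[3], x[1] - x[3]
    # Multiply one first-stage output by -j (swap components, negate).
    b1 = complex(b1.imag, -b1.real)
    # Second BF stage: butterflies on (a0, a1) and (b0, b1).
    return [a0 + a1, a0 - a1, b0 + b1, b0 - b1]  # X(0), X(2), X(1), X(3)


def naive_dft(x):
    """Direct O(N^2) DFT, used only as a reference for comparison."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]
```

Comparing `four_point_block(x)` against `naive_dft(x)` reordered as indices 0, 2, 1, 3 is one way to confirm that the two stages reproduce a 4-point DFT.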
As discussed above, FIG. 2 corresponds to a 4-point radix-2² DIF algorithm and includes two FFT stages. In general, an N-point radix-2² DIF algorithm, where N is a power of two, will have log₂N FFT stages. For example, a 1024-point radix-2² DIF algorithm will have ten FFT stages (first, second, . . . , tenth). Similarly, an N-point radix-2² DIF algorithm, where N is a power of two, will have log₂N BF stages. For example, a 1024-point radix-2² DIF algorithm will have ten BF stages (first, second, . . . , tenth).
Referring again to FIG. 1, each of the ten butterfly calculation units 104, 108, 112, 116, 120, 124, 128, 132, 136 and 140 corresponds to an FFT stage of the 1024-point radix-2² DIF algorithm. More specifically, each of the ten butterfly calculation units 104, 108, 112, 116, 120, 124, 128, 132, 136 and 140 corresponds to a BF stage. Each of the five butterfly calculation units 104, 112, 120, 128 and 136 corresponds to a BF stage similar to the first BF stage of FIG. 2 in that they include logic to implement a multiply by −j, whereas each of the five butterfly calculation units 108, 116, 124, 132 and 140 corresponds to a BF stage similar to the second BF stage of FIG. 2 in that they do not include such logic. Similarly, each of the four multiplier units 144, 148, 152 and 156 corresponds to a multiplier stage of the 1024-point radix-2² DIF algorithm.
Each of the butterfly calculation units 104, 108, 112, 116, 120, 124, 128, 132, 136 and 140 includes a respective memory 160, a respective butterfly calculator 162, and a respective controller 164. The memory 160 is for storing inputs to the butterfly calculation unit so that the inputs can be used for later calculations. Referring to FIG. 2, inputs to a butterfly calculation unit may be received in sequence in the following order: x(n), x(n+1), x(n+2), x(n+3), where x is an input value and n is an index. Thus, in order to calculate the stage output 174 (i.e., x(n)−x(n+2)), the value x(n) must be stored until the value x(n+2) is received.
One of ordinary skill in the art will understand that a first FFT stage of an N-point radix-2² DIF algorithm, where N is a power of 2, will require at least N/2 memory locations, and each subsequent stage will require ½ the memory locations of the previous stage. Referring again to FIG. 1, the memories 160 may include 512, 256, 128, 64, 32, 16, 8, 4, 2 and 1 memory locations for the butterfly calculation units 104, 108, 112, 116, 120, 124, 128, 132, 136 and 140, respectively.
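The memory sizing rule stated above (N/2 locations for the first stage, halving at each subsequent stage) can be expressed as a short Python sketch. The function name is hypothetical.

```python
def bf_memory_sizes(n):
    """Return the number of memory locations for each BF stage.

    Follows the rule in the text: the first stage needs n/2 locations
    and each subsequent stage needs half as many as the previous one.
    """
    if n < 2 or (n & (n - 1)) != 0:
        raise ValueError("n must be a power of two, at least 2")
    stages = n.bit_length() - 1  # log2(n) BF stages
    return [n >> (stage + 1) for stage in range(stages)]
```

For N = 1024 the function returns ten sizes, 512 down to 1, in agreement with the memories 160 of FIG. 1.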
Each butterfly calculator 162 performs additions and subtractions for the butterfly calculation unit. Each controller 164 generally controls the butterfly calculation unit to generate outputs for the stage. For example, the controller 164 may configure the butterfly calculator 162 to perform either an addition or a subtraction. Additionally, the controller 164 may route appropriate values from the memory 160 to the butterfly calculator 162. Further, the controller 164 may change the sign of an imaginary component of an output when appropriate.
Each of the multiplier units 144, 148, 152, and 156 corresponds to an FFT stage of the 1024-point radix-2² DIF algorithm that requires a complex multiplier. More specifically, each of the multiplier units 144, 148, 152, and 156 corresponds to a multiplier stage, such as the multiplier stage of FIG. 2. Each of the multiplier units 144, 148, 152, and 156 includes a respective memory 166 and a respective complex multiplier 168. The memory 166 is for storing complex-valued “twiddle factors”. One of ordinary skill in the art will understand that a first multiplier stage of an N-point radix-2² DIF algorithm, where N is a power of 2, will be configured to multiply using approximately N/8 different twiddle factors, and each subsequent multiplier stage will use approximately ¼ the number of twiddle factors of the previous multiplier stage. Thus, the memories 166 may include 128, 32, 8 and 2 memory locations for the multiplier units 144, 148, 152 and 156, respectively.
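The twiddle factor memory sizes can be sketched in Python in the same manner. The rule of N/8 locations followed by division by four is taken from the text. The number of multiplier stages, (log₂N)/2 − 1, is an assumption inferred from FIG. 1 (ten BF stages, four multiplier units), and the sketch is restricted to N being a power of four. The function name is hypothetical.

```python
def twiddle_memory_sizes(n):
    """Return approximate twiddle memory sizes for each multiplier stage.

    Assumes n is a power of four and that there are log2(n)/2 - 1
    multiplier stages (an inference from FIG. 1, not stated in the text).
    """
    log2_n = n.bit_length() - 1
    if n < 4 or (n & (n - 1)) != 0 or log2_n % 2 != 0:
        raise ValueError("n must be a power of four, at least 4")
    multiplier_stages = log2_n // 2 - 1
    # First stage uses about n/8 factors; each later stage uses 1/4 as many.
    return [(n // 8) >> (2 * stage) for stage in range(multiplier_stages)]
```

For N = 1024 the function returns 128, 32, 8 and 2, in agreement with the memories 166 of the multiplier units 144, 148, 152 and 156.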
In operation, input values (i.e., x(n), x(n+1), x(n+2), . . . ) are provided sequentially to the butterfly calculation unit 104, which stores 512 input values in its memory 160. Once 512 input values have been stored, the butterfly calculation unit 104 begins calculating outputs that correspond to outputs of the first stage of the 1024-point FFT algorithm, which is similar to the first FFT stage in FIG. 2. These outputs are sequentially provided to the butterfly calculation unit 108, which stores 256 of these values in its memory 160. Once 256 values have been stored, the butterfly calculation unit 108 begins calculating outputs that correspond to outputs of the second BF stage of the 1024-point FFT algorithm, which is similar to the second BF stage in FIG. 2. These outputs are sequentially provided to the multiplier unit 144, which multiplies each output of the butterfly calculation unit 108 by a corresponding twiddle factor value stored in its memory 166. Outputs of the multiplier unit 144 are provided to the butterfly calculation unit 112. In a similar manner, the remainder of the butterfly calculation units and multiplier units operate to calculate the other stages of the FFT algorithm. Eventually, output values are generated by the butterfly calculation unit 140.
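The sequential operation described above can be modeled in software with chained Python generators. The sketch below is a behavioral illustration only, not a description of the circuitry of FIG. 1. It assumes that N is a power of four, that each butterfly unit is organized as a first-in first-out delay memory, that the multiply by −j sits between the two BF stages of each pair, and that twiddle factors are indexed as in a conventional radix-2² decomposition. All function names are hypothetical, and twiddle factors are computed on the fly rather than read from a memory. Outputs emerge in bit-reversed index order.

```python
import cmath
from collections import deque


def butterfly_unit(stream, depth):
    """Model a butterfly unit whose memory holds `depth` values."""
    fifo = deque()
    for i, value in enumerate(stream):
        if (i // depth) % 2 == 0:
            # First half of a block: store the input, release a held difference.
            if len(fifo) == depth:
                yield fifo.popleft()
            fifo.append(value)
        else:
            # Second half: combine the input with the stored value.
            stored = fifo.popleft()
            fifo.append(stored - value)  # difference is held for later output
            yield stored + value         # sum is output immediately
    while fifo:                          # flush differences of the last block
        yield fifo.popleft()


def times_neg_j(stream, block):
    """Multiply the last quarter of each block by -j (swap and negate)."""
    for i, v in enumerate(stream):
        yield complex(v.imag, -v.real) if i % block >= 3 * block // 4 else v


def multiplier_unit(stream, block):
    """Multiply each value by its twiddle factor (assumed indexing)."""
    quarter = block // 4
    for i, v in enumerate(stream):
        pos = i % block
        exponent = (pos % quarter) * (0, 2, 1, 3)[pos // quarter]
        yield v * cmath.exp(-2j * cmath.pi * exponent / block)


def pipelined_fft(samples):
    """Chain the units; returns outputs in bit-reversed index order."""
    n = len(samples)
    if n < 4 or (n & (n - 1)) != 0 or n.bit_length() % 2 == 0:
        raise ValueError("length must be a power of four")
    stream, block = iter(samples), n
    while block >= 4:
        stream = butterfly_unit(stream, block // 2)
        stream = times_neg_j(stream, block)
        stream = butterfly_unit(stream, block // 4)
        if block > 4:  # the final pair of BF stages needs no multiplier
            stream = multiplier_unit(stream, block)
        block //= 4
    return list(stream)
```

For a 1024-sample input this chain contains ten butterfly units with memory depths 512 down to 1 and four multiplier units, mirroring the arrangement of FIG. 1.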