Conventional FFT circuits employ a number of basic computation elements, known as butterflies, and examples of conventional butterflies 100 and 110 can be seen in FIGS. 1 and 2. As shown, butterfly 100 is a radix-2 (two input) butterfly (which generally employs summing circuits 104-1 and 104-2 and complex multiplier 102), and butterfly 110 is a radix-4 (four input) butterfly (which generally employs summing circuits 112-1 to 112-8 and complex multipliers 114-1 to 114-3). With the FFT computations, the number of butterfly operations is generally:
$$\frac{N}{r}\log_r N, \tag{1}$$
where N is the number of points in the sequence and r is the radix number (e.g., r = 2 for radix-2). So, the throughput is generally limited by the number of butterfly operations executed in parallel.
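As a quick numeric illustration of equation (1), the following minimal Python sketch tabulates the butterfly count for a few transform sizes. The function names `butterfly_count` and `cycles_needed` are hypothetical, and the cycle model (each parallel butterfly unit completes one butterfly per cycle, with inter-stage data dependencies ignored) is an assumption made only for illustration.

```python
import math

def butterfly_count(n_points, radix):
    """Butterflies in an n_points FFT of the given radix: (N/r) * log_r(N)."""
    stages, size = 0, 1
    while size < n_points:          # count stages, i.e. log_r(N)
        size *= radix
        stages += 1
    if size != n_points:
        raise ValueError("n_points must be a power of radix")
    return (n_points // radix) * stages

def cycles_needed(n_points, radix, parallel_units):
    """Hypothetical lower bound: one butterfly per unit per cycle,
    ignoring dependencies between stages."""
    return math.ceil(butterfly_count(n_points, radix) / parallel_units)

print(butterfly_count(8, 2))      # 12  (4 butterflies x 3 stages)
print(butterfly_count(64, 2))     # 192 (32 butterflies x 6 stages)
print(butterfly_count(64, 4))     # 48  (16 butterflies x 3 stages)
print(cycles_needed(64, 2, 8))    # 24
```

Under this simplified model, doubling the number of parallel units halves the cycle bound, which is the sense in which throughput tracks the number of butterflies executed in parallel.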
To perform these FFT computations, several types of architectures may be employed, examples of which can be seen in FIGS. 3 and 4. In FIG. 3, a flow graph for an 8-point decimation-in-frequency (DIF) FFT is shown. Here, a variable geometry architecture (usually accomplished through the use of multiplexers), which uses a Cooley-Tukey algorithm, can be seen. The flow graph in FIG. 3 is frequently implemented in pipelined FFT circuits. Namely, these pipelined FFT circuits employ $\log_r N$ datapaths to compute one row of the flow graph, with memory elements at each stage to store the butterfly outputs and ensure that they enter the next stage in the correct order. Typically, in high throughput designs, multiple pipelines are employed to increase the speed of computation. In FIG. 4, another flow graph for an 8-point DIF FFT is shown. In this example, the geometry is constant. This is beneficial in that the geometry can be realized with fixed wires or traces, avoiding the overhead of multiplexers. Each of these different architectures, though, has drawbacks (e.g., high switching overhead or a high number of non-trivial complex multiplications).
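The constant geometry idea can be sketched in software as follows. This is a minimal radix-2 DIF sketch in which every stage reads the pair (i, i + N/2) and writes to (2i, 2i + 1), so the interconnection is identical from stage to stage and only the twiddle factors change. The indexing and twiddle schedule are one common formulation chosen for illustration and are assumptions; they are not necessarily the exact wiring of FIG. 4. The function names are hypothetical.

```python
import cmath

def dft(x):
    """Direct O(N^2) DFT, used only as a reference for checking."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

def bit_reverse(k, bits):
    """Reverse the lowest `bits` bits of k."""
    return int(format(k, "0{}b".format(bits))[::-1], 2)

def constant_geometry_dif_fft(x):
    """Radix-2 DIF FFT in which every stage uses the same data routing."""
    N = len(x)
    if N < 2 or N & (N - 1):
        raise ValueError("length must be a power of two, at least 2")
    stages = N.bit_length() - 1
    half = N // 2
    data = list(x)
    for s in range(stages):
        nxt = [0j] * N
        for i in range(half):
            top, bottom = data[i], data[i + half]     # fixed input wiring
            exponent = (i >> s) << s                  # only this varies per stage
            w = cmath.exp(-2j * cmath.pi * exponent / N)
            nxt[2 * i] = top + bottom                 # fixed output wiring
            nxt[2 * i + 1] = (top - bottom) * w
        data = nxt
    # Results come out in bit-reversed order; undo that for the caller.
    return [data[bit_reverse(k, stages)] for k in range(N)]

x = [complex(n, (3 * n) % 5) for n in range(8)]
err = max(abs(a - b) for a, b in zip(constant_geometry_dif_fft(x), dft(x)))
print(err < 1e-9)
```

A variable geometry version would instead change the pairing distance at each stage (N/2, N/4, and so on), which in hardware corresponds to the multiplexers mentioned above.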
There is also another type of algorithm (known as split radix) that has some advantages; namely, a split radix algorithm has fewer non-trivial complex multiplications than radix-2 and radix-4 algorithms. For example, for a 64-point FFT, radix-2, radix-4, and split-radix algorithms involve 98, 76, and 72 non-trivial complex multiplications, respectively. In typical FFT architectures, the actual number of complex multiplications performed in radix-2 and radix-4 is even higher because multiplying by i (i.e., $\sqrt{-1}$) should also be counted. Generally, the split radix algorithm successively decomposes an N-point Discrete Fourier Transform (DFT) into an $\frac{N}{2}$-point DFT and two $\frac{N}{4}$-point DFTs as follows:
$$X[2k] = \sum_{n=0}^{\frac{N}{2}-1} \left( x[n] + x\left[n + \frac{N}{2}\right] \right) W_N^{2nk}; \tag{2}$$
$$X[4k+1] = \sum_{n=0}^{\frac{N}{4}-1} \left[ \left( x[n] - x\left[n + \frac{N}{2}\right] \right) - j\left( x\left[n + \frac{N}{4}\right] - x\left[n + \frac{3N}{4}\right] \right) \right] W_N^{n}\, W_N^{4nk}; \text{ and} \tag{3}$$
$$X[4k+3] = \sum_{n=0}^{\frac{N}{4}-1} \left[ \left( x[n] - x\left[n + \frac{N}{2}\right] \right) + j\left( x\left[n + \frac{N}{4}\right] - x\left[n + \frac{3N}{4}\right] \right) \right] W_N^{3n}\, W_N^{4nk}, \tag{4}$$
where
$$W_N^{k} = e^{-\frac{2\pi k i}{N}}. \tag{5}$$
To realize the split radix algorithm in hardware, though, “L-shaped” butterflies (as shown, for example, in FIG. 5) are traditionally employed. The shape of these “L-shaped” butterflies, though, results in irregular scheduling due mainly to uneven latency between datapaths as shown in FIGS. 6 and 7 (which are flow graphs depicting a 16-point variable geometry architecture).
Thus, the conventional split radix architecture and algorithm are ill-suited for high throughput applications.
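The decomposition in equations (2) through (4) can be checked numerically with a short recursive sketch. The code below is a plain software illustration of the algorithm, assuming a power-of-two length; it does not model the “L-shaped” butterfly hardware or its scheduling, and the function names are hypothetical.

```python
import cmath

def dft(x):
    """Direct O(N^2) DFT, used only as a reference for checking."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

def twiddle(N, k):
    """W_N^k = exp(-2*pi*i*k/N), as in equation (5)."""
    return cmath.exp(-2j * cmath.pi * k / N)

def split_radix_fft(x):
    """Recursive DIF split-radix FFT following equations (2)-(4)."""
    N = len(x)
    if N < 1 or N & (N - 1):
        raise ValueError("length must be a power of two")
    if N == 1:
        return [x[0]]
    if N == 2:
        return [x[0] + x[1], x[0] - x[1]]
    half, quarter = N // 2, N // 4
    # Equation (2): inputs of the N/2-point DFT that yields X[2k].
    even = [x[n] + x[n + half] for n in range(half)]
    odd1, odd3 = [], []
    for n in range(quarter):
        a = x[n] - x[n + half]
        b = x[n + quarter] - x[n + 3 * quarter]
        odd1.append((a - 1j * b) * twiddle(N, n))      # equation (3)
        odd3.append((a + 1j * b) * twiddle(N, 3 * n))  # equation (4)
    X = [0j] * N
    X[0::2] = split_radix_fft(even)   # X[2k]
    X[1::4] = split_radix_fft(odd1)   # X[4k+1]
    X[3::4] = split_radix_fft(odd3)   # X[4k+3]
    return X

x = [complex((3 * n) % 7, (5 * n) % 4) for n in range(16)]
err = max(abs(a - b) for a, b in zip(split_radix_fft(x), dft(x)))
print(err < 1e-9)
```

Note that the even branch recurses on N/2 points while the two odd branches recurse on N/4 points each; this unequal depth is the software counterpart of the uneven datapath latency described above.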
Therefore, there is a need for an improved FFT architecture and algorithm.
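The multiplication counts quoted above for the 64-point case (98, 76, and 72) can be reproduced by enumerating the twiddle factors of each algorithm. The sketch below assumes that a multiplication is “trivial” when the twiddle factor is 1, -1, i, or -i, and that twiddles are applied as in a standard DIF formulation; both are assumptions about the counting convention, and the function names are hypothetical.

```python
def is_trivial(exponent, M):
    """True when W_M^exponent is one of 1, -i, -1, i (a multiple of a quarter turn)."""
    return (4 * exponent) % M == 0

def count_radix2(N):
    """Non-trivial twiddle multiplications in a radix-2 DIF FFT (N a power of 2)."""
    total, M = 0, N
    while M >= 2:
        per_block = sum(not is_trivial(n, M) for n in range(M // 2))
        total += (N // M) * per_block   # N/M sub-transforms of size M at this stage
        M //= 2
    return total

def count_radix4(N):
    """Non-trivial twiddle multiplications in a radix-4 DIF FFT (N a power of 4)."""
    total, M = 0, N
    while M >= 4:
        per_block = sum(not is_trivial(n * q, M)
                        for n in range(M // 4) for q in (1, 2, 3))
        total += (N // M) * per_block
        M //= 4
    return total

def count_split_radix(N):
    """Non-trivial twiddle multiplications in a split-radix FFT (N a power of 2)."""
    if N < 4:
        return 0
    # Each level applies W_N^n and W_N^(3n) for n = 0 .. N/4 - 1.
    here = sum(not is_trivial(e, N) for n in range(N // 4) for e in (n, 3 * n))
    return here + count_split_radix(N // 2) + 2 * count_split_radix(N // 4)

print(count_radix2(64), count_radix4(64), count_split_radix(64))  # 98 76 72
```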
Some examples of conventional systems are: Lin, et al., “A 1-GS/s FFT/IFFT processor for UWB applications,” IEEE Journal of Solid-State Circuits, vol. 40, no. 8, pp. 1726-1735, August 2005; Tang, et al., “A 2.4-GS/s FFT Processor for OFDM-Based WPAN Applications,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 57, no. 6, pp. 451-455, June 2010; Cho, et al., “A high-speed low-complexity modified radix-2^5 FFT processor for gigabit WPAN applications,” in IEEE International Symposium on Circuits and Systems, May 2011, pp. 1259-1262; Huang et al., “A green FFT processor with 2.5-GS/s for IEEE 802.15.3c (WPANs),” in 2010 International Conference on Green Circuits and Systems, June 2010, pp. 9-13; M. C. Pease, “An adaptation of the Fast Fourier Transform for parallel processing,” Journal of the ACM, vol. 15, pp. 252-264, April 1968; Duhamel et al., “‘Split Radix’ FFT Algorithm,” Electronics Letters, vol. 20, no. 1, pp. 14-16, Jan. 5, 1984; M. Corinthios, “The design of a class of Fast Fourier Transform computers,” IEEE Transactions on Computers, vol. C-20, no. 6, pp. 617-623, June 1971; Argüello et al., “Constant geometry split-radix algorithms,” Journal of VLSI Signal Processing, 1995; Sorensen et al., “Real-valued fast Fourier transform algorithms,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 35, no. 6, pp. 849-863, June 1987; and R. Matusiak, “Implementing Fast Fourier Transform algorithms of real-valued sequences with the TMS320 DSP platform,” August 2001.