A Fast Fourier Transform (FFT) is an efficient algorithm for computing the Discrete Fourier Transform (DFT) and its inverse. Let x(0), . . . , x(N−1) be N complex numbers. The DFT is defined by the formula

$$X(k) = \sum_{i=0}^{N-1} x(i)\, e^{-j 2\pi k i / N} = \sum_{i=0}^{N-1} x(i)\, W_N^{ik}, \qquad k = 0, \ldots, N-1,$$

where $W_N^k = e^{-jk(2\pi/N)}$. Evaluating this definition directly requires $O(N^2)$ operations: there are N outputs X(k), and each output requires a sum of N terms. An FFT is any method to compute the same results in O(N log N) operations.
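As a point of reference, the direct evaluation of the definition can be sketched in a few lines of Python. This is only an illustrative baseline, not part of the described method; the function name `dft_direct` is an assumption introduced here.

```python
import cmath


def dft_direct(x):
    """Evaluate X(k) = sum_i x(i) * W_N^(i*k) directly, with W_N = exp(-2*pi*j/N).

    Each of the N outputs sums N terms, so the cost is O(N^2).
    """
    n = len(x)
    if n == 0:
        return []
    w = cmath.exp(-2j * cmath.pi / n)  # W_N
    return [sum(x[i] * w ** (i * k) for i in range(n)) for k in range(n)]
```

For example, `dft_direct([1, 1, 1, 1])` yields a first output of 4 and remaining outputs that are zero up to rounding error.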
(The Cooley-Tukey Algorithm)
The Cooley-Tukey algorithm is the most common FFT algorithm. It re-expresses the DFT of an arbitrary composite size $N = N_1 N_2$ in terms of smaller DFTs of sizes $N_1$ and $N_2$, recursively, in order to reduce the computation time to O(N log N) for highly composite N (smooth numbers).
A radix-2 decimation-in-time (DIT) FFT is the simplest and most common form of the Cooley-Tukey algorithm. A radix-2 DIT divides a DFT of size N into two interleaved DFTs (hence the name “radix-2”) of size N/2 with each recursive stage. A radix-2 DIT first computes the Fourier transforms of the even-indexed numbers x(2m) (x(0),x(2), . . . ,x(N−2)) and of the odd-indexed numbers x(2m+1) (x(1),x(3), . . . ,x(N−1)), and then combines those two results to produce the Fourier transform of the whole sequence.
More explicitly, let us denote the DFT of the even-indexed numbers x(2m) by $X_E(k)$ and the DFT of the odd-indexed numbers x(2m+1) by $X_O(k)$. Then:

$$X_E(k) = \sum_{m=0}^{N/2-1} x(2m)\, W_{N/2}^{km} = \sum_{m=0}^{N/2-1} x(2m)\, W_N^{2km}$$

$$X_O(k) = \sum_{m=0}^{N/2-1} x(2m+1)\, W_{N/2}^{km} = \sum_{m=0}^{N/2-1} x(2m+1)\, W_N^{2km}$$

where $W_{N/2} = W_N^2 = e^{-j4\pi/N}$. Thus, the DFT X(k) of the original data sequence x(i) is represented by:

$$X(k) = \sum_{m=0}^{N/2-1} x(2m)\, W_N^{2km} + \sum_{m=0}^{N/2-1} x(2m+1)\, W_N^{k(2m+1)} = X_E(k) + W_N^k X_O(k).$$

The radix-2 DIT FFT is achieved by applying the above procedure to each of $X_E(k)$ and $X_O(k)$ recursively.
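The recursion above can be sketched in Python as follows. This is a minimal illustration that assumes N is a power of two; the function name `fft_dit_recursive` is introduced here for illustration only. Because the half-size DFTs are periodic in k with period N/2, and the coefficient changes sign when k is increased by N/2, each pair of outputs X(k) and X(k+N/2) is obtained from the same two half-size results.

```python
import cmath


def fft_dit_recursive(x):
    """Radix-2 decimation-in-time FFT (sketch; len(x) must be a power of two)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft_dit_recursive(x[0::2])  # X_E(k): DFT of x(0), x(2), ...
    odd = fft_dit_recursive(x[1::2])   # X_O(k): DFT of x(1), x(3), ...
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # W_N^k * X_O(k)
        out[k] = even[k] + t            # X(k)       = X_E(k) + W_N^k X_O(k)
        out[k + n // 2] = even[k] - t   # X(k + N/2) = X_E(k) - W_N^k X_O(k)
    return out
```

Each level of the recursion performs N/2 multiply-add pairs, and there are log2 N levels, which is where the O(N log N) operation count comes from.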
FIG. 1 illustrates a signal flow diagram of radix-2 DIT-FFT (N=16). As shown in FIG. 1, the DIT-FFT includes a bit-reverse operation and a plurality of butterfly operations.
The bit-reverse operation is an operation of permutating the input data sequences x(0), . . . , x(N−1). During the permutation, the input data sequences are divided into the even-indexed data sequences x(0), x(2), . . . , x(N−2) and the odd-indexed data sequences x(1), x(3), . . . , x(N−1), and then the odd-indexed data sequences are concatenated to the even-indexed data sequences. That is, after this concatenation, the concatenated data sequences x(0), x(2), . . . , x(N−2), x(1), x(3), . . . , x(N−1) are generated. Next, similar operations are recursively executed for each of the first half and the second half of the concatenated data sequences. The permutation described here corresponds to reordering the input data sequences so that a data sequence whose index is represented by $(b_{n-1}, b_{n-2}, \ldots, b_2, b_1, b_0)$ in binary representation is permutated to the position $(b_0, b_1, b_2, \ldots, b_{n-2}, b_{n-1})$. For this reason, this permutation is called the “bit-reverse” operation. FIG. 2 illustrates an example of the bit-reverse operation (N=128). For example, in FIG. 1, each pair of input data sequences such as {x(1), x(8)} ($(1)_{10} = (0,0,0,1)_2$, $(8)_{10} = (1,0,0,0)_2$), {x(2), x(4)} ($(2)_{10} = (0,0,1,0)_2$, $(4)_{10} = (0,1,0,0)_2$), {x(3), x(12)} ($(3)_{10} = (0,0,1,1)_2$, $(12)_{10} = (1,1,0,0)_2$), . . . are permutated with each other in the bit-reverse operation.
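The equivalence between the recursive even/odd split and the reversal of the index bits can be checked with a short Python sketch. The function names `bit_reverse` and `even_odd_order` are assumptions made for this illustration.

```python
def bit_reverse(index, bits):
    """Return `index` with its lowest `bits` binary digits in reversed order."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the current LSB to the output
        index >>= 1
    return result


def even_odd_order(indices):
    """Recursively place even-positioned entries first, then odd-positioned ones."""
    if len(indices) <= 2:
        return list(indices)
    return even_odd_order(indices[0::2]) + even_odd_order(indices[1::2])
```

For N=16, both `[bit_reverse(i, 4) for i in range(16)]` and `even_odd_order(list(range(16)))` give `[0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]`, which contains the exchanges 1↔8, 2↔4 and 3↔12 mentioned above.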
FIG. 3 illustrates a butterfly operation for a radix-2 DIT FFT. In the butterfly operation of FIG. 3, output data P and Q are computed using input data A and B with a predetermined coefficient W by the following formula:

$$P = A + WB, \qquad Q = A - WB,$$

where the definition of W has already been described above. As shown in FIG. 1, the given FFT includes a number of the above butterfly operations. For example, in the Stage #0 operation in FIG. 1, it follows:

$$\begin{aligned}
x^{(1)}(0) &= x(0) + W^0 x(8), & x^{(1)}(8) &= x(0) - W^0 x(8),\\
x^{(1)}(4) &= x(4) + W^0 x(12), & x^{(1)}(12) &= x(4) - W^0 x(12),\\
x^{(1)}(2) &= x(2) + W^0 x(10), & x^{(1)}(10) &= x(2) - W^0 x(10),\\
x^{(1)}(6) &= x(6) + W^0 x(14), & x^{(1)}(14) &= x(6) - W^0 x(14),\\
x^{(1)}(1) &= x(1) + W^0 x(9), & x^{(1)}(9) &= x(1) - W^0 x(9),\\
x^{(1)}(5) &= x(5) + W^0 x(13), & x^{(1)}(13) &= x(5) - W^0 x(13),\\
x^{(1)}(3) &= x(3) + W^0 x(11), & x^{(1)}(11) &= x(3) - W^0 x(11),\\
x^{(1)}(7) &= x(7) + W^0 x(15), & x^{(1)}(15) &= x(7) - W^0 x(15).
\end{aligned}$$
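A bit-reverse permutation followed by stages of such butterflies can be sketched in Python as below. This is an illustrative, non-optimized sketch that assumes N is a power of two; the names `butterfly` and `fft_dit_iterative` and the loop layout are assumptions, not a transcription of FIG. 1.

```python
import cmath


def butterfly(a, b, w):
    """Radix-2 DIT butterfly: P = A + W*B, Q = A - W*B."""
    t = w * b
    return a + t, a - t


def fft_dit_iterative(x):
    """Bit-reverse the input, then run log2(N) stages of N/2 butterflies."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1
    data = [0j] * n
    for i in range(n):
        r, v = 0, i
        for _ in range(bits):          # reverse the index bits of i
            r = (r << 1) | (v & 1)
            v >>= 1
        data[r] = x[i]
    half = 1                           # distance between the two butterfly inputs
    while half < n:
        for start in range(0, n, 2 * half):
            for k in range(half):
                w = cmath.exp(-1j * cmath.pi * k / half)  # W_(2*half)^k
                p, q = butterfly(data[start + k], data[start + k + half], w)
                data[start + k], data[start + k + half] = p, q
        half *= 2
    return data
```

In the first stage (`half == 1`) every coefficient is $W^0 = 1$, and for N=16 the first butterfly combines x(0) and x(8), in agreement with the Stage #0 relations listed above.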
As is well known to those skilled in the art, the FFT may be implemented in many other forms. For example, one may implement the radix-2 FFT by decimating sample data sequences in frequency instead of decimating them in time. FIG. 4 illustrates a signal flow diagram of radix-2 decimation-in-frequency (DIF) FFT (N=16). As shown in FIG. 4, the DIF FFT also includes a bit-reverse operation and a plurality of butterfly operations, while the butterfly operation of DIF-FFT is illustrated by FIG. 5.
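For comparison, a decimation-in-frequency variant can be sketched as follows. Since FIG. 4 and FIG. 5 are not reproduced in the text, this sketch assumes the commonly used DIF butterfly P = A + B, Q = (A − B)W and places the bit-reverse operation at the output; the actual arrangement in the figures may differ. The function name `fft_dif` is likewise an assumption, and N is assumed to be a power of two.

```python
import cmath


def fft_dif(x):
    """Radix-2 decimation-in-frequency FFT (sketch): butterflies, then bit-reverse."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    data = list(x)
    half = n // 2
    while half >= 1:
        for start in range(0, n, 2 * half):
            for k in range(half):
                w = cmath.exp(-1j * cmath.pi * k / half)  # W_(2*half)^k
                a, b = data[start + k], data[start + k + half]
                data[start + k] = a + b                   # P = A + B
                data[start + k + half] = (a - b) * w      # Q = (A - B) * W
        half //= 2
    bits = n.bit_length() - 1
    out = [0j] * n
    for i in range(n):
        r, v = 0, i
        for _ in range(bits):          # reverse the index bits of i
            r = (r << 1) | (v & 1)
            v >>= 1
        out[r] = data[i]               # results emerge in bit-reversed order
    return out
```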
The butterfly operations can easily be parallelized, with a parallelization factor that grows linearly with the amount of available resources. The bit-reverse operation, however, is difficult to parallelize in the standard implementation.
The bit-reverse operation in FIG. 4 may be relocated as in FIG. 6. In this case, the coefficient for the butterfly operation shall be shuffled to the bit-reverse format as in FIG. 7. This method is frequently used in DSP (Digital Signal Processor) software since the coefficients can easily be fetched from the array without skipping unnecessary elements. The coefficients for Stage #n are represented as the first half coefficients for Stage #n−1.
One may divide the bit-reverse operation into a plurality of bit-swap operations, and execute the bit-swap operations between the butterfly operations. FIG. 8 illustrates the swapping of two index bits at every stage (N=128). FIG. 9 illustrates a signal flow diagram of a radix-2 DIF-FFT (N=16) that implements the bit-swap operations. In FIG. 9, before executing the butterfly operation of Stage #0, the input data sequences are permutated so that the MSB (Most Significant Bit) and the LSB (Least Significant Bit) of the index bits are swapped. After executing the butterfly operation of Stage #0, the data sequences are further permutated so that the second LSB and the second MSB of the index bits are swapped, and are then input into the butterfly operation of Stage #1. For example, in FIG. 9, each pair of input data sequences such as {x(1), x(8)} ($(1)_{10} = (0,0,0,1)_2$, $(8)_{10} = (1,0,0,0)_2$), {x(3), x(10)} ($(3)_{10} = (0,0,1,1)_2$, $(10)_{10} = (1,0,1,0)_2$), {x(5), x(12)} ($(5)_{10} = (0,1,0,1)_2$, $(12)_{10} = (1,1,0,0)_2$), . . . are permutated with each other in the bit-swap operation before the butterfly operation of Stage #0. Similarly, each pair of data sequences such as {$x^{(1)}(2)$, $x^{(1)}(4)$} ($(2)_{10} = (0,0,1,0)_2$, $(4)_{10} = (0,1,0,0)_2$), {$x^{(1)}(10)$, $x^{(1)}(12)$} ($(10)_{10} = (1,0,1,0)_2$, $(12)_{10} = (1,1,0,0)_2$), {$x^{(1)}(3)$, $x^{(1)}(5)$} ($(3)_{10} = (0,0,1,1)_2$, $(5)_{10} = (0,1,0,1)_2$), and {$x^{(1)}(11)$, $x^{(1)}(13)$} ($(11)_{10} = (1,0,1,1)_2$, $(13)_{10} = (1,1,0,1)_2$) are permutated with each other in the bit-swap operation before the butterfly operation of Stage #1.
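That a sequence of such bit-swaps composes into the full bit-reverse can be illustrated with a short Python sketch. It only models the index arithmetic, not the data movement between butterfly stages; the function names `swap_bits` and `bit_reverse_by_swaps` are assumptions made here.

```python
def swap_bits(index, i, j):
    """Exchange bit i and bit j of `index` (bit 0 is the LSB)."""
    if ((index >> i) & 1) != ((index >> j) & 1):
        index ^= (1 << i) | (1 << j)  # flipping both bits swaps them when they differ
    return index


def bit_reverse_by_swaps(index, bits):
    """Reverse `bits` index bits by swapping one outer pair of bits per step."""
    for step in range(bits // 2):
        # step 0 swaps the LSB and MSB, step 1 the second LSB and second MSB, ...
        index = swap_bits(index, step, bits - 1 - step)
    return index
```

With four index bits, `swap_bits(1, 0, 3)` returns 8, `swap_bits(3, 0, 3)` returns 10 and `swap_bits(2, 1, 2)` returns 4, matching the pairs listed above. For an odd number of index bits, such as the seven bits of N=128, the middle bit is left in place.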
The radix-2 butterfly operations are performed N/2 times in one stage. This processing is repeated for $\log_2 N$ stages. Therefore, the Cooley-Tukey algorithm requires

$$\frac{N \log_2 N}{2} = O(N \log N)$$

butterfly operations in addition to the bit-reverse operation to complete the DFT.
(The Stockham Algorithm)
FIG. 10 illustrates the array interpretation according to the Stockham autosort algorithm (see Charles Van Loan, “Computational Frameworks for the Fast Fourier Transform,” 1991, Society for Industrial and Applied Mathematics). In the Stockham algorithm, each data sequence is associated with an element of a two-dimensional $2\beta_s \times \alpha_s$ array $x^{(s)}(j,k)$, where $\alpha_s = 2^s$, $\beta_s = 2^{R-s-2}$, and $N = 2\beta_s\alpha_s = 2^{R-1}$.
$x^{(s)}(j,k)$ is calculated using the radix-2 butterfly operations as follows:

$$x^{(0)}(j,0) = x(j),$$

$$x^{(s+1)}(j,k) = x^{(s)}(j,k) + W_N^{k\beta_s}\, x^{(s)}(j+\beta_s,k),$$

$$x^{(s+1)}(j,k+\alpha_s) = x^{(s)}(j,k) - W_N^{k\beta_s}\, x^{(s)}(j+\beta_s,k),$$

where $j = 0, 1, \ldots, \beta_s - 1$ and $k = 0, 1, \ldots, \alpha_s - 1$. The DFT result is then acquired as:

$$X(k) = x^{(R-1)}(0,k), \qquad N = 2^{R-1}.$$
The radix-2 butterfly operations are performed N/2 times in one stage and this processing is repeated for $\log_2 N$ stages. Therefore, the Stockham algorithm requires

$$\frac{N \log_2 N}{2} = O(N \log N)$$

butterfly operations to complete the DFT X(0), . . . , X(N−1). The bit-reverse operations are implicitly performed during the butterfly operations and thus extra calculation time for the bit-reverse operation is not required.
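The stage-by-stage recurrence of the Stockham algorithm can be sketched in Python as follows. The sketch assumes N is a power of two, that the array starts as N rows of one column holding the input values, and that the column count doubles each stage (1, 2, 4, ...). The function name `fft_stockham` and the list-of-rows storage are assumptions chosen for readability, not the memory layout of FIG. 10.

```python
import cmath


def fft_stockham(x):
    """Stockham autosort FFT (sketch): no separate bit-reverse pass is needed."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    # cur[j][k] plays the role of x^(s)(j, k): 2*beta rows and alpha columns.
    cur = [[complex(v)] for v in x]   # stage 0: N rows, 1 column
    alpha, beta = 1, n // 2
    while beta >= 1:
        nxt = [[0j] * (2 * alpha) for _ in range(beta)]
        for j in range(beta):
            for k in range(alpha):
                w = cmath.exp(-2j * cmath.pi * k * beta / n)  # W_N^(k*beta)
                a = cur[j][k]
                b = w * cur[j + beta][k]
                nxt[j][k] = a + b            # x^(s+1)(j, k)
                nxt[j][k + alpha] = a - b    # x^(s+1)(j, k + alpha)
        cur = nxt
        alpha *= 2
        beta //= 2
    return cur[0]                     # single remaining row: X(0), ..., X(N-1)
```

Each stage reads two rows that are `beta` apart and writes one row of twice the width, so the outputs of the last stage already appear in natural order.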
FIG. 11 schematically shows an example of data handling applied for N=16 points (that is, p=4) FFT. As is well-known to those skilled in the art, the Stockham algorithm is also represented as a combination of bit-reverse operations and butterfly operations. FIG. 12 shows how bit-reverse is performed for each Stage. Two rows are fetched from the memory and the butterfly operations are performed for each element. Next, the result is stored to the memory after multiplexing the two streams as one row.
(Radix-n Algorithm)
The Cooley-Tukey algorithm can also be applied to a radix-n DIT-FFT. For the radix-2 FFT, the Cooley-Tukey algorithm computes the DFTs of the even-indexed numbers x(2m) and the odd-indexed numbers x(2m+1). For the radix-n FFT, the algorithm instead computes the DFTs of the numbers x(nm), x(nm+1), . . . , x(nm+(n−1)).
For a radix-n DIT-FFT, digit-reverse processing is performed to permutate the input data sequences according to their indices expressed in base n. The digit-reverse is defined as the extension of the binary (base-2) bit-reverse to a general base.
In the digit-reverse operation, the input data sequences are permutated so that the MSD (Most Significant Digit) and LSD (Least Significant Digit) of the index digits are swapped in the same way as in bit-reverse.
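A digit-reverse for a single base n can be sketched in Python as below. The function name `digit_reverse` and its parameters are assumptions made for this illustration; with `base=2` it reduces to the bit-reverse.

```python
def digit_reverse(index, base, digits):
    """Return `index` with its `digits` base-`base` digits in reversed order."""
    result = 0
    for _ in range(digits):
        index, d = divmod(index, base)  # peel off the least significant digit
        result = result * base + d      # append it at the low end of the output
    return result
```

For example, with two base-3 digits the index 5 is written (1,2) and maps to (2,1), so `digit_reverse(5, 3, 2)` returns 7, while `digit_reverse(1, 2, 4)` returns 8 as in the radix-2 case.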
(Mixed Radix Algorithm)
The radix may be varied for each stage. The digit-reverse for the mixed radix may be performed using the mixed base number. The mixed base number means that the base is different for each digit.
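A mixed-base digit-reverse can be sketched in Python as follows, here exercised with the radices 2, 3 and 5. The function name `mixed_radix_digit_reverse` and the convention that `radices[0]` is the base of the least significant index digit are assumptions made for this illustration.

```python
def mixed_radix_digit_reverse(index, radices):
    """Reverse the digits of `index` written in a mixed base.

    radices[0] is the base of the least significant digit of `index`;
    in the result that digit becomes the most significant one.
    """
    digits = []
    for r in radices:
        index, d = divmod(index, r)   # extract digits, least significant first
        digits.append(d)
    result = 0
    for d, r in zip(digits, radices):
        result = result * r + d       # earlier digits end up more significant
    return result


# Position p of the reordered sequence holds the input x(order[p]) for N = 30.
order = [0] * 30
for n in range(30):
    order[mixed_radix_digit_reverse(n, (2, 3, 5))] = n
```

With this convention the first five positions hold x(0), x(6), x(12), x(18), x(24), the next five hold x(2), x(8), x(14), x(20), x(26), and positions 15 to 19 hold x(1), x(7), x(13), x(19), x(25), so each group of five consecutive positions is the input of one 5-point DFT.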
As an example, a 30 point DIF-FFT with mixed radix of 2, 3 and 5 will be explained hereinafter.
A 30 point DFT is defined as follows.
$$X(k) = \sum_{n=0}^{29} x(n)\, W_{30}^{nk},$$

where $W_N^k = e^{-jk(2\pi/N)}$.
Here we define:

$$\begin{aligned}
X_{E0}(k) &= \sum_{m=0}^{4} x(2\cdot 3m)\, W_5^{mk}, &
X_{E1}(k) &= \sum_{m=0}^{4} x(2(3m+1))\, W_5^{mk}, &
X_{E2}(k) &= \sum_{m=0}^{4} x(2(3m+2))\, W_5^{mk},\\
X_{O0}(k) &= \sum_{m=0}^{4} x(2\cdot 3m+1)\, W_5^{mk}, &
X_{O1}(k) &= \sum_{m=0}^{4} x(2(3m+1)+1)\, W_5^{mk}, &
X_{O2}(k) &= \sum_{m=0}^{4} x(2(3m+2)+1)\, W_5^{mk}.
\end{aligned}$$
Assume

$$X_E(k) = \sum_{m=0}^{14} x(2m)\, W_{15}^{mk} \quad\text{and}\quad X_O(k) = \sum_{m=0}^{14} x(2m+1)\, W_{15}^{mk},$$

then it follows:
$$\begin{aligned}
X_E(k) &= \sum_{m=0}^{14} x(2m)\, W_{15}^{mk}\\
&= \sum_{m=0}^{4} x(2\cdot 3m)\, W_{15}^{3mk} + \sum_{m=0}^{4} x(2(3m+1))\, W_{15}^{(3m+1)k} + \sum_{m=0}^{4} x(2(3m+2))\, W_{15}^{(3m+2)k}\\
&= \sum_{m=0}^{4} x(2\cdot 3m)\, W_5^{mk} + W_{15}^{k} \sum_{m=0}^{4} x(2(3m+1))\, W_5^{mk} + W_{15}^{2k} \sum_{m=0}^{4} x(2(3m+2))\, W_5^{mk}\\
&= X_{E0}(k) + W_{15}^{k} X_{E1}(k) + W_{15}^{2k} X_{E2}(k)
\end{aligned}$$

$$\begin{aligned}
X_O(k) &= \sum_{m=0}^{14} x(2m+1)\, W_{15}^{mk}\\
&= \sum_{m=0}^{4} x(2\cdot 3m+1)\, W_{15}^{3mk} + \sum_{m=0}^{4} x(2(3m+1)+1)\, W_{15}^{(3m+1)k} + \cdots
\end{aligned}$$
            3                          ⁢                          m                                                +                        1                                            )                                        ⁢                    k                                                                        +                                          ∑                                  m                  =                  0                                4                            ⁢                                                          ⁢                                                x                  ⁡                                      (                                                                  2                        ⁢                                                  (                                                                                    2                              ⁢                              m                                                        +                            2                                                    )                                                                    +                      1                                        )                                                  ⁢                                  W                  15                                                            (                                                                        3                          ⁢                          m                                                +                        2                                            )                                        ⁢                    k                                                                                                                    =                                                    ∑                                  m                  =                  0                                4                      
      ⁢                                                          ⁢                                                x                  ⁡                                      (                                                                                            2                          ·                          3                                                ⁢                        m                                            +                      1                                        )                                                  ⁢                                  W                  5                  mk                                                      +                                          W                5                k                            ⁢                                                ∑                                      m                    =                    0                                    4                                ⁢                                                                  ⁢                                                      x                    ⁡                                          (                                                                        2                          ⁢                                                      (                                                                                          3                                ⁢                                m                                                            +                              1                                                        )                                                                          +                        1                                            )                                                        ⁢                                      W                    5                    mk                                                                        +       
                                   W                5                                  2                  ⁢                  k                                            ⁢                                                ∑                                      m                    =                    0                                    4                                ⁢                                                                  ⁢                                                      x                    ⁡                                          (                                                                        2                          ⁢                                                      (                                                                                          2                                ⁢                                m                                                            +                              2                                                        )                                                                          +                        1                                            )                                                        ⁢                                      W                    5                    mk                                                                                                                    =                                                    B                                  O                  ⁢                                                                          ⁢                  0                                            ⁡                              (                k                )                                      +                                          W                5                k                            ⁢                                                B                                      O                    ⁢                
                                                                  ⁢                    1                                                  ⁡                                  (                  k                  )                                                      +                                          W                5                                  2                  ⁢                                                                          ⁢                  k                                            ⁢                                                B                                      O                    ⁢                                                                                  ⁢                    2                                                  ⁡                                  (                  k                  )                                                                        
Accordingly, it follows:
\[
X(k) = \sum_{m=0}^{14} x(2m)\,W_{30}^{2km} + \sum_{m=0}^{14} x(2m+1)\,W_{30}^{k(2m+1)} = X_E(k) + W_{30}^{k}\,X_O(k)
\]
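The decomposition can be checked numerically. The following is a minimal sketch, not the implementation of any particular FFT circuit: the function names `dft`, `twiddle`, and `fft30_by_splitting` are hypothetical, and the sketch assumes the input index is split as i = 2(3m + r) + p with p in {0, 1}, r in {0, 1, 2}, and m in {0, ..., 4}. It uses the identity W_30^{ik} = W_5^{mk} · W_15^{rk} · W_30^{pk} for that index split and compares the result with a direct O(N^2) DFT.

```python
import cmath


def twiddle(n, e):
    """Return W_n^e = exp(-j * 2 * pi * e / n)."""
    return cmath.exp(-2j * cmath.pi * e / n)


def dft(x):
    """Direct O(N^2) DFT: X(k) = sum_i x(i) * W_N^(i*k)."""
    n = len(x)
    return [sum(x[i] * twiddle(n, i * k) for i in range(n)) for k in range(n)]


def fft30_by_splitting(x):
    """30-point DFT built from six 5-point DFTs (radix-2 split, then radix-3 split)."""
    assert len(x) == 30
    # b[p][r] is the 5-point DFT of the subsequence x(2(3m + r) + p), m = 0..4.
    # p = 0 selects the even-indexed half, p = 1 the odd-indexed half.
    b = [[dft([x[2 * (3 * m + r) + p] for m in range(5)]) for r in range(3)]
         for p in range(2)]
    out = []
    for k in range(30):
        # Each 15-point half combines three 5-point DFTs, which are periodic in k
        # with period 5, using the twiddle factors W_15^(r*k).
        x_e = sum(twiddle(15, r * k) * b[0][r][k % 5] for r in range(3))
        x_o = sum(twiddle(15, r * k) * b[1][r][k % 5] for r in range(3))
        # Final radix-2 combination: X(k) = X_E(k) + W_30^k * X_O(k).
        out.append(x_e + twiddle(30, k) * x_o)
    return out
```

Calling `fft30_by_splitting(x)` on any 30-element sequence should agree with `dft(x)` up to floating-point rounding.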
FIGS. 13A and 13B illustrate the signal flow diagram of the above operations. As shown in FIGS. 13A and 13B, the DIT-FFT includes a digit-reverse operation and a plurality of butterfly operations. In these figures, the index of the input data is denoted as the mixed-base number l_5 m_3 n_2 and the index after the digit reverse as n_2 m_3 l_5, where the subscript of each digit denotes its base; that is, l is a base-5 (quinary) digit, m is a base-3 (ternary) digit, and n is a base-2 (binary) digit.
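The digit-reverse operation for N = 30 might be sketched as follows. This is an illustrative sketch only: the function name `digit_reverse` is hypothetical, and it assumes that the input index is i = 6l + 2m + n with l the base-5 digit, m the base-3 digit, and n the base-2 digit, which is consistent with splitting first into even and odd samples and then into three groups.

```python
def digit_reverse(index, radices):
    """Reverse the mixed-radix digits of index.

    radices lists the bases of the input digits, most significant first,
    e.g. (5, 3, 2) for the digits l, m, n of a 30-point transform.
    """
    reversed_index = 0
    for base in reversed(radices):
        # Peel off the current least significant digit of the input index ...
        index, digit = divmod(index, base)
        # ... and append it as the next digit of the output, most significant first.
        reversed_index = reversed_index * base + digit
    return reversed_index


# Position of every input sample after the digit reverse (l m n -> n m l).
positions = [digit_reverse(i, (5, 3, 2)) for i in range(30)]
```

Under these assumptions every even-indexed sample lands in positions 0 to 14 and every odd-indexed sample in positions 15 to 29, and the samples x(0), x(6), x(12), x(18), x(24) occupy positions 0 to 4, so each 5-point DFT reads a contiguous block.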
The processing speed required of the FFT is increasing year by year for high-speed wireless communications utilizing OFDM (Orthogonal Frequency Division Multiplexing), such as LTE (Long Term Evolution), LTE-Advanced, WiMAX, WirelessHD and so on. However, the clock frequency of baseband chips has improved more slowly than the required processing speed has grown. Thus, the FFT speed must be improved by parallelization.
In the standard FFT algorithm, it is difficult (or costly) to parallelize bit-reverse operations. Normally the bit-reverse operations are performed element-by-element and this causes a bottleneck in the FFT.
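For a power-of-two size, an element-by-element bit reversal might look like the sketch below. The function name `bit_reverse_permute` is hypothetical and the code is only meant to illustrate the access pattern: each iteration computes one reversed index and performs at most one swap between scattered positions, which is difficult to map onto wide parallel memory accesses.

```python
def bit_reverse_permute(data):
    """Reorder data in place into bit-reversed index order (length must be a power of two)."""
    n = len(data)
    bits = n.bit_length() - 1
    assert n > 0 and (1 << bits) == n, "length must be a power of two"
    for i in range(n):
        # Reverse the lowest `bits` bits of i, one bit at a time.
        j, t = 0, i
        for _ in range(bits):
            j = (j << 1) | (t & 1)
            t >>= 1
        # Swap each pair only once.
        if i < j:
            data[i], data[j] = data[j], data[i]
    return data
```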
In the Stockham autosort algorithm, the input and output buffers cannot be shared since the algorithm shuffles all data. This doubles the size of the memory required for the calculation. For example, when computing a 2048-point FFT on 32-bit complex values (4 bytes per sample), the standard algorithm requires 8 KB of memory while the Stockham autosort algorithm requires 16 KB of memory. The size of memory required is increasing as high-speed communications are deployed.
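The memory figures above follow from simple arithmetic, sketched here with a hypothetical helper `fft_buffer_bytes`. It assumes that only the sample buffers are counted (twiddle-factor tables and other working storage are ignored) and that a 32-bit complex value occupies 4 bytes.

```python
def fft_buffer_bytes(points, bytes_per_sample, out_of_place):
    """Bytes of sample storage; out_of_place=True models separate input and output buffers."""
    buffers = 2 if out_of_place else 1
    return buffers * points * bytes_per_sample


# 2048 points at 4 bytes per complex sample.
in_place_bytes = fft_buffer_bytes(2048, 4, out_of_place=False)  # shared buffer
stockham_bytes = fft_buffer_bytes(2048, 4, out_of_place=True)   # two buffers
```

With these assumptions the shared-buffer case comes to 8192 bytes (8 KB) and the two-buffer case to 16384 bytes (16 KB), matching the figures in the text.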