In general, an N-point FFT (Fast Fourier Transform) is expressed mathematically as [1]
$$X_k = \sum_{n=0}^{N-1} x_n \, e^{-i 2\pi nk/N}, \qquad (1)$$
where $x_n$ is the nth element of a discrete-time signal vector $x = [x_0 \ldots x_{N-1}]$ with N data samples, and $X_k$ is the kth element of the vector $X = [X_0 \ldots X_{N-1}]$ that corresponds to the FFT of $x$. In general, both $x_n$ and $X_k$ are complex numbers. Direct implementation of an FFT using (1) is avoided in practice because its computational load, on the order of $N^2$ complex multiplications, becomes extremely high when N is large. Instead, an N-point FFT or IFFT (Inverse Fast Fourier Transform) is implemented using stages of small-sized FFT or IFFT units. As an example, an 8-point FFT can be implemented using three stages, each comprising four 2-point FFT operations.
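To make the cost difference concrete, the following minimal Python sketch evaluates (1) directly and compares the result with a recursive radix-2 decomposition built from 2-point operations. It is an illustration only and is not taken from any of the cited references; the function names `dft_direct` and `fft_radix2` and the test vector are hypothetical, and the decimation-in-time recursion is simply one common way of staging the computation, assumed here.

```python
import cmath

def dft_direct(x):
    """Evaluate equation (1) literally: N multiply-adds per output, N*N in total."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return list(x)
    even = fft_radix2(x[0::2])   # FFT of even-indexed samples
    odd = fft_radix2(x[1::2])    # FFT of odd-indexed samples
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * odd[k]  # twiddle factor multiplication
        out[k] = even[k] + t            # 2-point operation: sum
        out[k + N // 2] = even[k] - t   # 2-point operation: difference
    return out

# Hypothetical 8-sample test vector.
x = [1, 2, 3, 4, 0, -1, -2, -3]
direct = dft_direct(x)
fast = fft_radix2(x)
max_err = max(abs(a - b) for a, b in zip(direct, fast))
print(max_err < 1e-9)
```

For N = 8 the recursion has three levels, and each level performs four sum-and-difference operations, which matches the three-stage structure described above.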
Further, in order to reduce processing latency while increasing the throughput of a device that implements FFT/IFFT operations, a number of processors are used in parallel. FIG. 1 shows a typical parallel-based mode of implementing an 8-point IFFT/FFT [1]. In FIG. 1, circled regions represent processors that operate in parallel. Of these regions, the shaded ones, “•”, represent multiplication by a twiddle factor, while the un-shaded ones, “∘”, represent addition operations.
The FFT has conventionally been implemented using DSPs (Digital Signal Processors), parameterized ASICs (Application Specific Integrated Circuits), IP (intellectual property) cores, FPGAs (Field Programmable Gate Arrays) and reconfigurable processors. It has been noted that when a processor is based on CORDIC (coordinate rotation digital computer) [2], the resulting FFT uses fewer hardware resources than one built on MAC (Multiply and Accumulate) based processors. This is especially so when the size of the FFT is large. Whereas there are hybrid processors based on a combination of CORDIC and ADDER units [3], or CORDIC and an FFT/IFFT kernel [2][P1], this patent considers processors which are purely based on CORDIC [4][5][6].
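Because multiplication by a twiddle factor is a rotation in the complex plane, a CORDIC unit can perform it with shift-and-add iterations instead of a multiplier. The sketch below illustrates the rotation mode of CORDIC in Python under stated assumptions: floating-point arithmetic stands in for fixed-point hardware, the function name `cordic_rotate` and the iteration count of 16 are hypothetical, and the rotation angle is assumed to lie within the basic convergence range of roughly ±1.74 rad. It is a generic illustration and is not taken from references [4][5][6].

```python
import cmath
import math

def cordic_rotate(x, y, angle, iterations=16):
    """Rotate the vector (x, y) by `angle` radians using CORDIC rotation mode."""
    # Elementary angles atan(2^-i); hardware would keep these in a small table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Accumulated gain of all micro-rotations, compensated once at the end.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    z = angle  # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        # Scaling by 2^-i models an arithmetic right shift in fixed-point hardware.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x / gain, y / gain

# Hypothetical sample: multiply 3 + 1j by the twiddle factor exp(-i*pi/4).
s = complex(3.0, 1.0)
w = cmath.exp(-1j * math.pi / 4)
re, im = cordic_rotate(s.real, s.imag, -math.pi / 4)
expected = s * w
print(abs(complex(re, im) - expected) < 1e-3)
```

Angles outside the convergence range would need an initial quadrant correction, which this sketch omits. The final gain compensation is a constant scaling that does not depend on the rotation angle.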
As an example, FIG. 2 shows a conventional CORDIC-based FFT that computes the FFT of a vector with two data samples $s_1$ and $s_2$ and generates two output data samples $S_1$ and $S_2$. An intermediate signal $S'_2$ is generated by a pair of CORDIC processors. This pair of CORDIC processors is labeled I.
The intermediate signal $S'_2$ is given by
$$S'_2 = -2^{-0.5}(s_1 - s_2) \qquad (2)$$
Therefore, the output signals $S_1$ and $S_2$ are related to the input samples $s_1$ and $s_2$ by the following equations:
$$S_1 = 2^{-0.5}(s_1 + s_2), \qquad S_2 = 2^{-0.5}(s_1 - s_2) \qquad (3)$$
From (1), the FFT $[S_1' \; S_2']$ of $[s_1 \; s_2]$ is given by
$$S_1' = s_1 + s_2, \qquad S_2' = s_1 - s_2 \qquad (4)$$
[Non-Patent Document 1] J. G. Proakis and D. G. Manolakis, Introduction to Digital Signal Processing, Maxwell Macmillan, 1989
[Non-Patent Document 2] Y. H. Hu, “CORDIC-Based VLSI Architecture for Digital Signal Processing,” IEEE Signal Processing Magazine, pp. 16-July 1992
[Non-Patent Document 3] R. Sarmineto, F. Tobajas, et al., “A CORDIC Processor for FFT Computations and Its Implementation Using Gallium Arsenide Technology,” IEEE Trans. on VLSI Systems, pp. 18-30, vol. 6, no. 1, March 1998
[Non-Patent Document 4] C. Ying, S. Chen and J. Chih, “Efficient CORDIC Designs for Multi-Mode OFDM FFT,” ICASSP, pp. 1036-1039, vol. 3, May 2006
[Non-Patent Document 5] B. Heyne, J. Gotze, “A Pure CORDIC based FFT for Reconfigurable Digital Signal Processing,” 12th European Signal Processing Conference (EUSIPCO2004), Vienna, Austria, 2004
[Non-Patent Document 6] B. Heyne, J. Gotze, “CORDIC-Based algorithms for software defined Radio (SDR) baseband Processing,” Adv. Radio Sci., 4, 179-184, 2006
[Patent Document 1] Kulkarni et al., “Reconfigurable Vector-FFT/IFFT, Vector-Multiplier/Divider,” U.S. Pat. No. 7,082,451 B2