1. Field of the Invention
This invention relates to a memory-based FFT/IFFT processor and to a design method for general-sized memory-based FFT processors that minimizes the area and reduces the necessary clock rate.
2. Description of the Related Art
A. Prior Art List:
1. USA Patent:
Ref.   Pat. No.                       Title
[A1]   4,477,878                      Discrete Fourier transform with non-tumbled output
[A2]   5,091,875                      Fast Fourier transform (FFT) addressing apparatus and method
[A3]   7,062,523                      Method for efficiently computing a fast Fourier transform
[A4]   7,164,723                      Modulation apparatus using mixed-radix fast Fourier transform
[A5]   20060253514 (Publication No.)  Memory-based Fast Fourier Transform device
[A6]   20080025199 (Publication No.)  Method and device for high throughput n-point forward and inverse fast Fourier transform
2. China Patent:
Ref.   Pat. No.                           Title
[A7]   01140060.9                         The architecture for 3780-point DFT processor
[A8]   03107204.6                         The multicarrier systems and method with 3780-point IDFT/DFT processor
[A9]   200410090873.2 (Publication No.)   The oversampling method for 3780-point DFT
[A10]  200610104144.7 (Publication No.)   The 3780-point DFT processor
[A11]  200710044716.1 (Publication No.)   The water-flowed 3780-point FFT processor
3. Articles:
[B1] Z.-X. Yang, Y.-P. Hu, C.-Y. Pan, and L. Yang, “Design of a 3780-point IFFT processor for TDS-OFDM,” IEEE Trans. Broadcast., vol. 48, no. 1, pp. 57-61, March 2002.
[B2] L. G. Johnson, “Conflict free memory addressing for dedicated FFT hardware,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 39, no. 5, pp. 312-316, May 1992.
[B3] B. G. Jo and M. H. Sunwoo, “New continuous-flow mixed-radix (CFMR) FFT processor using novel in-place strategy,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 52, no. 5, pp. 911-919, May 2005.
B. Description of the Prior Art
(1) [A1] does not support a multi-bank memory structure. Hence, for a radix-r computation, it takes r clock cycles to read the data from memory and to write the computed data back to memory. As a result, the FFT needs more computation cycles and therefore demands a higher processor clock rate for real-time applications. This invention solves this problem by supporting multi-bank addressing without memory conflict, so that the r data of a radix-r computation can be accessed in one clock cycle.
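To illustrate what "multi-bank addressing without memory conflict" means, the following is a minimal sketch for the fixed-radix case N = r^n only. It uses a digit-sum bank mapping of the kind commonly associated with conflict-free schemes such as [B2]; it is not claimed to be the addressing scheme of this invention, and all function names are hypothetical. The sketch assumes an in-place FFT in which the r inputs of a butterfly at a given stage differ only in one base-r digit of their index.

```python
def digits(index, radix, n_digits):
    """Return the base-`radix` digits of `index`, least significant first."""
    out = []
    for _ in range(n_digits):
        out.append(index % radix)
        index //= radix
    return out


def bank_of(index, radix, n_digits):
    """Assumed bank mapping: sum of the base-r digits of the index, modulo r."""
    return sum(digits(index, radix, n_digits)) % radix


def butterfly_inputs(base, stage, radix):
    """The r indices of one butterfly: they differ only in digit `stage`."""
    stride = radix ** stage
    return [base + k * stride for k in range(radix)]


def is_conflict_free(radix, n_digits):
    """Check that every butterfly of every stage touches r distinct banks."""
    n = radix ** n_digits
    for stage in range(n_digits):
        stride = radix ** stage
        for base in range(n):
            if (base // stride) % radix != 0:
                continue  # start only from indices whose `stage` digit is 0
            banks = {bank_of(i, radix, n_digits)
                     for i in butterfly_inputs(base, stage, radix)}
            if len(banks) != radix:
                return False
    return True


if __name__ == "__main__":
    # 64-point radix-4 example: every butterfly reads from 4 different banks.
    print(is_conflict_free(radix=4, n_digits=3))
```

Because the r inputs of a butterfly differ in exactly one digit, their digit sums take r different values modulo r, so all r words can be read from r separate banks in the same clock cycle.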
(2) [A2], [A5], and [B2] support only a fixed radix r. Hence, they can be applied only to FFTs of size N = r^n. They would not work for applications such as the 3780-point FFT for Chinese DTV or the 3072-point FFT for PLC. This invention supports any general mixed radix, so it works for FFT applications of any size.
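The size restriction can be checked with a few lines of arithmetic. The sketch below (helper names are hypothetical) tests whether a size is a power of a single radix and lists its prime factors, showing that 3780 and 3072 each need more than one radix.

```python
def is_power_of(n, radix):
    """True if n == radix**k for some integer k >= 0."""
    if radix < 2 or n < 1:
        return False
    while n % radix == 0:
        n //= radix
    return n == 1


def prime_factors(n):
    """Prime factorization of n by trial division, smallest factors first."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


if __name__ == "__main__":
    for size in (3780, 3072, 8192):
        single_radix = [r for r in range(2, 17) if is_power_of(size, r)]
        print(size, prime_factors(size), single_radix)
```

For 3780 the factors are 2, 2, 3, 3, 3, 5, 7, and for 3072 they are ten factors of 2 and one factor of 3, so neither size equals r^n for any single radix r; 8192 = 2^13, by contrast, does.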
(3) [A3] supports only a fixed radix r; hence, it cannot support applications such as Chinese DTV or PLC. Besides, since it does not support a multi-bank memory structure, it needs r clock cycles to access the data in memory for each radix-r computation. It therefore needs a higher clock rate for the FFT computation than a processor that uses a multi-bank memory structure. This invention not only supports variable radices for FFT applications of any size but also supports a multi-bank memory structure that reduces the necessary clock rate without memory conflict.
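The effect on clock rate can be estimated with a deliberately simplified model, sketched below. It assumes N = r^n, counts only memory-access cycles, charges r cycles per butterfly for a single-bank memory and one cycle per butterfly for a conflict-free multi-bank memory, and ignores pipelining and I/O; the function names and the 4096-point example are illustrative assumptions, not figures from the cited references.

```python
def butterflies_per_fft(n, radix):
    """Number of radix-r butterflies in an N = r^n FFT: (N / r) * log_r(N)."""
    stages = 0
    m = n
    while m > 1:
        if m % radix != 0:
            raise ValueError("n must be a power of radix in this model")
        m //= radix
        stages += 1
    return (n // radix) * stages


def access_cycles(n, radix, multi_bank):
    """Memory-access cycles per FFT under the simplified model above."""
    per_butterfly = 1 if multi_bank else radix
    return butterflies_per_fft(n, radix) * per_butterfly


if __name__ == "__main__":
    n, r = 4096, 4  # illustrative size, not taken from the references
    single = access_cycles(n, r, multi_bank=False)
    multi = access_cycles(n, r, multi_bank=True)
    print(single, multi, single // multi)
```

Under these assumptions the single-bank design needs r times as many access cycles per transform, so for a fixed real-time deadline its clock rate must be roughly r times higher.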
(4) [A4] and [B3] support only the radix-2/4 algorithm. Therefore, they work only for FFTs of size N = 2^n. For FFT applications of other sizes, such as Chinese DTV with N = 3780, they would not work. This invention works for all of them, since it supports any mixed radix. Besides, for long FFT processor designs such as N = 8192, this invention makes the processor design more flexible: the maximum radix that [A4] supports is radix-4, whereas this invention supports radices greater than 4.
(5) [A6] describes some candidate decompositions of 3780, for instance, 3780 = 3×3×3×2×2×5×7. It implements each small-size FFT module with the MDC structure to eliminate some of the large internal buffers in [A7]-[A11]. However, this costs more hardware, since each module must finish all of its computation within one clock cycle. Besides, real system applications require in-order output data, and the output data of [A6] are not in order.
(6) [A7], [A8], [A9], [A10], and [A11] implement the 3780-point FFT processor with architectures similar to a pipeline. These architectures need large internal buffers to reorder the data for processing. Besides, real system applications require both in-order I/O data and support for continuous data flow. To achieve this, [A7] and [A8] need at least 3N words; [A9] and [A11] need at least 5N words; and [A10] needs 6N words of memory. This invention meets the requirement with only 2N memory words. Note that the output data of [A7], [A8], and [A11] are not in order, and they need “at least” one N-word memory to reorder the output data into in-order output.
(7) The output data of the 3780-point FFT processor proposed in [B1] are not in order. To produce in-order output, it would need a buffer to reorder the output data. Hence, the design would need more than 4N words of memory to achieve continuous data flow and in-order I/O data. In contrast, this invention needs only 2N words of memory.
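For a concrete sense of scale, the memory figures quoted in items (6) and (7) can be evaluated at N = 3780. The sketch below simply multiplies the stated factors by N; factors described in the text as "at least" or "more than" are treated as lower bounds, and the dictionary and function names are hypothetical.

```python
N = 3780  # FFT size for the Chinese DTV application

# Memory size as a multiple of N, as stated in items (6) and (7).
# Entries marked "lower bound" are "at least" / "more than" in the text.
MEMORY_FACTOR = {
    "[A7], [A8]": 3,      # lower bound
    "[A9], [A11]": 5,     # lower bound
    "[A10]": 6,
    "[B1]": 4,            # lower bound ("more than 4N")
    "this invention": 2,
}


def memory_words(n, factor):
    """Memory size in words for an n-point FFT needing factor * n words."""
    return factor * n


def report(n=N):
    """Map each design to its memory size in words for an n-point FFT."""
    return {name: memory_words(n, f) for name, f in MEMORY_FACTOR.items()}


if __name__ == "__main__":
    for name, words in report().items():
        print(f"{name}: {words} words")
```

At N = 3780, 2N corresponds to 7560 words, compared with at least 11340 words for a 3N design and 22680 words for a 6N design.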
Classification of the Prior Art

                [A1]          [A2] [B2] [A5]   [A3]            [A4] [B3]
Radix *1        All general   Fixed radix-r    Fixed radix-r   Radix-2/4
In-place *2     Yes           Yes              Yes             Yes
Memory *3       2N words      2N words         2N words        2N words
Multi-bank *4   No            Yes              No              Yes

*1 The row “Radix” indicates the butterfly sizes that the prior art supports.
*2 The row “In-place” indicates whether the prior art supports the in-place policy to reduce the necessary memory size.
*3 The row “Memory” indicates the memory size that the prior art needs for real-time applications.
*4 The row “Multi-bank” indicates whether the prior art supports a multi-bank memory structure to reduce the necessary clock rate.