In an orthogonal frequency-division multiplexing (OFDM) communication system, the high-rate signal is transformed into a number of orthogonal components for lower-rate processing. This is usually achieved by using a Fast Fourier Transform (FFT) and Inverse Fast Fourier Transform (IFFT) pair, which is widely used in communications systems because it carries out this transformation efficiently.
In the OFDM-based ultra-wideband (UWB) system proposed by the MultiBand OFDM Alliance (MBOA), a 128-point IFFT is used in the transmitter to map 128 frequency-domain complex values to a time-domain OFDM symbol. This is described in the publication by the MultiBand OFDM Alliance (MBOA) Special Interest Group (SIG)/WiMedia Alliance, Inc. (WiMedia), available at www.wimedia.com, entitled “MultiBand OFDM Physical Layer Proposal for IEEE 802.15.3a,” September 2004. On the receiver side of such a conventional system, a 128-point FFT is performed to convert each time-domain OFDM symbol back into 128 frequency-domain complex values.
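The transmitter and receiver mapping described above can be illustrated with a minimal sketch. It assumes a naive O(N^2) discrete Fourier transform in place of a real FFT/IFFT, and it loads all 128 bins with hypothetical QPSK-like values; the function name `dft` and the data are illustrative only and do not reflect the subcarrier allocation of the actual proposal.

```python
import cmath
import random

N = 128  # transform size used in the MBOA proposal


def dft(values, inverse=False):
    """Naive O(N^2) DFT/IDFT, used here only as a stand-in for the FFT/IFFT."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(values[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for m in range(n))
        # The 1/N scaling is placed on the inverse transform (a common convention).
        out.append(acc / n if inverse else acc)
    return out


random.seed(0)
# Assumption: every bin carries a QPSK-like value; a real system reserves
# some bins for pilots, guards, and nulls.
freq_in = [complex(random.choice((-1, 1)), random.choice((-1, 1)))
           for _ in range(N)]

time_symbol = dft(freq_in, inverse=True)  # transmitter: frequency -> time
freq_out = dft(time_symbol)               # receiver: time -> frequency

max_error = max(abs(a - b) for a, b in zip(freq_in, freq_out))
```

In this idealized round trip, with no channel and no noise, the receiver recovers the transmitted values up to floating-point rounding.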
For many applications, the FFT/IFFT processor can adopt a parallel architecture to satisfy the requirements of high throughput and low latency. In a conventional OFDM UWB system, for example, the 128-point FFT/IFFT should be computed within the period of one OFDM symbol, which is 312.5 ns. With a processing clock of 132 MHz, a rate chosen for practical reasons, the 128-point processing should be completed within about 41 clock cycles. In principle, as many parallel processing elements as are needed to reach this speed may be employed. However, this greatly increases the hardware complexity, which is generally not acceptable in practice.
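The cycle budget quoted above follows directly from the symbol period and the clock rate. The short calculation below reproduces it with exact rational arithmetic; the variable names are illustrative, and the final figure is only a lower bound on input parallelism implied by these two numbers, not a description of any particular architecture.

```python
import math
from fractions import Fraction

symbol_period_ns = Fraction(3125, 10)  # 312.5 ns per OFDM symbol
clock_mhz = 132                        # processing clock
fft_points = 128

# ns * MHz = 1e-3 cycles, so divide by 1000 to get clock cycles.
cycles_per_symbol = symbol_period_ns * clock_mhz / 1000
whole_cycles = int(cycles_per_symbol)  # usable whole cycles

# Average number of input points that must be accepted per clock cycle
# for the transform to keep up with the symbol rate.
points_per_cycle = Fraction(fft_points) / cycles_per_symbol
min_parallel_inputs = math.ceil(points_per_cycle)

print(float(cycles_per_symbol), whole_cycles, min_parallel_inputs)
```

The budget works out to 41.25 cycles, so a path that accepts one sample per cycle cannot process 128 points in time; on average more than three points must enter the processor per cycle.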
A fundamental computational element of the FFT is the “butterfly element” which, in its simplest form (radix-2), transforms two complex values into two other complex values. The butterfly element is applied repeatedly in the successive stages of the transform, which together convert the data from the time domain to the frequency domain or vice versa.
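As an illustration of the radix-2 butterfly, the following sketch builds a textbook recursive decimation-in-time FFT from a single butterfly function. The names `butterfly`, `fft_radix2`, and `butterfly_count` are hypothetical, and the recursive software form is chosen for clarity; it is not a model of a hardware pipeline.

```python
import cmath


def butterfly(a, b, twiddle):
    """Radix-2 butterfly: two complex inputs are mapped to two complex outputs."""
    t = b * twiddle
    return a + t, a - t


def fft_radix2(x):
    """Recursive decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)  # twiddle factor for this pair
        out[k], out[k + n // 2] = butterfly(even[k], odd[k], w)
    return out


def butterfly_count(n):
    """Number of radix-2 butterflies in an n-point FFT: (n / 2) * log2(n)."""
    stages = n.bit_length() - 1  # log2(n) for a power of two
    return (n // 2) * stages


spectrum = fft_radix2([1, 1, 1, 1])  # a constant input has energy only in bin 0
```

For a 128-point transform this structure has 7 stages of 64 butterflies each, so `butterfly_count(128)` gives 448 butterfly operations per OFDM symbol.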
Various pipeline techniques have been proposed over the last three decades for achieving real-time FFT/IFFT processing. These include the R2MDC (Radix-2 Multi-path Delay Commutator), R2SDF (Radix-2 Single-path Delay Feedback), R2²SDF (Radix-2² Single-path Delay Feedback), and other Radix-4 based techniques. An overview and comparison of these techniques is given in the publication by S. He and M. Torkelson entitled “Designing pipeline FFT processor for OFDM (de)modulation,” which was published in Proc. IEEE URSI Int. Symp. Signals, Syst., Electron., September 1998, pp. 257-262. In general, the aim behind these techniques is to reduce the implementation complexity as much as possible while maintaining real-time, non-stop processing of the input data sequence. In this context, real-time pipeline processing means that the processing clock rate is equal to the sampling rate. However, in the OFDM UWB system, the sampling rate is prescribed as 528 MHz, which is four times the processing clock rate of 132 MHz. This problem may be solved by combining parallel and pipeline processing so that the integrated processor has the advantages of both architectures. In this way, a good compromise between processing speed and implementation complexity can be achieved.
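The mismatch between the sampling rate and the processing clock can be made concrete with a small sketch. It assumes the simplest possible arrangement, in which consecutive samples are grouped so that each clock cycle delivers one sample to each of several parallel paths; the helper `split_into_lanes` is hypothetical and is not taken from any of the cited architectures.

```python
import math

sample_rate_mhz = 528  # prescribed sampling rate
clock_mhz = 132        # processing clock rate

# Samples that arrive during one processing clock cycle.
lanes = math.ceil(sample_rate_mhz / clock_mhz)


def split_into_lanes(samples, lanes):
    """Group consecutive samples so each clock cycle feeds `lanes` parallel paths."""
    return [samples[i:i + lanes] for i in range(0, len(samples), lanes)]


# Indices 0..127 stand in for the 128 samples of one transform block.
per_cycle_inputs = split_into_lanes(list(range(128)), lanes)
cycles_to_load = len(per_cycle_inputs)
```

Under this assumption four samples enter per cycle, and the 128 samples of one block are loaded in 32 cycles, which fits within the budget of about 41 cycles per symbol.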
A basic parallel-pipeline FFT processor which is pipelined with the R2SDF scheme is described in the publication by E. H. Wold and A. M. Despain entitled “Pipeline and parallel-pipeline FFT processors for VLSI implementation,” published as IEEE Trans. Comput., vol. C-33, no. 5, pp. 414-426, May 1984.
To meet the requirement of very high throughput in the OFDM UWB system, a three-stage parallel pipelined architecture is described in the publication by H.-Y. Liu, et al., entitled “A 480 Mb/s LDPC-COFDM-based UWB baseband transceiver,” published in IEEE ISSCC Dig. Tech. Papers, pp. 444-445, February 2005. In this publication, a four-parallel architecture is adopted to implement the radix-2 FFT algorithm in Stage 1, and a radix-8 FFT algorithm is implemented using two different structures in Stage 2 and Stage 3.
There is a need for an effective and efficient VLSI implementation of the IFFT/FFT processor in an OFDM UWB system. As the OFDM UWB system is targeted to provide very high data rate communication, the IFFT/FFT processor needs to satisfy the requirements of high throughput and low latency while also being economical, area efficient, and low in power consumption.