1. Field of the Invention
The present invention relates generally to an apparatus for modulating data, and in particular, to a modulation apparatus based on orthogonal frequency division multiplexing (hereinafter referred to as “OFDM”) technology or discrete multi-tone (hereinafter referred to as “DMT”) technology.
2. Description of the Related Art
Generally, in a digital data communication system, data is modulated before being transmitted and demodulated after being received. Such modulation and demodulation are performed by a MODEM (modulator-demodulator) whose structure may vary according to its modulation scheme. Typically, modulation schemes used for data communication include code division multiplexing (CDM), frequency division multiplexing (FDM), OFDM, and DMT schemes.
A description will now be made of the OFDM and DMT modulation schemes herein below.
The OFDM scheme has been proposed for high-speed data transmission over a multi-path channel in a wireless communication system. Before the OFDM scheme was proposed, a single carrier transmission scheme was used for data transmission. That is, a wireless communication system using a modulation scheme preceding the OFDM scheme modulates serial transmission data and then transmits each modulated symbol by using the entire frequency band of the channel. The OFDM scheme or the DMT scheme serial-to-parallel-converts modulated data into as many data symbols as the number of subcarriers, and modulates the converted data symbols with the corresponding subcarriers. Such modulation using subcarriers is realized by using a discrete Fourier transform (hereinafter referred to as “DFT”). However, for actual hardware design, modulation using subcarriers is realized by using a fast Fourier transform (hereinafter referred to as “FFT”) algorithm rather than a DFT or inverse discrete Fourier transform (hereinafter referred to as “IDFT”) algorithm, in order to reduce the number of calculations (or operations). A processor for processing the FFT algorithm has a high complexity and requires high-speed calculation when it is applied to an OFDM system. Therefore, such a processor is difficult to realize.
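The serial-to-parallel conversion and subcarrier modulation described above can be illustrated with a minimal Python sketch. It assumes that the transmitter applies an inverse DFT to each block of data symbols and the receiver applies a forward DFT, which is a common arrangement but is not stated in this description. The function names, the block size of 4 subcarriers, and the QPSK-like sample values are hypothetical and chosen only for illustration. The transforms are written in their direct form, which needs on the order of N*N complex multiplications per block; this is the cost that an FFT algorithm reduces.

```python
import cmath

N_SUBCARRIERS = 4  # hypothetical block size, for illustration only


def serial_to_parallel(stream, n):
    """Group a serial symbol stream into blocks of n data symbols."""
    return [stream[i:i + n] for i in range(0, len(stream), n)]


def idft(symbols):
    """Direct-form inverse DFT: symbol k modulates subcarrier k."""
    n = len(symbols)
    return [
        sum(symbols[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def dft(samples):
    """Direct-form forward DFT, used here to recover the data symbols."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# Illustrative serial stream of already-modulated symbols (assumed values).
serial = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j, 1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j]

blocks = serial_to_parallel(serial, N_SUBCARRIERS)
time_blocks = [idft(block) for block in blocks]      # transmitter side
recovered = [dft(samples) for samples in time_blocks]  # receiver side
```

Each entry of `time_blocks` corresponds to one block of time-domain samples, and `recovered` matches `blocks` up to floating-point rounding.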
An FFT processor having a pipeline structure is chiefly used in fields where high-speed calculation is required. However, the pipeline structure requires as many calculators as the number of stages, so an increase in the number of points causes an increase in its hardware size. Therefore, in order to solve the problems associated with an increase in hardware size, processors using a memory structure and a single butterfly calculator have been introduced.
A memory-based FFT processor using a radix-2 FFT algorithm is a typical example of such processors. Since the memory-based FFT processor can apply the radix-2 algorithm to a memory structure, it can minimize the number of multipliers. Therefore, the memory-based FFT processor can be used in realizing a small-sized FFT processor.
However, the memory-based FFT processor using the radix-2 algorithm requires many calculation cycles, which increases the calculation time. Therefore, the memory-based radix-2 FFT processor is not suitable for an OFDM system or DMT system which requires high-speed calculation, and in order to satisfy the high-speed calculation requirement, the memory-based radix-2 FFT processor requires a very high operating frequency. Thus, in the OFDM system or DMT system, a radix-4 algorithm is generally used instead of the radix-2 algorithm. A description will now be made of an existing FFT processor based on the radix-4 algorithm.
FIG. 1 is a block diagram illustrating a radix-4 algorithm-based FFT processor introduced by Amphion Co. Compared with the radix-2 algorithm, the radix-4 algorithm halves the number of stages, and also halves the number of butterfly calculations per stage. Therefore, the radix-4 algorithm requires far fewer butterfly calculations than the radix-2 algorithm. Shown in Table 1 below is the number of calculations of the radix-2 algorithm, the radix-4 algorithm and a mixed-radix algorithm which will be described later, according to an FFT length.
TABLE 1
FFT length    Radix-2    Radix-4    Mixed-radix
256           1,024      256        -
512           2,304      -          640
1,024         5,120      1,280      -
2,048         11,264     -          3,072
4,096         24,576     6,144      -
8,192         53,248     -          14,336
As illustrated in Table 1, the radix-4 algorithm is available for FFT calculation for only FFT lengths of 4^n (where n is an integer), while the radix-2 algorithm is available for FFT calculation for all FFT lengths of 2^n. For example, for an FFT length 256 which is 2^8 (256 = 2^8 = 4^4), both the radix-2 algorithm and the radix-4 algorithm can perform FFT calculation. However, for an FFT length 512 which is 2^9 (512 = 2^9), the radix-4 algorithm cannot perform FFT calculation while the radix-2 algorithm can perform FFT calculation. Therefore, in order to perform FFT calculation for all FFT lengths of 2^n, a mixed-radix algorithm that uses the radix-4 algorithm together with another radix algorithm is required. The last column of Table 1 shows the number of butterfly calculations when a mixed-radix algorithm combining the radix-4 algorithm and the radix-2 algorithm is used. The number of calculations performed by the mixed-radix algorithm of Table 1 is equal to the number of calculations performed by the FFT processor provided by Amphion Co. The FFT processor manufactured by Amphion Co. will now be described with reference to FIG. 1.
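The entries of Table 1 can be reproduced with a short Python sketch. The radix-2 and radix-4 counts follow the standard formulas (N/2 butterflies in each of log2 N stages, and N/4 butterflies in each of log4 N stages). The mixed-radix formula is an assumption inferred from the table values themselves: it counts N/4 calculations for every stage, including the final radix-2 stage. The function name `butterfly_counts` is hypothetical.

```python
def butterfly_counts(n):
    """Return (radix2, radix4, mixed) butterfly counts for an n-point FFT.

    None marks a column that Table 1 leaves blank for that FFT length.
    """
    bits = n.bit_length() - 1
    if n < 4 or n != 1 << bits:
        raise ValueError("FFT length must be a power of 2 and at least 4")

    # Radix-2: log2(n) stages, n/2 butterflies per stage.
    radix2 = (n // 2) * bits

    if bits % 2 == 0:
        # n is a power of 4: log4(n) stages, n/4 butterflies per stage.
        return radix2, (n // 4) * (bits // 2), None

    # n is 2 * 4^m: m radix-4 stages plus one final radix-2 stage.
    # Assumption inferred from Table 1: every stage is counted as n/4.
    stages = (bits + 1) // 2
    return radix2, None, (n // 4) * stages


for length in (256, 512, 1024, 2048, 4096, 8192):
    print(length, butterfly_counts(length))
```

For a length that is a power of 4 the radix-4 count is one quarter of the radix-2 count, which reflects the halving of both the number of stages and the number of butterflies per stage.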
Referring to FIG. 1, the FFT processor using the mixed-radix algorithm performs mixed-radix calculations of radix-4, radix-8 and radix-16 calculations by selectively operating a radix-4 butterfly and a radix-4/radix-2 butterfly. An input/output interface and controller 11 performs FFT calculation on input data X received from the exterior, and outputs FFT calculation result data Y to the exterior of the FFT processor. The input data X and the output data Y of the input/output interface and controller 11 can become an OFDM symbol or a DMT symbol. A memory controller 12 controls address generation for a memory 13 in order to read and write data in calculation and data for FFT calculation received from the input/output interface and controller 11. The memory 13 is realized with a 1024-word dual port memory, and reads or writes data received from the exterior and intermediate data and result data of FFT calculation in an address designated by the memory controller 12.
A butterfly calculator 10 comprises a radix-4 butterfly 14, a rotation factor look-up table (hereinafter referred to as “LUT”) 16, and a complex multiplier 15. The radix-4 butterfly 14 performs the addition and subtraction calculations among the radix-4 butterfly calculations. The rotation factor LUT 16 is a memory table that stores the rotation factors of the data in calculation and outputs a rotation factor value. The complex multiplier 15 performs the complex multiplication among the radix-4 butterfly calculations, and generates the complex multiplication result value. A radix-4/radix-2 selective butterfly 17 selectively performs the final calculation according to the FFT length. For example, when a radix-2 calculation is required for the final calculation according to the FFT length, the radix-2 butterfly is selected to perform the radix-2 calculation. However, when a radix-4 calculation is required for the final calculation, the radix-4 butterfly is selected to perform the radix-4 calculation. As a result, radix-8 calculation or radix-16 calculation can be performed by connecting the entire FFT calculation with the radix-4 butterfly calculation of the butterfly calculator 10. Therefore, the FFT processor includes a multiplexer (MUX) 18 for selecting the radix-4/radix-2 selective butterfly 17 only in the final stage and selecting the radix-4 butterfly calculator 10 in the other stages. The radix-4 algorithm is realized with a butterfly having 4 inputs and 4 outputs. Therefore, the 4 inputs must be read and the 4 outputs must be written within one cycle in order to minimize the number of calculation cycles. In order to access the 4 inputs and 4 outputs within one cycle, a memory must be divided into multiple banks. However, the FFT processor of FIG. 1 does not have a multi-bank structure. Therefore, the FFT processor of FIG. 1 requires many calculation cycles, failing to take advantage of the radix-4 calculation.
FIG. 2 is a block diagram illustrating an FFT processor having a mixed-radix algorithm and a multi-bank structure, introduced by Drey Enterprise Co. As illustrated in FIG. 2, the FFT processor introduced by Drey Enterprise Co. also has a memory structure. In the FFT processor of FIG. 2, while one of two input memories (RAMs) 21 and 22 stores input data from the exterior, the other input RAM is used for FFT calculation. A MUX 23 determines whether it will receive a butterfly input from one of the input RAMs 21 and 22, or receive a butterfly input from one of output RAMs 28 and 29. Radix-2 calculators 26 and 27 each perform radix-2 calculation in a radix-2 calculation stage, and generate the radix-2 calculation result. A MUX 24 multiplexes the radix-2 calculation result values received from the radix-2 calculators 26 and 27 in order to write the radix-2 calculation result values in any one of the input RAMs 21 and 22 or any one of the output RAMs 28 and 29. A radix-2/radix-4 common calculator 25 performs radix-4 calculation in a radix-4 calculation stage, and performs radix-2 calculation in a radix-2 calculation stage. While one of the two output RAMs 28 and 29 is used for FFT calculation, the other RAM outputs FFT calculation result data to the exterior. The structure of FIG. 2 uses a mixed-radix algorithm of the radix-4 and radix-2 algorithms, and also uses a multi-bank memory structure. The use of the multi-bank memory structure contributes to minimization of a calculation clock cycle.
However, the structure of FIG. 2 fails to apply an in-place algorithm that writes a butterfly output in the memory location from which a butterfly input was read. Therefore, the structure of FIG. 2 uses two N-word memories for FFT calculation. That is, for the FFT calculation alone, only two four-bank memories are required. However, in order to perform continuous processing, two more four-bank memories must be used for input and output. Therefore, in FIG. 2, a total of 4 memories are used. A memory is one of the blocks that occupy the most area in an FFT processor. Therefore, an increase in the number of memories causes an increase in the memory complexity, the hardware size and the cost of the FFT processor.
FIG. 3 illustrates a 16-point FFT of an in-place algorithm introduced by L. G. Johnson to minimize memory complexity of the memory structures. The in-place algorithm is used when a memory is divided into multiple banks. For radix-4 butterfly calculation, four data symbols must be simultaneously accessed and four butterfly calculation results must be simultaneously written in the accessed positions. For that purpose, a main memory must be divided into 4 banks, i.e., bank #0, bank #1, bank #2 and bank #3, and appropriate addressing must be performed so that several data symbols are not simultaneously accessed from one bank. FIG. 3 illustrates in-place memory addressing for a 16-point FFT, in which there is provided a structure for performing first to eighth butterfly calculations. In each butterfly calculation, 4 inputs are picked at a time. Here, the 4 inputs are read from different banks. A description will now be made of the first and second butterfly calculations. In the first butterfly calculation, 4 inputs are read from an address 0 of a bank #0, an address 1 of a bank #1, an address 2 of a bank #2, and an address 3 of a bank #3, and the butterfly calculation result is written in the same addresses of the same banks. In the second butterfly calculation, 4 inputs are read from an address 0 of a bank #1, an address 1 of a bank #2, an address 2 of a bank #3, and an address 3 of a bank #0, and the butterfly calculation result is written in the same positions. In FIG. 3, a bank index i indicating a bank in use can be simply calculated by dividing the data input count bits into 2-bit digits and performing modulo-4 addition on those digits. Since the FFT of FIG. 3 is a 16-point FFT, a 4-bit counter is used in order to count 16 data bits. The 4 bits are divided into 2 high bits and 2 low bits, and a bank index is calculated by performing modulo-4 addition on the 2 high bits and the 2 low bits.
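The bank index calculation for the 16-point case can be checked with a short Python sketch. The bank index is the modulo-4 sum of the 2 high bits and the 2 low bits of the 4-bit data count, as described above. Two points are assumptions that are not stated explicitly in the text: the address within a bank is taken to be the 2 high bits, which is consistent with the first and second butterfly calculations described above, and the first four butterflies are taken to use inputs spaced 4 apart while the last four use adjacent inputs. All function names are hypothetical.

```python
def bank_and_address(index):
    """Map a 4-bit data index to (bank, address) for a 16-point FFT."""
    high = index >> 2       # 2 high bits of the data count
    low = index & 0b11      # 2 low bits of the data count
    bank = (high + low) % 4  # modulo-4 addition of the two 2-bit digits
    address = high           # assumption: address within the bank
    return bank, address


def butterfly_inputs(stage, butterfly):
    """Data indices of one radix-4 butterfly (assumed input ordering)."""
    if stage == 0:
        return [butterfly + 4 * j for j in range(4)]  # inputs spaced 4 apart
    return [4 * butterfly + j for j in range(4)]      # adjacent inputs


def is_conflict_free():
    """Check that every butterfly reads its 4 inputs from 4 different banks."""
    for stage in range(2):
        for butterfly in range(4):
            banks = {bank_and_address(i)[0] for i in butterfly_inputs(stage, butterfly)}
            if len(banks) != 4:
                return False
    return True


first = [bank_and_address(i) for i in butterfly_inputs(0, 0)]
second = [bank_and_address(i) for i in butterfly_inputs(0, 1)]
```

Under these assumptions, `first` and `second` list the (bank, address) pairs of the first and second butterfly calculations, and `is_conflict_free()` confirms that no two inputs of any of the eight butterflies share a bank.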
However, the above-mentioned in-place algorithm has been proposed for a fixed-radix system rather than a mixed-radix system. Therefore, the in-place algorithm cannot be applied to the mixed-radix system without modification.
Next, a description will be made of a conventional continuous processing structure. R. Radhouane has proposed a memory-based FFT processor capable of performing continuous processing with only two N-word memories by simultaneously performing input and output in a memory structure. This structure realizes continuous processing by alternately performing DIF (Decimation in Frequency) calculation and DIT (Decimation in Time) calculation, based on the fact that when a radix-2 algorithm performs DIF calculation and DIT calculation, its output and input have a bit reverse characteristic. Shown in Table 2 below is a calculation method of a continuous processing structure using two memories.
TABLE 2
OFDM symbol    Memory #1 state    Memory #1 FFT-I/O mode    Memory #2 state    Memory #2 FFT-I/O mode
0              C                  DIF                       I/O                NAT
1              I/O                BR                        C                  DIF
2              C                  DIT                       I/O                BR
3              I/O                NAT                       C                  DIT
In Table 2, “OFDM symbol” means data corresponding to a length of FFT calculation. For example, in 256-point FFT calculation, one OFDM symbol means 256 data bits. In Table 2, “C” means FFT calculation, and “I/O” means that input/output is performed. Further, “NAT” means that input/output is performed by performing memory addressing in a correct order of addresses 0, 1, 2, 3, . . . , N−1, and “BR” means that memory input/output is performed by bit reverse addressing. In addition, in a 0th OFDM symbol of Table 2, a memory #1 performs calculation by DIF, while a memory #2 performs input/output by performing NAT, i.e., memory addressing in a correct order. Next, in a 1st OFDM symbol, the memory #1 performs input/output by BR, i.e., bit reverse addressing, while the memory #2 performs calculation by DIF. In a 2nd OFDM symbol, the memory #1 performs calculation by DIT, while the memory #2 performs input/output by BR, i.e., bit reverse addressing. Next, in a 3rd OFDM symbol, the memory #1 performs input/output by NAT, i.e., memory addressing in a correct order, while the memory #2 performs calculation by DIT. From the next 4th OFDM symbol, a series of the calculations on the 0th to 3rd OFDM symbols is repeated. In order to perform continuous processing with two memories, while one memory performs calculation, the other memory must be able to simultaneously perform input and output having a sequential order. In this structure, the continuous processing can be performed with only two memories such that the two memories alternately perform input/output and FFT calculation.
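The four-symbol cycle of Table 2 and the two addressing orders can be sketched in Python as follows. The schedule entries are taken directly from Table 2. The function and variable names are hypothetical, and the sketch models only which memory calculates and which address order is used for input/output; it does not perform the FFT calculation itself.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


# One row per OFDM symbol: ((memory #1 state, mode), (memory #2 state, mode)).
SCHEDULE = [
    (("C", "DIF"), ("I/O", "NAT")),
    (("I/O", "BR"), ("C", "DIF")),
    (("C", "DIT"), ("I/O", "BR")),
    (("I/O", "NAT"), ("C", "DIT")),
]


def memory_roles(symbol):
    """Roles of the two memories for a given OFDM symbol (repeats every 4)."""
    return SCHEDULE[symbol % 4]


def io_addresses(mode, n):
    """Address sequence used for input/output on an n-word memory."""
    bits = n.bit_length() - 1
    if mode == "NAT":
        return list(range(n))                          # 0, 1, 2, ..., n-1
    if mode == "BR":
        return [bit_reverse(i, bits) for i in range(n)]  # bit reverse order
    raise ValueError("mode must be 'NAT' or 'BR'")


for symbol in range(8):
    print(symbol, memory_roles(symbol))
```

For example, `io_addresses("BR", 8)` yields the bit reverse order for an 8-word memory, and `memory_roles(4)` returns the same roles as the 0th OFDM symbol.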
While a conventional structure introduced by Alcatel Co. realizes continuous processing by using three memories, the above conventional continuous processing structure can minimize memory complexity by using only two memories.
However, the above continuous processing structure was designed for only the case where a radix-2 algorithm is used. Since the continuous processing structure performs only radix-2 calculation, it disadvantageously requires many calculation cycles and a high operating frequency.