1. Field of the Invention
The invention relates to a Fast Fourier Transform (FFT) device and, more particularly, to a single-port RAM-based FFT device.
2. Description of Related Art
FFT is a demodulating method commonly used in high-speed communication systems, and its corresponding modulating method is typically the Inverse Fast Fourier Transform (IFFT). The FFT is derived from the Discrete Fourier Transform (DFT). An N-point DFT can be represented by the following equation:
$$X(k)=\sum_{n=0}^{N-1}x(n)\,W_N^{nk},\quad \text{for } k=0,1,\ldots,N-1,\quad W_N^{nk}=e^{-j\frac{2\pi nk}{N}}.\tag{1}$$
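As an illustration only, equation (1) can be evaluated directly as in the following Python sketch. The function name `dft` is hypothetical and is not part of the described device; the nested loops make visible that a direct evaluation needs on the order of N·N complex multiplications.

```python
import cmath


def dft(x):
    """Direct N-point DFT following equation (1).

    x is a sequence of N real or complex samples x(n).
    Returns the list X(k) for k = 0 .. N-1.
    """
    N = len(x)
    X = []
    for k in range(N):
        acc = 0j
        for n in range(N):
            # W_N^(nk) = exp(-j * 2*pi*n*k / N)
            acc += x[n] * cmath.exp(-2j * cmath.pi * n * k / N)
        X.append(acc)
    return X
```

For instance, a unit impulse `[1, 0, 0, 0]` transforms to four values equal to 1, and a constant input `[1, 1, 1, 1]` transforms to a single non-zero value X(0)=4, up to floating-point rounding.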
However, a direct evaluation of equation (1) has high computational complexity, on the order of N² complex multiplications, and is therefore replaced in practice by the FFT, which has the merit of lower computational complexity. FFT is widely used in high-speed communication systems. Here, a widely marketed broadband access technology, i.e., Asymmetric Digital Subscriber Line (ADSL), is taken as an example of a high-speed communication system. The ADSL technology adopts a Discrete Multi-Tone (DMT) method to perform data modulation/demodulation. The DMT method traditionally divides a communication band into multiple orthogonal sub-channels. A downstream/upstream bandwidth is determined according to the communication quality of each sub-channel. DMT provides a good capability for adaptive data transmission and thus uses the communication band more efficiently. DMT adopts IFFT/FFT to perform data modulation/demodulation.
The circuit structure used to perform a traditional FFT is similar to that used for a traditional IFFT. The traditional IFFT circuit can be obtained by appropriately arranging the traditional FFT circuit in reverse. Since the traditional IFFT circuit is easily obtained by one skilled in the art, the following description focuses on the traditional FFT circuit only.
Various methods have been used to implement the traditional FFT circuit. For example, FIG. 1 shows a block diagram of a memory-based FFT device 10. The device 10 essentially includes a traditional memory 14 and a processor 18. The data processing is completed by using a traditional address controller 20 to generate addresses for the memory 14, to control the functions of the barrel shifters 12 and 16 according to a sequence value q generated by a sequence value generator 22, and to further control the processing of original data and the output of result data. The original data is preferably a complex word containing a real part and an imaginary part.
To reduce the circuit complexity of the traditional FFT device 10, a recursive structure is traditionally used, and therefore only one processor 18 is needed to repeatedly perform the data processing. As a result, the circuit area of the traditional FFT device 10 can be relatively reduced. In addition, the number of data input and output ports of the processor 18, denoted by r, is traditionally a power of 2, i.e., 2, 4, 8, etc. The overall profile of the processor 18 looks like a butterfly and is therefore named the butterfly structure. Such a memory-based structure makes flexible use of the memory 14 because the memory 14 concurrently plays two roles: a data buffer while data is input or output, and a data register while the FFT device 10 is computing. Because the memory 14 is a RAM and can be accessed randomly, a user can appropriately design the address controller 20 to write the data to an appropriate address. Similarly, when the data is output, the address controller 20 can sequentially output the data stored in the memory 14.
To minimize the area of the memory 14 and increase the efficiency, the FFT device 10 adopts an “in-place conflict-free” addressing scheme to raise the utilization of the memory 14 to 100%. The term “in-place” indicates that data before and after being processed are stored at the same memory address. Accordingly, the capacity of the memory 14 is reduced to the minimum. The term “conflict-free” indicates that when the processor 18 accesses data in the memory 14, no bank of the memory 14 is asked to provide two or more data at the same time.
Because the memory 14 is formed of dual-port RAM devices, data can be read from and written to the memory 14 concurrently. In addition, the memory 14 can operate with the data shifting function of the barrel shifter 12 to appropriately shift the sequence of data. For example, the sequence of data output from the processor 18 can be shifted by one word so as to be written to the memory 14 correctly. The serial number of data can be assigned by the user, and is preferably sorted in natural order. Based on the foregoing explanation, the abovementioned shifting can provide a similar function to sort the data in natural order.
According to the equation of the in-place conflict-free addressing, an index n of data to be processed is represented by the following equation:
$$n=n_0\cdot r^{R-1}+n_1\cdot r^{R-2}+\ldots+n_{R-1}\cdot r^{0},\tag{2}$$
A bank index B(n) of the memory 14 is represented by the following equation:
$$B(n)=(n_0+n_1+\ldots+n_{R-1})\bmod r,\tag{3}$$
wherein R is represented by the following equation:
$$R=\log_r N.\tag{4}$$
In addition, an address value A(n) of a cell of a memory bank is represented by the following equation:
$$A(n)=n_1\cdot r^{0}+n_2\cdot r^{1}+\ldots+n_{R-1}\cdot r^{R-2}.\tag{5}$$
For example, if N=64, r=4 and R=3, the data with serial number 41 has the index n=(221)₄ at the input terminal, i.e., n0=2, n1=2 and n2=1, so the bank index B(n)=(2+2+1) mod 4=1 and the address value A(n)=2·1+1·4=6. Accordingly, when the bank index B(n) and the address value A(n) are known, the data can be correctly read from the memory 14 to the input ports of the processor 18 to perform the FFT process, and then the processor 18 can write the processed data to the same memory address in the same memory bank. As shown in FIG. 2, if the amount of data is 64 (N=64) and the data is sorted in natural order, because a dual-port RAM has the feature of random access, the data can be stored in any memory bank in accordance with the principle of the in-place conflict-free addressing. In addition, the processor 18 has four data input ports (n=r=4), and the memory 14 is divided into four memory banks, denoted as bank0, bank1, bank2 and bank3. The cells of each bank can output or input data before or after being processed. Whether before or after being processed, the data is stored at the same memory address. Further, no conflict occurs in concurrently storing and fetching r data as a result of using the in-place conflict-free addressing. In the figure, a circle pattern (O) indicates a processor 18, and the number near the circle pattern indicates the sequence of data processing. The 48 data processing operations are divided into three stages (R=3), stage0, stage1 and stage2. Each stage performs 16 data processing operations, which may be carried out in an arbitrary sequence, but for the purpose of simplifying the design of the address controller 20, a natural sequence is preferable. Next, the address controller 20 outputs a respective address to the memory 14, and outputs read and write shift amounts to the barrel shifters 12 and 16 respectively so as to control the operations thereof. Thus, the shifters 12 and 16 can correctly provide data from the correct memory bank or write data to the correct memory bank.
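The bank index of equation (3) and the address value of equation (5) can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, and the conflict-free check assumes that each butterfly operates on r indices that differ in exactly one base-r digit, which is the usual radix-r grouping and is not spelled out in the text above.

```python
def digits(n, r, R):
    """Base-r digits [n0, n1, ..., n_(R-1)] of n, with n0 most significant (equation (2))."""
    ds = []
    for _ in range(R):
        ds.append(n % r)
        n //= r
    return ds[::-1]


def bank_index(n, r, R):
    """B(n) = (n0 + n1 + ... + n_(R-1)) mod r (equation (3))."""
    return sum(digits(n, r, R)) % r


def address(n, r, R):
    """A(n) = n1*r^0 + n2*r^1 + ... + n_(R-1)*r^(R-2) (equation (5))."""
    return sum(d * r**i for i, d in enumerate(digits(n, r, R)[1:]))


def is_conflict_free(r, R):
    """True if every group of r indices differing in one digit uses r distinct banks.

    Assumption: a butterfly reads r indices that differ only in digit n_pos.
    """
    N = r**R
    for pos in range(R):
        weight = r ** (R - 1 - pos)  # positional weight of digit n_pos
        for base in range(N):
            if digits(base, r, R)[pos] != 0:
                continue  # visit each group once, via its member whose digit is 0
            group = [base + d * weight for d in range(r)]
            if len({bank_index(n, r, R) for n in group}) != r:
                return False
    return True
```

With N=64, r=4 and R=3, `bank_index(41, 4, 3)` and `address(41, 4, 3)` reproduce the values B(n)=1 and A(n)=6 of the example above, and the 64 indices map to 64 distinct (bank, address) pairs, so each of the four banks holds 16 words.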
The relation between the sequence value q and the shift amount is shown in the following equations:
$$q=q_{R-2}\cdot r^{R-2}+q_{R-3}\cdot r^{R-3}+\ldots+q_0,\tag{6}$$
$$\text{Shift Amount}=(q_{R-2}+q_{R-3}+\ldots+q_0)\bmod r.\tag{7}$$
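Equations (6) and (7) amount to summing the R−1 base-r digits of the sequence value q and reducing the sum modulo r. A minimal Python sketch follows; the function name `shift_amount` is hypothetical.

```python
def shift_amount(q, r, R):
    """Shift amount of equation (7): digit sum of q (R-1 base-r digits) modulo r."""
    total = 0
    for _ in range(R - 1):
        total += q % r  # extract the current least significant base-r digit
        q //= r
    return total % r
```

For r=4 and R=3, the sequence values q=0 to 5 give shift amounts 0, 1, 2, 3, 1 and 2.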
As shown in FIG. 3, the read shift amount is the same as the write shift amount, the difference being that the write shift amount is delayed by m clocks from the read shift amount, wherein the m clocks preferably equal the time for performing one data processing operation. To simplify the following description, the address value A(n) of the bank index B(n) is denoted as Bn[A(n)]. In this case, m=4. When q=0, the butterfly structure reads data at memory addresses B0[0], B1[0], B2[0] and B3[0]. Next, when q=1, the butterfly structure reads data at memory addresses B0[1], B1[1], B2[1] and B3[1], which are shifted by one complex word. Next, when q=2, the butterfly structure reads data at memory addresses B0[2], B1[2], B2[2] and B3[2], which are shifted by two complex words. Next, when q=3, the butterfly structure reads data at memory addresses B0[3], B1[3], B2[3] and B3[3], which are shifted by three complex words. Next, when q=4, the butterfly structure starts to write the processed data back to the memory 14. Accordingly, the time required by the FFT device 10 to complete the whole FFT is represented by the following equation:
$$\left(\frac{N}{r}\cdot\log_r N\right)+m.\tag{8}$$
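Equation (8) can be evaluated as in the following Python sketch. The function name `fft_cycles` is hypothetical, and the result is interpreted as a count of clocks on the assumption that one data processing operation is issued per clock, which the text implies but does not state outright.

```python
def fft_cycles(N, r, m):
    """Evaluate equation (8): (N/r) * log_r(N) + m.

    N must be an exact power of r; m is the write delay in clocks.
    """
    R = 0
    size = 1
    while size < N:  # integer computation of R = log_r(N)
        size *= r
        R += 1
    if size != N:
        raise ValueError("N must be a power of r")
    return (N // r) * R + m
```

For N=64, r=4 and m=4, the 48 data processing operations plus the 4-clock write delay give 52.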
However, because of the dual-port RAM, the circuit area of the FFT device 10 is still large and needs to be further reduced to meet miniaturization requirements. Therefore, it is desirable to provide an improved device to mitigate and/or obviate the aforementioned problems.