Bi-directional digital data transmission systems are continually under development to enable high-speed data communication. A contemporary standard for high-speed data communication is Asymmetric Digital Subscriber Lines (ADSL). Another standard for high-speed data communications is known as Very High Speed Digital Subscriber Lines (VDSL). Like ADSL, VDSL employs Discrete Multi-Tone (DMT) modulation. However, in order to improve the transmission speed of data, the VDSL uses more sub-channels than ADSL.
The DMT process involves Orthogonal Frequency Division Multiplexing (OFDM). At the transmitter, the data to be transmitted is manipulated for each sub-channel in the frequency domain and transformed to time domain data for transmission over the various channels. The reverse operations are performed at the receiver. OFDM is realized through a Discrete Fourier Transform (DFT), which transforms frequency domain signals to time domain signals or time domain signals to frequency domain signals. VDSL has more sub-channels (4096) than ADSL (256). Since the number of sub-channels determines the length of the DFT, VDSL requires more computations than ADSL. The DFT principle is as follows.
$$X[k] = \sum_{n=0}^{N-1} x[n] \cdot W_N^{nk} \quad (n, k = 0, 1, \ldots, N-1), \qquad W[k] = e^{-j\frac{2\pi}{N}k} \qquad \text{[EQUATION 1]}$$
In the above equation, N time domain data are transformed to N frequency domain data. W[k] is referred to as a coefficient or “twiddle” factor. For a given N, the computation requires N*N complex multiplications and N*(N−1) complex additions. The complexity is O(N*N).
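The operation counts stated above can be checked with a minimal sketch that evaluates Equation 1 directly. The function name `dft_direct` and its counters are illustrative assumptions, not part of the described system.

```python
import cmath


def dft_direct(x):
    """Evaluate Equation 1 term by term and count the complex operations.

    Returns (X, multiplications, additions). The name and the counters are
    hypothetical, introduced only for this illustration.
    """
    n_points = len(x)
    mults = 0
    adds = 0
    out = []
    for k in range(n_points):
        # First term of the sum: x[0] * W_N^(0*k)
        acc = x[0] * cmath.exp(-2j * cmath.pi * 0 * k / n_points)
        mults += 1
        for n in range(1, n_points):
            # Remaining terms: one multiplication and one addition each
            acc += x[n] * cmath.exp(-2j * cmath.pi * n * k / n_points)
            mults += 1
            adds += 1
        out.append(acc)
    return out, mults, adds


if __name__ == "__main__":
    _, mults, adds = dft_direct([0.0] * 16)
    print(mults, adds)  # N*N and N*(N-1) for N = 16
```

For N = 16 the counters come to 256 multiplications and 240 additions, matching N*N and N*(N−1).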
To compute the DFT efficiently, Cooley and Tukey proposed the Fast Fourier Transform (FFT) algorithm in the 1960s. Radix-2 FFT and radix-4 FFT are representative forms of the proposed FFT algorithm. The radix-2 FFT principle is as follows.
$$X[k] = \sum_{n=0}^{\frac{N}{2}-1} x[2n] \cdot W_N^{2nk} + \sum_{n=0}^{\frac{N}{2}-1} x[2n+1] \cdot W_N^{(2n+1)k}, \qquad W[k] = e^{-j\frac{2\pi}{N}k} \qquad \text{[EQUATION 2]}$$
As can be seen from Equation 2, a radix-2 FFT processor performs the DFT computation by dividing it into even-indexed and odd-indexed portions.
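The even/odd split can be sketched as a short recursive routine. This is only an illustration under stated assumptions: the name `fft_radix2` is hypothetical, the input length is assumed to be a power of two, and the code relies on two standard identities, namely that each half-sum is itself an N/2-point DFT and that the twiddle factor for index k + N/2 is the negative of the one for index k.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 FFT following the even/odd split of Equation 2.

    Hypothetical helper for illustration; len(x) must be a power of two.
    """
    n_points = len(x)
    if n_points < 1 or n_points & (n_points - 1):
        raise ValueError("length must be a power of two")
    if n_points == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])  # sum over x[2n]
    odd = fft_radix2(x[1::2])   # sum over x[2n+1]
    half = n_points // 2
    out = [0j] * n_points
    for k in range(half):
        # Twiddle factor W_N^k applied to the odd-indexed portion
        t = cmath.exp(-2j * cmath.pi * k / n_points) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t  # W_N^(k + N/2) = -W_N^k
    return out


if __name__ == "__main__":
    print(fft_radix2([1, 0, 0, 0]))
```

Each pair of assignments inside the loop is one butterfly computation of the kind drawn in a radix-2 data flow.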
FIG. 1 shows a data flow according to a 16-point radix-2 FFT computation. Referring to FIG. 1, input data x[15:0] is output as output data X[15:0] through processing stages STAGE1-STAGE4. With the radix-2 FFT, the number of complex multiplications is $\frac{N}{2} \times \log_2 N$ and the number of complex additions is $N \times \log_2 N$. Thus, the complexity of the radix-2 FFT is $O(N \times \log_2 N)$, whereas the complexity of the direct DFT is O(N*N). In the example illustrated in FIG. 1, 32 complex multiplications ($\frac{16}{2} \times \log_2 16$) and 64 complex additions ($16 \times \log_2 16$) are performed.
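To make the difference in operation counts concrete, the following sketch tabulates both formulas for the 16-point example and for the ADSL and VDSL lengths mentioned earlier (256 and 4096). The function names are illustrative assumptions.

```python
def dft_counts(n_points):
    """Complex (multiplications, additions) for the direct DFT."""
    return n_points * n_points, n_points * (n_points - 1)


def radix2_counts(n_points):
    """Complex (multiplications, additions) for the radix-2 FFT.

    Assumes n_points is a power of two, so log2(N) is an exact integer.
    """
    if n_points < 2 or n_points & (n_points - 1):
        raise ValueError("length must be a power of two")
    stages = n_points.bit_length() - 1  # log2(N)
    return (n_points // 2) * stages, n_points * stages


if __name__ == "__main__":
    for n in (16, 256, 4096):
        print(n, dft_counts(n), radix2_counts(n))
```

For N = 4096 the direct DFT needs 16,777,216 complex multiplications, whereas the radix-2 FFT needs 24,576.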
The hardware structure must be considered together with the selection of the FFT algorithm. Depending on the arithmetic unit scheme, FFT structures include single-processor, pipeline, parallel-iterative, and array structures. The choice among these structures may be made on the basis of computation time, hardware size, and power consumption.
FIG. 2 shows a conventional data transform system. Referring to FIG. 2, a conventional data transform system includes a data converter 10 and a memory 20. The data converter 10 performs FFT and IFFT computations. In the FFT computation, time domain data from a time domain interface is transformed to frequency domain data, and then resultant data is output to a frequency domain interface. In the IFFT computation, frequency domain data from the frequency domain interface is transformed to time domain data, and the resultant data is output to the time domain interface. Computed results are stored in the memory 20.
The data converter 10 is formed of a main controller 11, a coefficient table 12, an arithmetic unit 13, a memory interface 14, and a compressor/expander 15. The coefficient table 12 stores a coefficient WN required for butterfly computation, and the arithmetic unit 13 performs FFT or IFFT computation with respect to input data. The memory interface 14 performs interface operations between the arithmetic unit 13 and the data memory 20, between the compressor/expander 15 and the data memory 20, and between a frequency domain interface and the data memory 20. While in an IFFT mode, the compressor/expander 15 expands complex-type data read out from the data memory 20 to real-type data, and outputs resultant data to a time domain interface. In an FFT mode, the compressor/expander 15 compresses real-type data from the time domain interface into complex-type data and outputs resultant data to the data memory 20 via the memory interface 14.
FIG. 3 shows an exemplary single-processor structure that includes a single arithmetic element and performs serial computation. In FIG. 3, memory 20A stores data prior to computation and memory 20B stores data following computation. The memories 20A and 20B form the data memory 20 illustrated in FIG. 2. Referring to FIG. 3, an arithmetic element AE0 has a hardware scheme according to an adopted computation algorithm. For example, it is assumed that symbol ‘t’ represents the time required for the arithmetic element AE0 of a radix-2 FFT algorithm to perform the butterfly computation in FIG. 1. It will take a time of $\left(\frac{N}{2} \times \log_2 N\right) \times t$ to perform the radix-2 FFT computation with respect to N data elements.
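The serial behavior can be modeled in software, as a rough sketch, by an iterative radix-2 FFT in which a single `butterfly` function stands in for the one arithmetic element and is called once per butterfly. The function names, the bit-reversed input ordering, and the call counter are assumptions made for this illustration; they are not taken from FIG. 3.

```python
import cmath


def butterfly(a, b, w):
    """One radix-2 butterfly: the unit of work taking time t."""
    t = w * b
    return a + t, a - t


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of index i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft_serial(x):
    """Iterative radix-2 FFT that invokes `butterfly` one call at a time.

    Returns (X, calls); calls * t models the serial computation time.
    """
    n_points = len(x)
    if n_points < 1 or n_points & (n_points - 1):
        raise ValueError("length must be a power of two")
    bits = n_points.bit_length() - 1
    data = [complex(x[bit_reverse(i, bits)]) for i in range(n_points)]
    calls = 0
    size = 2
    while size <= n_points:  # one pass of this loop per stage
        half = size // 2
        for start in range(0, n_points, size):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / size)
                a, b = data[start + k], data[start + k + half]
                data[start + k], data[start + k + half] = butterfly(a, b, w)
                calls += 1
        size *= 2
    return data, calls


if __name__ == "__main__":
    _, calls = fft_serial(list(range(16)))
    print(calls)  # (16 / 2) * log2(16) butterfly calls
```

With 16 input points the counter reaches 32, so the modeled computation time is 32 × t.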
FIG. 4 shows an exemplary parallel processor structure that has an arithmetic unit 13 composed of k arithmetic elements (k is 0 to N−1) and performs parallel computation. In FIG. 4, memory 20A stores data prior to computation and memory 20B stores data following computation. The memories 20A and 20B form one memory, that is, the data memory 20 in FIG. 2. For example, N/2 arithmetic elements are required to process the butterfly computations of any one stage in parallel in accordance with a radix-2 FFT algorithm. When N=2048, the arithmetic unit 13 requires 1024 arithmetic elements. Practical hardware with that many arithmetic elements requires several million to ten million gates, which makes it difficult to implement. As a result, each stage of the parallel computation is fast, but a large number of arithmetic elements is needed to process the vast amount of data required by VDSL, which increases the consumption of valuable circuit area.
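The trade-off between the two structures can be summarized with a small cost model. It is a sketch only: it assumes each arithmetic element completes one butterfly in one time unit t, that the fully parallel structure finishes a whole stage in a single time unit, and that memory access time is ignored. None of these timing assumptions, nor the function name, come from the figures.

```python
def structure_cost(n_points, parallel):
    """Return (arithmetic_elements, time_in_units_of_t) for a radix-2 FFT.

    Hypothetical model: one butterfly per element per time unit t,
    memory access and control overhead ignored.
    """
    if n_points < 2 or n_points & (n_points - 1):
        raise ValueError("length must be a power of two")
    stages = n_points.bit_length() - 1   # log2(N)
    butterflies_per_stage = n_points // 2
    elements = butterflies_per_stage if parallel else 1
    # Number of sequential passes needed to finish one stage
    passes_per_stage = butterflies_per_stage // elements
    return elements, stages * passes_per_stage


if __name__ == "__main__":
    print(structure_cost(2048, parallel=False))
    print(structure_cost(2048, parallel=True))
```

Under these assumptions, N = 2048 gives 1 element and 11,264 time units for the single-processor structure, against 1024 elements and 11 time units for the parallel structure.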
Accordingly, there is a need for an FFT processor that is fast and occupies a relatively small circuit area.