1. Field of the Invention
The present invention relates generally to a hardware arrangement for computing a Fast Fourier Transform, and more specifically to such an arrangement by which data stored in a memory can be addressed effectively.
2. Description of the Related Art
A very fast algorithm for computing a Fourier transform, known as the Fast Fourier Transform (FFT), created a revolution in applications for digital signal processing. The FFT itself is very well known in the art of digital signal processing, and hence details thereof will not be given for the sake of brevity. Merely by way of example, a detailed explanation of the FFT is given in a book entitled "Handbook of Digital Signal Processing", pages 527-558, edited by Douglas F. Elliott and published by Academic Press, Inc.
Before discussing the present invention it is deemed advantageous to briefly describe a known addressing technique for computing the FFT with reference to FIGS. 1-3C.
FIG. 1 is a block diagram schematically showing a hardware arrangement for computing the FFT, while FIG. 2 is a flow diagram for an 8-point DIT (decimation-in-time) FFT. It should be noted that functional blocks which are not directly concerned with the present invention are not shown in FIG. 1, in order to simplify the description.
The arrangement of FIG. 1 includes a program memory 10, an instruction decoder 12, an address generator 14, an arithmetic unit 16, and a data memory 18. A plurality of instructions, which are stored in the memory 10 for computing the FFT, are successively read out therefrom and decoded by the instruction decoder 12. The data memory 18 stores a plurality of data items for computing the FFT and is also arranged to store the results of arithmetic operations.
A first data item for computing the FFT is retrieved from the data memory 18 using an address applied from the address generator 14 via an address bus 20. The first data item obtained from the memory 18 is applied to the arithmetic unit 16 via a data bus 22. Similarly, a second data item is retrieved from the memory 18 and then applied to the arithmetic unit 16. Meanwhile, the arithmetic unit 16 is supplied with an arithmetic instruction from the decoder 12, after which it executes the first operation. The result of the computation is applied, via the data bus 22, to the memory 18 and stored therein. These operations are repeated until a sequence of predetermined operations is completed.
Reference is made to FIG. 2 which is the flow diagram for an 8-point DIT FFT which includes three stages of operations. This flow diagram is well known to those skilled in the art.
As shown in FIG. 2, input data x(0)-x(7), which are arranged in the order x(0), x(4), x(2), x(6), x(1), x(5), x(3), and x(7), undergo the calculations of "addition", "subtraction", and "complex multiplication". The final results of these calculations are depicted by X(0)-X(7) at the rightmost side of FIG. 2. In FIG. 2, each of the notations $W_8^0$, $W_8^1$, $W_8^2$, and $W_8^3$ is a complex number called a twiddle factor, which is multiplied by the result of the preceding operation (viz., addition or subtraction). The resulting pattern of a pair of crossed arrows is known as an "FFT butterfly".
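The butterfly described above can be sketched in a few lines of Python. This is only an illustration: the function names `twiddle` and `butterfly` are hypothetical, and the sign convention exp(-2*pi*j*k/N) for the twiddle factor is the customary one for a forward transform, which the text itself does not state.

```python
import cmath


def twiddle(n, k):
    """Twiddle factor W_n^k, assuming the usual forward-DFT convention."""
    return cmath.exp(-2j * cmath.pi * k / n)


def butterfly(a, b, w):
    """One DIT butterfly: scale the lower input by w, then add and subtract."""
    t = w * b          # complex multiplication
    return a + t, a - t  # addition (upper output), subtraction (lower output)


# Example: a 2-point butterfly with W_8^0 = 1
upper, lower = butterfly(1 + 0j, 1 + 0j, twiddle(8, 0))
```

With both inputs equal to 1 and a twiddle factor of 1, the upper output is the sum (2) and the lower output is the difference (0).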
An important point worth noting about the DIT algorithm (as well as most other FFT algorithms) is that in order to arrange the output sequence (X(0)-X(7)) in natural order, the input sequence should be stored in the required order of x(0), x(4), x(2), x(6), x(1), x(5), x(3), and x(7). The order of the input sequence can be determined in a relatively simple manner by bit reversal. The definition of bit-reversed order in the case shown in FIG. 2 is as follows:
______________________________________
Address               Bit-Reversed
(Binary)    Data      Addresses       Data
______________________________________
000         x(0)      000             x(0)
001         x(1)      100             x(4)
010         x(2)      010             x(2)
011         x(3)      110             x(6)
100         x(4)      001             x(1)
101         x(5)      101             x(5)
110         x(6)      011             x(3)
111         x(7)      111             x(7)
______________________________________
That is, each of the bit-reversed addresses is obtained by reversing the bit order of the corresponding binary address, which for the three-bit addresses used here amounts to exchanging the most significant bit (MSB) with the least significant bit (LSB).
Accordingly, the input data x(0)-x(7) are respectively stored in the memory 18 (FIG. 1) in the order shown in the rightmost column. Thereafter, the memory 18 is addressed in the order shown in the second column from the right (viz., Bit-Reversed Addresses).
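The bit-reversed ordering tabulated above can be reproduced with a short routine. The following is a minimal sketch; the helper name `bit_reverse` is illustrative and is not taken from the arrangement of FIG. 1.

```python
def bit_reverse(index, bits):
    """Return `index` with its lowest `bits` bits written in reverse order."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # append the current LSB
        index >>= 1                           # move on to the next bit
    return result


# Three-bit addresses, as in the 8-point case of FIG. 2
order = [bit_reverse(i, 3) for i in range(8)]
print(order)  # [0, 4, 2, 6, 1, 5, 3, 7]
```

The printed list matches the input order x(0), x(4), x(2), x(6), x(1), x(5), x(3), x(7).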
The pairs of calculations in each of the three stages are different. In general, the distance between the two members of each pair at the m-th stage is $2^{m-1}$ (m is a positive integer). Further, the number of calculation blocks in the m-th stage is $8/2^{m}$.
That is, the first and second stages include four and two calculation blocks, respectively, while the third stage includes one calculation block. Within a stage, the blocks perform identical calculations.
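The stage structure described above (pair distances of 1, 2, and 4, and four, two, and one calculation blocks) can be checked with the following sketch of an in-place DIT FFT. It is a software illustration under stated assumptions, not the hardware arrangement of FIG. 1: the names `stage_layout` and `fft_dit` are hypothetical, and the twiddle factors use the customary forward-transform sign.

```python
import cmath


def bit_reverse(index, bits):
    """Return `index` with its lowest `bits` bits in reverse order."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def stage_layout(n, m):
    """Return (pair distance, number of blocks) for stage m (1-based)."""
    distance = 2 ** (m - 1)
    blocks = n // (2 * distance)
    return distance, blocks


def fft_dit(x):
    """In-place radix-2 DIT FFT on a bit-reversed copy of x (len(x) = 2^p)."""
    n = len(x)
    bits = n.bit_length() - 1
    data = [x[bit_reverse(i, bits)] for i in range(n)]  # bit-reversed input
    for m in range(1, bits + 1):
        distance, blocks = stage_layout(n, m)
        for block in range(blocks):
            base = block * 2 * distance
            for k in range(distance):
                # Twiddle factor for a sub-transform of size 2 * distance
                w = cmath.exp(-2j * cmath.pi * k / (2 * distance))
                a = data[base + k]
                t = w * data[base + k + distance]
                data[base + k] = a + t             # upper butterfly output
                data[base + k + distance] = a - t  # lower butterfly output
    return data


for m in (1, 2, 3):
    print(m, stage_layout(8, m))
```

For n = 8 the loop reports (1, 4), (2, 2), and (4, 1) for stages 1 to 3, and `fft_dit` returns the outputs in natural order.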
In order to compute the FFT, it is current practice to execute the above-mentioned "butterfly" calculations using a pipeline. The pipelined operation includes the following four pipeline stages as shown in FIG. 3:
(1) data retrieval from the memory 18 (depicted by A);
(2) complex multiplication (depicted by B);
(3) butterfly calculation (depicted by C); and
(4) storing of the result in the memory 18 (depicted by D).
According to a known method of calculating the FFT, the address generator 14 should be initialized (depicted by I in FIG. 3) before starting different calculation blocks.
Therefore, the first stage requires 20 time slots, the second stage requires 12 time slots, and the third stage requires 8 time slots. The total amounts to 40 time slots in this particular case.
It is therefore highly desirable to reduce the total number of time slots by omitting the initialization of the address generator 14 before implementing each of the calculation blocks.
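The slot counts quoted above can be reproduced with a simple accounting model. The model is an assumption, since FIG. 3 is not reproduced here: each calculation block is taken to cost one initialization slot (I), and a block containing k butterflies is taken to occupy k + 3 further slots in the four-stage pipeline (A, B, C, D). The names below are hypothetical.

```python
PIPELINE_DEPTH = 4  # pipeline stages A, B, C, D
INIT_SLOTS = 1      # assumed: one slot to initialize the address generator


def stage_slots(n, m):
    """Time slots for stage m (1-based) of an n-point DIT FFT, under the
    assumed model: every block pays INIT_SLOTS, then streams its
    butterflies through the pipeline one per slot."""
    butterflies_per_block = 2 ** (m - 1)
    blocks = n // (2 * butterflies_per_block)
    per_block = INIT_SLOTS + butterflies_per_block + PIPELINE_DEPTH - 1
    return blocks * per_block


slots = [stage_slots(8, m) for m in (1, 2, 3)]
print(slots, sum(slots))  # [20, 12, 8] 40
```

Under this reading, the initialization slots account for seven of the 40 slots (four in the first stage, two in the second, one in the third).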