1. Field of the Invention
The present invention relates to the technical field of signal processing and, more particularly, to a high-speed radix-4 butterfly module and a method of performing Viterbi decoding using the same.
2. Description of Related Art
Convolutional codes are widely used in current digital communication systems to increase data transmission reliability because of their high error-correction capability. Unlike a block code, a convolutional code can increase its error-correction capability by increasing the constraint length, without consuming additional transmission bandwidth.
A Viterbi decoder, which implements the Viterbi algorithm, is a convolutional code decoder widely used in current wireless communication systems. The Viterbi decoder searches a trellis diagram for the path closest to the received sequence and uses that path as the decoding output. FIG. 1 is a block diagram of a typical Viterbi decoder. As shown in FIG. 1, the Viterbi decoder is essentially implemented by three parts: a branch metric unit 10, an add-compare-select (ACS) unit 20 and a traceback unit 30. The branch metric unit 10 computes the branch metric values of each stage, which is the dominant operation in the entire decoder. The add-compare-select unit 20 computes the path metric values for every path and finds a surviving path. When the length of the surviving path reaches a traceback depth L, the traceback unit 30 starts a traceback procedure in order to obtain a decoding output through the selected surviving path.
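The three parts described above can be illustrated with a minimal software sketch. It is not the circuit of FIG. 1: the rate-1/2 code with constraint length 3 and generators (7, 5) octal, the hard-decision Hamming branch metric, the traceback over the whole block instead of a depth L, and all function names are assumptions chosen only for illustration.

```python
# Hypothetical example code: rate 1/2, constraint length K = 3,
# generators (7, 5) in octal. These parameters are assumptions.
G = (0b111, 0b101)
K = 3
N_STATES = 1 << (K - 1)
MASK = N_STATES - 1


def parity(v):
    """Return the XOR of all bits of v."""
    return bin(v).count("1") & 1


def encode(bits):
    """Convolutionally encode a list of bits, starting from state 0."""
    state = 0
    out = []
    for b in bits:
        reg = (state << 1) | b          # newest bit enters at the LSB
        out.append(tuple(parity(reg & g) for g in G))
        state = reg & MASK              # the oldest bit is shifted out
    return out


def viterbi_decode(symbols):
    """Hard-decision Viterbi decoding with traceback over the whole block."""
    inf = float("inf")
    pm = [0] + [inf] * (N_STATES - 1)   # the encoder is known to start in state 0
    history = []                        # surviving predecessor of each state, per stage
    for rx in symbols:
        new_pm = [inf] * N_STATES
        prev = [0] * N_STATES
        for s in range(N_STATES):
            for b in (0, 1):
                reg = (s << 1) | b
                nxt = reg & MASK
                expected = tuple(parity(reg & g) for g in G)
                # Branch metric: Hamming distance to the received symbol.
                bm = sum(e != r for e, r in zip(expected, rx))
                cand = pm[s] + bm       # add
                if cand < new_pm[nxt]:  # compare
                    new_pm[nxt] = cand  # select
                    prev[nxt] = s
        pm = new_pm
        history.append(prev)
    # Traceback: start from the best final state and follow the survivors.
    state = min(range(N_STATES), key=lambda s: pm[s])
    bits = []
    for prev in reversed(history):
        bits.append(state & 1)          # the input bit is the LSB of the destination state
        state = prev[state]
    return bits[::-1]
```

For example, `viterbi_decode(encode(msg))` returns `msg` for a list of bits `msg` when no channel errors are present. A hardware decoder would normally bound the survivor memory with the traceback depth L rather than store the history of the whole block as this sketch does.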
In a Viterbi decoder implementation, the trellis diagram of each stage is typically divided into multiple radix-2 butterfly units. This simplifies the implementation and makes it easy to exploit the symmetric relation between branches to simplify the branch metric computation. It also reduces the hardware required and lends itself to parallel processing, which speeds up the processing of each stage.
FIG. 2 is a schematic diagram of a typical radix-2 butterfly unit. As shown in FIG. 2, a radix-2 butterfly unit includes four states, and each state transition can be expressed by an origin state yx and a destination state xz, where y indicates the bit to be eliminated from the register, z indicates the current input bit, and x indicates the bits common to all states of the radix-2 butterfly unit. In this case, the output word corresponding to the state transition is $b_{yxz}$. Accordingly, the branch symmetry in the radix-2 butterfly unit can be expressed as follows:

$$b_{0x0} = b_{0x1} = b_{1x0} = b_{1x1} \tag{1}$$

Namely, given the symmetry shown in equation (1), the computation of the four branch metric values can be reduced to one.
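The add-compare-select step of one radix-2 butterfly can be sketched as follows, using the yx to xz naming of the transitions. The function name `radix2_butterfly`, the dictionary layout of the branch metrics and the tie-breaking rule are assumptions made for illustration. The sketch takes the four branch metrics as given and does not depend on any particular symmetry between them.

```python
def radix2_butterfly(pm_0x, pm_1x, bm):
    """Add-compare-select for one radix-2 butterfly.

    pm_0x, pm_1x: path metrics of the origin states 0x and 1x.
    bm[(y, z)]:   branch metric of the transition yx -> xz.
    Returns {z: (new_path_metric, surviving_y)} for the destinations x0 and x1.
    """
    out = {}
    for z in (0, 1):
        cand0 = pm_0x + bm[(0, z)]      # add: path arriving from origin state 0x
        cand1 = pm_1x + bm[(1, z)]      # add: path arriving from origin state 1x
        if cand0 <= cand1:              # compare (ties go to 0x, an arbitrary choice)
            out[z] = (cand0, 0)         # select
        else:
            out[z] = (cand1, 1)
    return out
```

Each destination state needs two additions and one two-input comparison, so a butterfly with two destination states needs four additions and two comparisons per stage.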
Because of how the algorithm operates, the Viterbi decoder can produce a decoding output only after at least L stages of butterfly units have been computed. Namely, the decoding output is obtained after an L-stage operation delay. In order to reduce the operation delay required for obtaining the decoding output, the radix-4 butterfly structure is provided to increase the processing speed.
In a radix-4 butterfly structure, each radix-4 butterfly unit is obtained by combining two stages of radix-2 butterfly units into one. FIG. 3 is a schematic diagram of a typical radix-4 butterfly unit. As shown in FIG. 3, because the two stages are combined into one, the delay is reduced from two stages of radix-2 butterfly units to one stage of radix-4 butterfly unit. Accordingly, the L-stage operation delay is reduced to L/2 stages, and the overall decoding output is sped up.
Using the radix-4 butterfly structure speeds up the decoding output of the decoder, but the circuit corresponding to a radix-4 butterfly unit becomes more complex and incurs a higher hardware cost. FIG. 4 is a schematic diagram of an add-compare-select (ACS) unit of a typical radix-4 butterfly unit. As shown in FIG. 4, in addition to requiring more branch metric values to be computed, the radix-4 butterfly unit increases the number of inputs on each comparator to four, which makes a Viterbi decoder built on the radix-4 butterfly structure more costly to implement. Also, the symmetric relation in equation (1) is no longer available.
In prior-art implementations of Viterbi decoders, the symmetric relation between branches in the trellis diagram is used to reduce the number of branch metric computations required by the decoder. However, this branch relation applies only to a radix-2 trellis diagram, not to a radix-4 trellis diagram obtained by combining two stages of radix-2 butterfly units.
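The combination of two radix-2 stages into one radix-4 stage can be checked numerically with a small sketch. The 4-state trellis, the state update rule (shift left, new input bit at the LSB) and the names `radix2_stage` and `radix4_stage` are assumptions for illustration and are not taken from FIG. 3.

```python
N_STATES = 4            # hypothetical trellis with two state bits
MASK = N_STATES - 1


def radix2_stage(pm, bm):
    """One radix-2 stage.

    pm[p]:    path metric of state p.
    bm[p][z]: metric of the branch leaving state p with input bit z.
    """
    new_pm = [float("inf")] * N_STATES
    for p in range(N_STATES):
        for z in (0, 1):
            d = ((p << 1) | z) & MASK               # destination state
            new_pm[d] = min(new_pm[d], pm[p] + bm[p][z])
    return new_pm


def radix4_stage(pm, bm_a, bm_b):
    """Two consecutive stages collapsed into one 4-way add-compare-select."""
    new_pm = []
    for d in range(N_STATES):
        # With two state bits, the destination is formed by the two input bits.
        z1, z2 = d >> 1, d & 1
        cands = []
        for p in range(N_STATES):                   # all four origin states reach d
            mid = ((p << 1) | z1) & MASK            # intermediate state, never stored
            cands.append(pm[p] + bm_a[p][z1] + bm_b[mid][z2])
        new_pm.append(min(cands))                   # one four-input comparison
    return new_pm
```

For any path metrics `pm` and branch metric tables `bm_a` and `bm_b`, `radix4_stage(pm, bm_a, bm_b)` yields the same path metrics as `radix2_stage(radix2_stage(pm, bm_a), bm_b)`. The radix-4 form reaches that result in one step, at the price of two-stage branch metrics and a four-input comparison per state.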
Therefore, it is desirable to provide an improved radix-4 butterfly structure to mitigate and/or obviate the aforementioned problems.