The present invention relates generally to Viterbi decoding systems, and in particular to a system and method for providing flexible, high-speed, and low-power decoding (based on the Viterbi algorithm) of convolutional codes for wireless and other types of communication applications.
Modern society has witnessed a dramatic increase in wireless communications. Wireless technology (e.g., satellite, microwave) has provided a system whereby cellular and other communications have become an ever increasing necessity. In order to satisfy the demand for increased and reliable communications capability, more flexible, powerful, and efficient systems are needed. In particular, forward error correction systems must be improved to satisfy society's need for increased wireless communications.
Forward error correction systems are a necessary component in many of today's communications systems. These systems generally add robustness to communications systems by substantially correcting errors that may occur during transmission and reception of wireless data. This is particularly true for systems which are limited in power and/or bandwidth. Often, convolutional coding is a key part in such forward error correction systems. In general, convolutional coding systems introduce redundancy data into a wireless data transmission so that random errors occurring in the transmission have a high probability of being corrected. Consequently, decoding systems (e.g., a Viterbi decoder) must be in place to decode the convolutionally coded data upon reception of the transmitted data, and thereby reconstruct the actual data transmission.
Referring to prior art FIG. 1, a wireless communications system 10 illustrates a particular challenge presented to a conventional wireless system. A transmitter 20 directs a communications signal 24 to a satellite system 30. The satellite system 30, upon receiving the communications signal 24, then directs a communications signal 24a to a ground base station 32 wherein the signal is processed for the intended destination. At any time during transmission of the communications signals 24 and 24a, noise 34 may corrupt a portion of the transmission (cause an error), thereby causing improper signal reception at the base station 32. If error correction systems were not provided, the signal would likely have to be re-transmitted in order to be properly received at the base station 32. Thus, inefficiencies and increased costs are likely results.
FIG. 2 illustrates a prior art error correction system 40 employing convolutional encoding and Viterbi decoding for increasing the likelihood that transmission signals may be properly communicated despite the presence of noise. Input data 42 (e.g., audio, video, computer data) is input to a convolutional encoder 44. Encoded data is provided as a sequence of data bits 46 (also referred to as encoded symbols), which are composed of actual and redundantly added data, and transmitted over a communications link 48. The communications link 48 may introduce noise into the data transmission, and therefore the transmitted data bits 46 may be corrupted by the time they reach their destination. Each received (and possibly corrupted) data bit 46a may be processed by a Viterbi decoder 50 to provide decoded output data 52. The Viterbi decoder 50 (based upon the Viterbi algorithm, which was first proposed by Andrew Viterbi in 1967) provides a decoding system wherein the input data 42 that was originally transmitted may be determined to a high probability even though noise may have affected some of the transmitted (convolutionally encoded) data 46. In general, the input data 42 may be determined by computing a most likely sequence for the input data 42, which is derived from the received convolutionally encoded data 46a.
Convolutional encoding is performed by convolving (redundantly adding) input data bits 42 via an encoder with one or more previous input bits 42. An example of a conventional rate ½, constraint length 9, convolutional encoder 44 is shown in prior art FIG. 3. Input bits 42 are input to a series of delay elements 60, such as a shift register 44a, that provides outputs X0 through X8 at various points. The outputs X0 through X8 may be combined by XOR functions 62a and 62b to generate an encoded symbol set G0 and G1. The outputs, X0 through X8, which are connected (tapped) to the XOR functions 62a and 62b, will determine an output code sequence of G0 and G1 for a given input data sequence 42. The input to output relationship may be described by a code polynomial for the encoder outputs G0 and G1. For example, for the encoder 44 shown in FIG. 3, the code polynomial is given as:
G0 = X^0 + X^1 + X^3 + X^6 + X^8 = 1 + X^1 + X^3 + X^6 + X^8; and
G1 = X^0 + X^2 + X^3 + X^7 + X^8 = 1 + X^2 + X^3 + X^7 + X^8
Note: Texas Instruments Applications Report SPRA071, Viterbi Decoding Techniques in the TMS 320C54x Family, 1996, provides further details on convolutional encoders and code polynomials and is hereby incorporated by reference in its entirety.
As shown, the encoder 44 of FIG. 3 generates the encoded symbol set, G0 and G1, for every input bit 42. Thus, the encoder has a rate of ½ (1 input bit/2 output bits). The constraint length (K) represents the total span of combinations employed by the encoder, which is a function of the number of delay elements 60. A constraint length K=9 implies there are 2^(9−1)=256 encoder states (the ninth bit is the input bit). These states are represented as state S0 (binary 00000000) to state S255 (binary 11111111).
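The encoder just described can be sketched in a few lines of Python. This is a minimal illustration rather than the circuit of FIG. 3: the tap positions are read directly from the code polynomials above, while the function name `conv_encode`, the convention that X0 is the current input bit, and the all-zero starting register are assumptions made for the example.

```python
# Tap positions read from the code polynomials given for FIG. 3.
G0_TAPS = (0, 1, 3, 6, 8)
G1_TAPS = (0, 2, 3, 7, 8)
K = 9  # constraint length


def conv_encode(bits, taps=(G0_TAPS, G1_TAPS), k=K):
    """Rate 1/len(taps) convolutional encoder (illustrative sketch).

    Assumptions: window[0] is X0 (the current input bit), window[k-1] is the
    oldest bit, and the register starts out filled with zeros.
    Returns one tuple of encoded bits (G0, G1, ...) per input bit.
    """
    window = [0] * k
    symbols = []
    for b in bits:
        window = [b] + window[:-1]  # shift the new bit in, drop the oldest
        # Each output is the XOR (sum modulo 2) of its tapped positions.
        symbols.append(tuple(sum(window[t] for t in g) % 2 for g in taps))
    return symbols


# A single 1 followed by zeros walks through every tap in turn, so the
# G0 and G1 streams reproduce the polynomial coefficients.
impulse = conv_encode([1] + [0] * 8)
g0_stream = [s[0] for s in impulse]
g1_stream = [s[1] for s in impulse]
n_states = 2 ** (K - 1)
```

Feeding an impulse through the sketch is a quick way to check that a set of taps matches the intended polynomials, since each output stream then lists that polynomial's coefficients in order.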
Convolutionally encoded data may be decoded according to the Viterbi algorithm. The basis of the Viterbi algorithm is to decode convolutionally encoded data by employing knowledge (e.g., mimic the encoder) of the possible encoder 44 output state transitions from one given state to the next based on the dependence of a given data state on past input data 42. The allowable state transitions are typically represented by a trellis diagram (similar to a conventional state diagram) which provides possible state paths for a received data sequence based upon the encoding process of the input data 42. The trellis structure is determined by the overall structure and code polynomial configuration of the convolutional encoder 44 described above. The Viterbi algorithm provides a method for minimizing the number of state paths through the trellis by limiting the paths to those with the highest probability of matching the transmitted encoder 44 output sequence with the received data sequence at the decoder.
FIG. 4 is an illustration of a portion of a trellis 66 and depicts a basic Viterbi algorithm butterfly computation. Four possible encoder transitions 70a through 70d from present state nodes 68a and 68b to next state nodes 68c and 68d are illustrated. As shown, two transition paths (branches) leave each present state node 68a and 68b, one to each next state node 68c and 68d, so that two branches enter each next state node. The Viterbi algorithm provides a process by which the most likely of the two possible transition paths into a node may be determined and subsequently selected as a portion of a "survivor" path. For example, branches 70a and 70b provide two possible transition paths to the next state node 68c. Likewise, branches 70c and 70d provide two possible transition paths to the next state node 68d. The transition paths 70a through 70d provide the possible directions to the next most likely states that may be generated by the convolutional encoder 44 as directed by the input bits 42. Once a sequence of survivor paths has been determined (through a plurality of butterfly stages), the most probable data input sequence 42 to the convolutional encoder 44 can be reconstructed, thus decoding the convolutionally encoded data.
The decoder operation generally includes the steps of a branch metric computation, an Add/Compare/Select (ACS) operation, and a traceback operation. The branch metric computation provides a measurement of the likelihood that a given transition path from a present state to a next state is correct. In the branch metric computation, the received data values, typically an 8 or 16 bit digital value representing the magnitude of voltage or current of an input signal, are processed to determine a Euclidean or equivalent distance (see reference noted above for further details) between the received data values and all possible actual data values, uncorrupted by noise, which may result from a state transition from a present state to a next state.
Thus, decoding data signals from a convolutional encoder of rate 1/R with a constraint length of K requires determining a total of 2^R branch metric values for each encoded symbol input to the decoder. As described herein, the set of 2^R branch metric values is defined as the complete branch metric set for a particular received input symbol.
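The complete branch metric set can be illustrated with a short Python sketch. It is only an example under stated assumptions: the function name `branch_metric_set` is hypothetical, the metric is a squared Euclidean distance, and the mapping of bit 0 to +1.0 and bit 1 to −1.0 is an assumed noise-free signal level, not something the text specifies.

```python
from itertools import product


def branch_metric_set(received, levels=(1.0, -1.0)):
    """Return the complete branch metric set for one received symbol.

    received: R soft values (one per encoded bit of a rate 1/R code).
    levels:   assumed ideal values, levels[b] for bit b (hypothetical mapping).
    The result maps each of the 2**R possible bit patterns to its squared
    Euclidean distance from the received values (smaller = more likely).
    """
    r = len(received)
    return {
        pattern: sum((x - levels[b]) ** 2 for x, b in zip(received, pattern))
        for pattern in product((0, 1), repeat=r)
    }


# Rate 1/2 example: R = 2, so the set holds 2**2 = 4 metrics.
metrics = branch_metric_set([0.5, -1.0])
best_pattern = min(metrics, key=metrics.get)
```

For the rate ½ code discussed above the set has four entries, and every branch in the trellis simply looks up the entry matching the symbol that branch would have produced.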
In the next decoder step, previously computed branch metric values for all possible state transitions are processed to determine an "accumulated distance" for each of the paths to the next state. The path with the minimum or maximum distance, depending on the implementation (i.e., the path with maximum probability), is then selected as the survivor path. This is known as the Add/Compare/Select, or ACS, operation. The ACS operation can be broken into two basic operations: an Add operation, or path metric computation, and a Compare/Select operation. The path metric Add operation is the accumulation of present state values (initialized by a user at the start of Viterbi processing and carried forward from state to state) with the branch metric values for a received data input sequence. The Compare/Select operation compares the two values produced by the Add operation to determine the minimum value (or maximum value, depending on the implementation) and stores one or more "traceback bits" to indicate the selected survivor path.
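The ACS operation for one butterfly can be sketched as follows. The code assumes the minimum-distance convention, records a single traceback bit per next state, and resolves ties in favor of the first path; the function and parameter names are hypothetical and chosen only for this illustration.

```python
def acs(pm0, bm0, pm1, bm1):
    """Add/Compare/Select for one next-state node (minimum-distance form).

    pm0, pm1: path metrics of the two present states feeding this node.
    bm0, bm1: branch metrics of the two branches entering this node.
    Returns (survivor path metric, traceback bit), where the traceback bit
    is 0 if the first branch survives and 1 if the second does.
    """
    cand0 = pm0 + bm0  # Add
    cand1 = pm1 + bm1
    if cand1 < cand0:  # Compare (ties keep the first path, by assumption)
        return cand1, 1  # Select
    return cand0, 0


def butterfly(pm_a, pm_b, bm_ac, bm_bc, bm_ad, bm_bd):
    """Update both next-state nodes (c and d) of one butterfly.

    pm_a, pm_b are the present-state path metrics; bm_xy is the branch
    metric of the transition from present state x to next state y.
    """
    node_c = acs(pm_a, bm_ac, pm_b, bm_bc)
    node_d = acs(pm_a, bm_ad, pm_b, bm_bd)
    return node_c, node_d


result = butterfly(0, 3, 2, 0, 1, 4)
```

In terms of FIG. 4, `node_c` corresponds to choosing between branches 70a and 70b, and `node_d` to choosing between branches 70c and 70d.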
The final decoding step is the traceback operation. This step traces the maximum likelihood path through the trellis of state transitions, as determined by the first two steps, and reconstructs the most likely path through the trellis to extract the original data input to the encoder 44.
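The three steps can be seen working together in a compact hard-decision decoder. The sketch below is illustrative only: it uses a small K=3, rate ½ code with polynomials 7 and 5 (octal) rather than the K=9 code of FIG. 3, a Hamming distance as the branch metric, an encoder assumed to start in state 0, and a state convention (most recent bit in the most significant position) chosen for the example. None of these choices come from the text.

```python
def parity_outputs(reg, polys):
    """Encoded bits for a register value: parity of the tapped positions."""
    return tuple(bin(reg & g).count("1") % 2 for g in polys)


def encode(bits, polys, k):
    """Reference encoder using the same state convention as the decoder."""
    state, out = 0, []
    for b in bits:
        reg = (b << (k - 1)) | state  # newest bit sits in the top position
        out.append(parity_outputs(reg, polys))
        state = reg >> 1              # oldest bit is shifted out
    return out


def viterbi_decode(symbols, polys, k):
    """Hard-decision Viterbi decoding: branch metrics, ACS, then traceback."""
    n_states = 1 << (k - 1)
    inf = float("inf")
    pm = [0.0] + [inf] * (n_states - 1)   # assume the encoder starts in state 0
    history = []                          # traceback bits, one list per stage
    for sym in symbols:
        new_pm, tb = [inf] * n_states, [0] * n_states
        for ns in range(n_states):
            b = ns >> (k - 2)             # input bit that leads into state ns
            for low in (0, 1):            # bit shifted out of the predecessor
                ps = ((ns << 1) & (n_states - 1)) | low
                expected = parity_outputs((b << (k - 1)) | ps, polys)
                bm = sum(e != r for e, r in zip(expected, sym))  # Hamming
                if pm[ps] + bm < new_pm[ns]:                     # ACS
                    new_pm[ns], tb[ns] = pm[ps] + bm, low
        pm = new_pm
        history.append(tb)
    # Traceback: start at the best final state and follow the stored bits.
    state = min(range(n_states), key=pm.__getitem__)
    decoded = []
    for tb in reversed(history):
        decoded.append(state >> (k - 2))  # the input bit is the top state bit
        state = ((state << 1) & (n_states - 1)) | tb[state]
    return decoded[::-1]


POLYS, K_SMALL = (0b111, 0b101), 3
message = [1, 0, 1, 1, 0, 0, 1, 0] + [0, 0]   # two tail zeros flush the encoder
received = [list(s) for s in encode(message, POLYS, K_SMALL)]
received[3][0] ^= 1                            # inject a single bit error
recovered = viterbi_decode(received, POLYS, K_SMALL)
```

Here the stored traceback bit is the bit shifted out of the surviving predecessor state, which is exactly what the traceback loop needs to step backward through the trellis one stage at a time.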
Conventionally, digital signal processors (DSPs) have been employed to handle various Viterbi decoding applications. Many DSPs have special instructions specifically designed for the Viterbi decoding algorithm. For example, many of today's cellular phone applications involve DSP solutions. However, when a code such as the code described above (K=9) is employed in conjunction with high data rates (384 kbits/sec to 2 Mbits/sec), high computation rates are generally required. This may require 49×10^6 to 256×10^6 Viterbi ACS operations per second. These computing operations are multiplied even more when multiple voice/data channels are processed by a DSP in a cellular base station, for example. Thus, Viterbi decoding may consume a large portion of the DSP's computational bandwidth. Consequently, higher performance systems are necessary to meet increased computational demands.
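The quoted computation rates can be reproduced with simple arithmetic if one assumes that an "ACS operation" here means one butterfly, i.e., one operation per pair of states per decoded bit. That counting convention is an inference from the numbers and is not stated in the text; the function name below is hypothetical.

```python
def acs_butterflies_per_second(bit_rate, k):
    """Butterfly (two-state ACS) operations per second for a rate 1/n code.

    Assumes one butterfly per pair of trellis states per decoded bit,
    which is an inferred counting convention, not one given in the text.
    """
    states = 2 ** (k - 1)           # 256 states for K = 9
    return bit_rate * (states // 2)  # 128 butterflies per decoded bit


low_rate = acs_butterflies_per_second(384_000, 9)     # about 49 x 10^6
high_rate = acs_butterflies_per_second(2_000_000, 9)  # 256 x 10^6
```

Under this assumption the two results line up with the range of roughly 49×10^6 to 256×10^6 operations per second given above.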
Another challenge faced by conventional decoding systems is the need to decode various forms of convolutional codes. Many decoding systems are hard-wired and/or hard-coded to deal with a particular type of convolutional code. For example, the constraint length K, described above, may vary (e.g., K=9,8,7,6,5, etc.) from one encoding system to the next. Also, the code polynomials mentioned above may vary from system to system, even though the constraint length may remain unchanged. A hard-wired and/or hard-coded decoding system may need to be re-designed in order to meet these different encoding requirements. Various other parameters also may need to be varied in the encoding/decoding process as well. Therefore, it would be desirable for a decoding system to provide a high degree of flexibility in processing various forms of encoded data.
Still another challenge faced by conventional decoding systems is increased power requirements. As data is decoded at higher rates, the computational demands of the decoding system often increase the power requirements of the decoders (e.g., DSPs, processing systems). Many conventional systems require extensive register and memory accesses during the decoding process. This generally increases the power consumed in decoders and generally lowers decoder performance (e.g., speed, reliability).
In view of the above problems associated with conventional decoding systems, it would therefore be desirable to have a Viterbi decoding system and/or method which provides a high degree of flexibility with increased decoding performance and with lower power requirements.
The present invention is directed toward a VLSI architecture for a Viterbi decoder for wireless or other type applications which may operate within a programmable DSP system, and which provides flexibility, low-power, and high data throughput rates. The architecture is intended to provide a cost effective solution for multiple application areas, including cellular basestations and mobile handsets.
The decoder preferably operates on a plurality of common linear (single shift register) convolutional codes of rate 1/n and constraint length K=9 (256 states) or less, and is capable of substantially high throughput rates, such as 2.5 Mbps in the case of K=9. In particular, high data throughput rates are achieved by a cascaded ACS system which operates over several trellis stages simultaneously. Additionally, the cascaded ACS performs a partial pretraceback operation, over multiple trellis stages, during the ACS operation. This increases system throughput by reducing the complexity of a final traceback operation to retrieve decoded output bits and by substantially decreasing the number of memory accesses associated therewith.
The high data throughput rate enables the decoder to handle substantially hundreds of voice channels for next generation cellular basestations. This may greatly reduce the number of DSP processors a system requires and likely lowers system costs of a purely DSP based system. These types of data rates and codes are employed extensively in wireless applications of many varieties from satellite communications to cellular phones.
Since there are variations between particular encoding applications, and within some decoding applications with regard to the exact structure of the Viterbi decoding problem, flexibility in the decoding architecture is provided. In particular, the cascaded ACS system described above may be configured to operate on variable constraint length codes by operating over multiple stages of the trellis for K=9. This is accomplished by operating on a sub-trellis architecture in conjunction with a state metric memory. For the cases of K less than 9, particular ACS stages are bypassed selectively.
The present invention incorporates a high degree of flexibility to enable the decoder to be employed in many variable situations. The decoder flexibility includes variable constraint lengths, user supplied polynomial code coefficients, code rates, and traceback settings such as convergence distance and frame structure.
A DSP interface is provided which is memory mapped to enable high data rate transfers between the decoder of the present invention and a DSP. This greatly reduces the processing burden of the DSP and provides for a more powerful system overall. Significant buffering is also provided within the decoder. The present invention also supports intelligent data transfer and synchronization mechanisms, including various trigger signals such as: execution done, input buffer low, and send/receive block transfer completed.
Additionally, the present invention has been designed to operate at high data rates and to be highly energy efficient (i.e., low power). Low power operations are accomplished by minimizing register operations and memory accesses, and by parallelizing and streamlining particular aspects of the decoding process. For example, the ACS operation described above performs pretraceback operations during the ACS operation. Additionally, memory accesses are reduced by operating over multiple stages of the trellis simultaneously.