1. Field of the Invention
The present invention relates to digital signal processing, and more particularly to the mapping of a convolution encoder and a Viterbi decoder onto a dynamically re-configurable two-dimensional single instruction multiple data (SIMD) processor array architecture.
2. Description of Related Art
The field of digital signal processing (DSP) has grown dramatically in recent years and has quickly become a key component in many consumer, communications, medical, and industrial products. DSP technology involves the analyzing and processing of digital data in the form of sequences of ones and zeros. In the field of communications, analog signals are converted to such digital sequences for processing and transmission. During transmission, however, these digital sequences may be easily distorted by noise. In order to address this problem, digital data is often encoded before transmission. One form of encoding, known as convolution encoding, is widely used in digital communication and signal processing to protect transmitted data against noise, and its efficiency is well known in terms of error correction quality. In general, convolution encoding is a coding scheme that associates at least one encoded data element with each source data element to be encoded, this encoded data element being obtained by the modulo-two summation of this source data element with at least one of the previous source data elements. Thus, each encoded symbol is a linear combination of the source data element to be encoded and the previous source data elements.
In FIG. 1A, a schematic diagram of a standard convolution encoder with a code rate of one half is shown. For this type of encoder, two encoding outputs, a(t) and b(t), are transmitted for every input u(t). The encoder comprises two delay elements, 10 and 12, and three exclusive-OR Boolean operators 20, 22, and 24. As illustrated, an input u(t) is connected to a first delay element 10, a first exclusive-OR operator 20, and a second exclusive-OR operator 22. The output u(t−1) of the first delay element 10 is connected to the input of the second delay element 12 and to the second exclusive-OR operator 22. The output u(t−2) of the second delay element 12 is then connected to the first exclusive-OR operator 20 and to the third exclusive-OR operator 24. The encoding outputs, a(t) and b(t), are then respectively taken from the outputs of the first exclusive-OR operator 20 and the third exclusive-OR operator 24. It should be appreciated that there are four possible binary states of the encoder (u(t−1), u(t−2)), including state zero (00), state one (01), state two (10), and state three (11).
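By way of illustration only, the encoder of FIG. 1A may be sketched in Python as follows. The function name `conv_encode` is hypothetical, the delay elements are assumed to start at zero, and the output of the second exclusive-OR operator 22 is assumed to feed the third exclusive-OR operator 24 (an assumption that reproduces the transitions given for FIG. 1B).

```python
def conv_encode(bits):
    """Rate one-half encoder sketch: returns one (a, b) pair per input bit."""
    d1 = d2 = 0  # delay elements 10 and 12, holding u(t-1) and u(t-2); assumed zero at start
    out = []
    for u in bits:
        a = u ^ d2   # first exclusive-OR operator 20
        x = u ^ d1   # second exclusive-OR operator 22
        b = x ^ d2   # third exclusive-OR operator 24 (assumed to be fed by operator 22)
        out.append((a, b))
        d1, d2 = u, d1  # shift the delay line by one instant
    return out


print(conv_encode([1, 1, 0, 1]))  # [(1, 1), (1, 0), (1, 0), (0, 0)]
```

Under these assumptions, a(t) is the modulo-two sum of u(t) and u(t−2), and b(t) is the modulo-two sum of u(t), u(t−1), and u(t−2).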
The encoding process of the described encoder may also be characterized by the finite state machine illustrated in FIG. 1B. In this diagram, each circle is labeled with a binary representation of one of the four binary states of the encoder. In particular, this diagram provides binary representations for state zero 40, state one 44, state two 42, and state three 46. This diagram is further comprised of several arrows representing the respective transition paths taken into each particular state. In this example, a total of eight transition paths 30, 31, 32, 33, 34, 35, 36, and 37 are illustrated. Each transition path also includes an input/output pair (u(t)/a(t), b(t)) uniquely identifying the conditions needed for that particular transition to occur.
For example, beginning at state zero 40, there are two possible transition paths, including path 30 and path 31. Path 30 depicts an input u(t) of zero that produces respective outputs a(t), b(t) of zero, zero (0/00), thereby causing the finite state machine to remain at state zero 40 (or 00). Path 31 depicts an input u(t) of one and respective outputs a(t), b(t) of one, one (1/11), thereby causing the finite state machine to transition to state two 42 (or 10). From state two 42, there are two possible transition paths, including path 32 and path 37. Path 32 depicts an input u(t) of one that produces respective outputs a(t), b(t) of one, zero (1/10), thereby causing the finite state machine to transition to state three 46 (or 11). Path 37 depicts an input u(t) of zero and respective outputs a(t), b(t) of zero, one (0/01), thereby causing the finite state machine to transition to state one 44 (or 01). The remaining transition paths follow in like manner.
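The transitions of such a finite state machine may be tabulated with a short Python sketch. The helper name `step` is hypothetical, and the output equations are inferred from the encoder description rather than read from the figure, so transitions other than paths 30, 31, 32, and 37 should be treated as derived values.

```python
def step(state, u):
    """One state transition.

    state is the pair (u(t-1), u(t-2)); returns (next_state, (a, b)).
    """
    d1, d2 = state
    a = u ^ d2        # a(t) = u(t) xor u(t-2)
    b = u ^ d1 ^ d2   # b(t) = u(t) xor u(t-1) xor u(t-2)
    return (u, d1), (a, b)


# Print every transition in the form "state --input/outputs--> next state".
for d1 in (0, 1):
    for d2 in (0, 1):
        for u in (0, 1):
            nxt, (a, b) = step((d1, d2), u)
            print(f"{d1}{d2} --{u}/{a}{b}--> {nxt[0]}{nxt[1]}")
```

The four transitions described in the text (0/00 and 1/11 from state zero, 1/10 and 0/01 from state two) are reproduced by this table.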
In order to depict how the described encoder evolves over time, a trellis diagram is shown in FIG. 1C. As illustrated, this diagram is comprised of several nodes (denoted by dots) and transition paths (denoted by solid lines). Each column of nodes represents all states at a particular instant. In this particular example, five instants are described (corresponding to t=1 through t=5). Therefore, this trellis diagram can be regarded as illustrating the sequence of all possible state transition paths over five instants (where it is assumed that the initial state is state zero 40). As a result, any given stream of input bits u(t) can be uniquely determined directly from its corresponding sequence of outputs, a(t) and b(t), and information derived from the encoder's trellis diagram. For example, if after four instants the observed noiseless outputs {a1(t)/b1(t), a2(t)/b2(t), a3(t)/b3(t), a4(t)/b4(t)} at a receiver are {11, 10, 10, 00}, then the corresponding input sequence {u1(t), u2(t), u3(t), u4(t)} is {1, 1, 0, 1} according to the trellis diagram shown in FIG. 1C. In this example, it should be clear that the number of decoded input bits is determined directly from the number of instants traced back in a given trellis diagram. In practice, two trace-back approaches are used. In the first approach, the number of instants traced back in a trellis diagram is equal to the total number of bits in the entire bit stream (resulting in the decoding of the entire bit stream at once). In the second approach, a pre-determined number of instants is used, resulting in the decoding of partial bit streams at a time.
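The noiseless case may be sketched in Python as a walk along the trellis. The function name `decode_noiseless` is hypothetical, the initial state is assumed to be state zero, and the output equations are the same inferred equations used for the encoder description.

```python
def decode_noiseless(pairs, state=(0, 0)):
    """Recover u(t) from error-free (a, b) pairs by following the trellis."""
    bits = []
    for pair in pairs:
        for u in (0, 1):
            d1, d2 = state  # current state (u(t-1), u(t-2))
            if (u ^ d2, u ^ d1 ^ d2) == pair:
                bits.append(u)
                state = (u, d1)  # move to the next node of the trellis
                break
        else:
            # Neither branch leaving this state produces the observed pair.
            raise ValueError(f"no transition from state {state} emits {pair}")
    return bits


print(decode_noiseless([(1, 1), (1, 0), (1, 0), (0, 0)]))  # [1, 1, 0, 1]
```

A sequence containing a pair that lies on no branch of the trellis, such as one beginning with 10 from state zero, raises an error in this sketch, which is why a statistical method is needed once noise is present.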
In general, noise will occur during transmission. For example, if the observed output sequence is {10, 10, 10, 00}, the corresponding input sequence is unclear. Thus in practical applications, statistical decoding methods that account for such noise must be implemented. It should be noted that although each transition path 30, 31, 32, 33, 34, 35, 36, and 37 described in FIG. 1B is included in the trellis diagram of FIG. 1C, for simplicity, only transition paths 30 and 31 are labeled.
In the presence of noise, the most commonly used approach to decode convolution codes is via the Viterbi algorithm. In particular, the Viterbi algorithm gives a binary estimation of each input u(t) coded at transmission. This estimation is determined by finding the most likely transition path of a given trellis with respect to the noisy output data (X(t), Y(t)) received by a decoder, which respectively correspond to the originally encoded output data (a(t), b(t)). Each node of the trellis used during decoding contains an information element on the survivor path of the two possible paths ending at that particular node. The basic principle of the Viterbi algorithm consists in considering, at each node, only the most probable path, so as to enable easy trace-back operations on the trellis and hence to determine an a posteriori estimation of the value received several reception instants earlier.
The Viterbi algorithm involves the execution of a particular set of operations. First, a computation is made of the distances, also called branch metrics, between the received noisy output data (X(t), Y(t)) and the symbols (a(t), b(t)) corresponding to the required noiseless outputs of a particular state transition path. In particular, these branch metrics are defined as:
$$\mathrm{Branch}(a_s, b_s) = a_s X_k + b_s Y_k$$
where (a_s, b_s) represent the required noiseless outputs of a particular state transition path and (X_k, Y_k) represent the noisy output received at time k (it should be noted that, in the modulation scheme described herein, zero logic values are replaced by negative ones in the right-hand side of the above formula). For example, suppose a set of incoming data is defined as (X_0, Y_0), which corresponds to a particular output (a_0, b_0) of an encoder for a certain input u_0 with a code rate of one half. If the trellis shown in FIG. 1C is used (where it is assumed that state zero 40 is the initial state), then the procedure begins by calculating branch metrics for state transition paths 30 and 31, which respectively correspond to the transition from state zero 40 to state zero 40 and the transition from state zero 40 to state two 42 at the first instant (t=1). In particular, these two transition paths, 30 and 31, would have the following two branch metrics:
$$\mathrm{Branch}(0, 0) = -X_0 - Y_0$$
$$\mathrm{Branch}(1, 1) = X_0 + Y_0$$
where Branch(0, 0) describes the branch metric needed to transition from state zero 40 to state zero 40 (where a_s=0 and b_s=0), and Branch(1, 1) describes the branch metric needed to transition from state zero 40 to state two 42 (where a_s=1 and b_s=1). A cumulative branch metric is then determined at each node after each instant. In particular, a cumulative branch metric P(s, t) is defined for each node, where s represents the state of the node and t represents the instant, as:
$$P(j, t) = P(i, t-1) + \mathrm{Branch}_{ij}$$
where P(j, t) represents the cumulative branch metric of state j at instant t, P(i, t−1) represents the cumulative branch metric of a state i preceding state j at instant (t−1), and Branch_{ij} represents the branch metric needed to transition from state i to state j. The most likely path M(j, t) coming into state j at time instant t is then defined as:
$$M(j, t) = \max_{i^*} \left[ M(i^*, t-1) + \mathrm{Branch}_{i^* j} \right]$$
where {i*} represents the set of states having transitions into state j. It should be noted that the above formula is only needed when there are two possible state transition paths into a particular node (otherwise, the most likely path into state j, M(j, t), is simply P(j, t)). In the current example, it should thus be clear that this calculation is not needed until the fourth instant (t=4). It should also be noted that, in the current example, it is assumed that all cumulative branch metrics are initially zero. Therefore, P(0, 1) and M(0, 1) are both initialized to zero at the first instant (t=1).
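The branch metric and the selection of the most likely incoming path may be sketched in Python as follows. The function names `branch_metric` and `best_incoming` and the sample values X_0 = 0.5 and Y_0 = 0.75 are hypothetical and chosen only for illustration.

```python
def branch_metric(a_s, b_s, x_k, y_k):
    """Branch(a_s, b_s) = a_s*X_k + b_s*Y_k, with logic zero replaced by -1."""
    sa = 1 if a_s else -1
    sb = 1 if b_s else -1
    return sa * x_k + sb * y_k


def best_incoming(prev_metrics, incoming):
    """M(j, t) = max over i* of [M(i*, t-1) + Branch_{i* j}].

    prev_metrics maps each state i to its metric at instant t-1.
    incoming lists (i, branch metric) for every state i with a transition into j.
    Returns (metric, i) for the surviving path.
    """
    return max((prev_metrics[i] + branch, i) for i, branch in incoming)


x0, y0 = 0.5, 0.75  # hypothetical received values
print(branch_metric(0, 0, x0, y0))  # -1.25, i.e. -X_0 - Y_0
print(branch_metric(1, 1, x0, y0))  # 1.25, i.e. X_0 + Y_0
```

When only one transition enters a node, `best_incoming` reduces to the cumulative metric P(j, t) of that single path.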
In the next instant (t=2), four branch metric calculations are needed. Namely, the following branches are needed:
$$\mathrm{Branch}(0, 0) = -X_0 - Y_0$$
$$\mathrm{Branch}(0, 1) = -X_0 + Y_0$$
$$\mathrm{Branch}(1, 0) = X_0 - Y_0$$
$$\mathrm{Branch}(1, 1) = X_0 + Y_0$$
The cumulative branch metrics corresponding to the two possible paths for each state are then compared in order to determine the paths most likely taken at this particular instant. The selected paths and the cumulative branch metrics of each state are then both stored in memory until the next instant.
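One instant of this add, compare, and select procedure may be sketched in Python as follows. The names `transitions` and `acs_step` are hypothetical, unreachable states are assumed to carry a metric of negative infinity, and the output equations are those inferred from the encoder description.

```python
NEG_INF = float("-inf")


def transitions():
    """Map each state to its incoming (previous state, input bit, (a, b)) branches."""
    table = {}
    for d1 in (0, 1):
        for d2 in (0, 1):
            for u in (0, 1):
                outputs = (u ^ d2, u ^ d1 ^ d2)
                table.setdefault((u, d1), []).append(((d1, d2), u, outputs))
    return table


def acs_step(metrics, x, y):
    """Update every state for one received pair (x, y).

    Returns (new_metrics, survivors); survivors maps each state to the
    selected (previous state, input bit), or (None, None) if unreachable.
    """
    new_metrics, survivors = {}, {}
    for state, preds in transitions().items():
        best = (NEG_INF, None, None)
        for prev, u, (a, b) in preds:
            # Add: cumulative metric of the predecessor plus the branch metric.
            m = metrics[prev] + (1 if a else -1) * x + (1 if b else -1) * y
            if m > best[0]:  # Compare and select the larger metric.
                best = (m, prev, u)
        new_metrics[state] = best[0]
        survivors[state] = (best[1], best[2])
    return new_metrics, survivors


start = {(0, 0): 0.0, (0, 1): NEG_INF, (1, 0): NEG_INF, (1, 1): NEG_INF}
metrics, survivors = acs_step(start, 1.0, 1.0)
print(metrics[(1, 0)], survivors[(1, 0)])  # 2.0 ((0, 0), 1)
```

Both returned tables correspond to the information that is stored in memory until the next instant.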
After a pre-determined number of instants, a trace-back operation is made in order to determine the optimal cumulative path taken. In particular, the path with the largest cumulative path metric is chosen as the optimal path (although some implementations use the smallest cumulative path metric). This optimal path is then used to decode the original coded bit stream of information according to the procedure described earlier for noiseless conditions.
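The complete procedure, including the trace-back operation, may be sketched in Python as follows. This is a minimal illustration under stated assumptions and not a definitive implementation: the name `viterbi_decode` is hypothetical, the encoder is assumed to start in state zero, the entire stream is traced back at once, and received bits are assumed to be hard decisions mapped to +1 and −1.

```python
def viterbi_decode(received):
    """Estimate the input bits from a list of noisy (X, Y) pairs."""
    states = [(0, 0), (0, 1), (1, 0), (1, 1)]  # (u(t-1), u(t-2))
    neg_inf = float("-inf")
    metrics = {s: (0.0 if s == (0, 0) else neg_inf) for s in states}
    history = []  # one survivor table per instant
    for x, y in received:
        new_metrics = {s: neg_inf for s in states}
        survivors = {}
        for d1, d2 in states:
            if metrics[(d1, d2)] == neg_inf:
                continue  # state not reachable yet
            for u in (0, 1):
                a, b = u ^ d2, u ^ d1 ^ d2
                branch = (1 if a else -1) * x + (1 if b else -1) * y
                cand = metrics[(d1, d2)] + branch
                nxt = (u, d1)
                if cand > new_metrics[nxt]:  # keep the larger cumulative metric
                    new_metrics[nxt] = cand
                    survivors[nxt] = ((d1, d2), u)
        metrics = new_metrics
        history.append(survivors)
    # Trace back from the state with the largest cumulative metric.
    state = max(states, key=lambda s: metrics[s])
    bits = []
    for survivors in reversed(history):
        state, u = survivors[state]
        bits.append(u)
    return bits[::-1]


# Received {10, 10, 10, 00}: the noiseless {11, 10, 10, 00} with one bit in error.
noisy = [(1, -1), (1, -1), (1, -1), (-1, -1)]
print(viterbi_decode(noisy))  # [1, 1, 0, 1]
```

In this sketch, the noisy sequence {10, 10, 10, 00} is resolved to the input sequence {1, 1, 0, 1}, because the corresponding trellis path differs from the received data in only one bit.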
The Viterbi algorithm has been implemented in the prior art using either hardware or software systems. Software implementations of the Viterbi algorithm adapted to run on general purpose digital signal processors have the advantage of better flexibility than hardware implementations, since the software can be readily reprogrammed. Conversely, hardware implementations of the Viterbi algorithm using application specific integrated circuits (ASICs) can achieve higher performance than the software implementations in terms of lower power consumption, higher decoding rates, etc., but cannot be easily modified.
It would therefore be advantageous to develop a method and apparatus for convolution encoding and Viterbi decoding that addresses these limitations of known hardware and software implementations. More specifically, it would be advantageous to develop a method and apparatus for convolution encoding and Viterbi decoding that has the flexibility of the software implementations, with the superior performance of the hardware implementations.
A method and apparatus for convolution encoding and Viterbi decoding utilizes a flexible, digital signal processing architecture that comprises a core processor and a plurality of re-configurable processing elements arranged in a two-dimensional array. The present invention therefore enables the convolution encoding and Viterbi decoding functions to be mapped onto this flexible architecture, thereby overcoming the disadvantages of conventional hardware and software solutions.
In an embodiment of the invention, the core processor is operable to configure the re-configurable processing elements to perform data encoding and data decoding functions. A received data input is encoded by configuring one of the re-configurable processing elements to emulate a convolution encoding algorithm and applying the received data input to the convolution encoding algorithm. A received encoded data input is decoded by configuring the plurality of re-configurable processing elements to emulate a Viterbi decoding algorithm wherein the plurality of re-configurable processing elements is configured to accommodate every data state of the convolution encoding algorithm. The core processor initializes the re-configurable processing elements by assigning register values to registers that define parameters such as constraint length and code rate for the convolution encoding algorithm.
More particularly, the encoding function further comprises generating a multiple output sequence corresponding to the received data input. Essentially, the encoding function comprises performing a modulo-two addition of selected taps of a serially time-delayed sequence of the received data input. The decoding function further comprises mapping a trellis diagram onto the plurality of re-configurable processing elements. The re-configurable processing elements calculate cumulative branch metric units for each node of the trellis diagram, and the core processor selects a most probable state transition path of the trellis diagram based on the branch metric units.
A more complete understanding of the method and apparatus for convolution encoding and Viterbi decoding will be afforded to those skilled in the art, as well as a realization of additional advantages and objects thereof, by a consideration of the following detailed description of the preferred embodiment. Reference will be made to the appended sheets of drawings which will first be described briefly.