Digital communication systems have become vital for supporting modern, high-speed data communications. FIG. 1 is a block diagram of a typical digital communication system. A digital source 110 produces binary messages. A channel encoder 120 uses a forward error-correction coding (ECC) scheme to add redundancy to the binary messages, transforming every binary message into an encoded message called a code block. A modulator 130 transforms the code blocks into signals appropriate for transmission over a channel 140. The signals enter the channel 140 and are corrupted by noise 150. A demodulator 160 receives the signals from the channel 140 and converts them into blocks of symbols. A channel decoder 170 exploits the redundancy introduced by the channel encoder 120 to detect, and often correct, any errors introduced by the channel and to restore the original binary messages. A digital sink 180 makes use of the binary messages.
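By way of illustration only, the chain of FIG. 1 may be sketched in a few lines of Python. The sketch assumes a rate-1/3 repetition code as a stand-in for the ECC scheme, BPSK modulation, additive Gaussian noise, and hard-decision demodulation; none of these choices, nor any of the function names, comes from the system described above.

```python
import random
from typing import List


def encode(bits: List[int]) -> List[int]:
    """Channel encoder 120 (assumed): repeat every message bit three times."""
    return [b for b in bits for _ in range(3)]


def modulate(code_bits: List[int]) -> List[float]:
    """Modulator 130 (assumed BPSK): bit 0 -> +1.0, bit 1 -> -1.0."""
    return [1.0 - 2.0 * b for b in code_bits]


def channel(signal: List[float], noise_sigma: float, rng: random.Random) -> List[float]:
    """Channel 140 with noise 150 (assumed additive Gaussian noise)."""
    return [s + rng.gauss(0.0, noise_sigma) for s in signal]


def demodulate(received: List[float]) -> List[int]:
    """Demodulator 160 (assumed hard decision on the sign of each sample)."""
    return [0 if r >= 0.0 else 1 for r in received]


def decode(symbols: List[int]) -> List[int]:
    """Channel decoder 170 (assumed): majority vote over each group of three."""
    return [1 if sum(symbols[i:i + 3]) >= 2 else 0
            for i in range(0, len(symbols), 3)]


def transmit(message: List[int], noise_sigma: float = 0.5, seed: int = 0) -> List[int]:
    """Pass one binary message from the source 110 to the sink 180."""
    rng = random.Random(seed)
    return decode(demodulate(channel(modulate(encode(message)), noise_sigma, rng)))
```

In this toy model the redundancy lets the decoder correct any single flipped symbol within a group of three, which is the role the channel decoder 170 plays in FIG. 1, although practical codes are far more efficient.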
Data processing systems using convolutional codes for ECC are theoretically capable of reaching the Shannon limit, the theoretical limit on signal-to-noise ratio for error-free communication. Before the discovery of turbo codes in 1993, Viterbi decoders were used to decode convolutional codes. However, as ECC requirements increased, the complexity of Viterbi decoders increased exponentially. Consequently, practical systems employing Viterbi decoders to decode convolutional codes were limited to about 3 to 6 dB from the Shannon limit. The introduction of turbo codes allowed the design of practical decoders capable of achieving performance within about 0.7 dB of the Shannon limit, surpassing the performance of Viterbi decoders of similar complexity. Turbo codes therefore offered a significant advantage over prior coding techniques and are extensively used in modern data communication standards, such as 3G, 4G, and IEEE 802.16.
FIG. 2 shows a simplified block diagram of the iterative algorithm for decoding turbo codes. The received code block is divided into three parts: (y0, y1, y2). Vectors (y0, y1) are sent to a first MAP decoder 210, which produces a vector L1ex. The vector L1ex is sent to an interleaver 220, which permutes the vector components to yield a vector L2in. The vector L2in and vectors (y0, y2) are sent (y0 via an interleaver 240) to a second MAP decoder 230, which produces a vector L2ex. The vector L2ex is sent to a deinterleaver 250, which performs the inverse of the transformation performed by the interleaver 220 to yield a vector L1in. The vector L1in and vectors (y0, y1) are sent back to the first MAP decoder 210 to begin another iteration. The iterative process stops after a fixed number of iterations or when one or more stopping criteria are met, yielding a result as shown.
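The data flow of FIG. 2 may be sketched as follows. This is a control-flow illustration only: the MAP decoder itself is passed in as a hypothetical callable `map_decode`, the interleaver is modeled as an index permutation `perm`, the initial a-priori vector is assumed to be all zeros, and the final soft output is assumed to be the sum of the systematic input and the two extrinsic vectors. These are assumptions for illustration and are not taken from the figure.

```python
from typing import Callable, List, Sequence

Vector = List[float]
# Hypothetical MAP decoder signature: (systematic, parity, a-priori) -> extrinsic
MapDecoder = Callable[[Vector, Vector, Vector], Vector]


def interleave(v: Sequence[float], perm: Sequence[int]) -> Vector:
    """Interleavers 220 and 240: output position i takes input position perm[i]."""
    return [v[p] for p in perm]


def deinterleave(v: Sequence[float], perm: Sequence[int]) -> Vector:
    """Deinterleaver 250: exact inverse of interleave() for the same perm."""
    out = [0.0] * len(v)
    for i, p in enumerate(perm):
        out[p] = v[i]
    return out


def turbo_decode_sketch(y0: Vector, y1: Vector, y2: Vector,
                        perm: Sequence[int], map_decode: MapDecoder,
                        max_iterations: int = 8) -> Vector:
    """Run a fixed number of iterations of the loop shown in FIG. 2."""
    n = len(y0)
    l1_in = [0.0] * n                 # assumed: no a-priori information at the start
    l1_ex = [0.0] * n
    y0_perm = interleave(y0, perm)    # y0 reaches decoder 230 via interleaver 240
    for _ in range(max_iterations):
        l1_ex = map_decode(y0, y1, l1_in)        # first MAP decoder 210
        l2_in = interleave(l1_ex, perm)          # interleaver 220
        l2_ex = map_decode(y0_perm, y2, l2_in)   # second MAP decoder 230
        l1_in = deinterleave(l2_ex, perm)        # deinterleaver 250
    # Assumed soft output: systematic input plus both extrinsic contributions.
    return [y0[i] + l1_in[i] + l1_ex[i] for i in range(n)]
```

A stopping criterion, where one is used, would be checked at the end of each pass through the loop; it is omitted here because the criteria are not specified above.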
While details of the above-described iterative decoding algorithm are out of scope of this discussion, some general observations about the algorithm may be made:
- The code blocks may have different lengths. For example, according to 3GPP standards the source message length may vary from 40 to 6144 bits. The channel decoder should efficiently handle a data flow that consists of code blocks of different lengths.
- The total time needed to decode a code block is proportional to the code block length.
- The total size of memory employed by the channel decoder is proportional to the maximum length of the code block that the channel decoder is able to decode.
Modern high-speed data communication systems are designed to support data rates of about 100 Mb/s and above. To support turbo decoding at this speed, conventional channel decoders use several constituent decoding units working in parallel. Therefore, the channel decoder has to distribute the decoding tasks among its constituent decoding units. A channel decoder containing multiple decoding units works as follows:
1. The decoder receives several code blocks.
2. The decoder assigns the code blocks to the decoding units.
3. The decoding units perform the decoding tasks in parallel.
4. The decoder retrieves the decoding results.
5. This process is repeated for further code blocks.
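The assignment in step 2 may be illustrated with a simple load-balancing sketch. The policy shown, which gives the longest remaining code block to the currently least-loaded unit, is an assumption chosen for illustration and is not stated to be the policy of any conventional decoder. The sketch also assumes that decoding time is proportional to code block length, as observed above, so block length serves as the measure of load. The function names are hypothetical.

```python
import heapq
from typing import Dict, List, Sequence


def assign_blocks(block_lengths: Sequence[int], num_units: int) -> Dict[int, List[int]]:
    """Map each decoding unit to the indices of the code blocks it will decode.

    Assumed policy: longest block first, always to the least-loaded unit.
    """
    if num_units < 1:
        raise ValueError("at least one decoding unit is required")
    # Min-heap of (accumulated load, unit id); every unit starts idle.
    heap = [(0, unit) for unit in range(num_units)]
    heapq.heapify(heap)
    assignment: Dict[int, List[int]] = {unit: [] for unit in range(num_units)}
    order = sorted(range(len(block_lengths)),
                   key=lambda i: block_lengths[i], reverse=True)
    for idx in order:
        load, unit = heapq.heappop(heap)
        assignment[unit].append(idx)
        heapq.heappush(heap, (load + block_lengths[idx], unit))
    return assignment


def unit_loads(block_lengths: Sequence[int],
               assignment: Dict[int, List[int]]) -> Dict[int, int]:
    """Total assigned length per unit, a proxy for its decoding time."""
    return {unit: sum(block_lengths[i] for i in idxs)
            for unit, idxs in assignment.items()}
```

Because the units run in parallel in step 3, the batch finishes only when the most heavily loaded unit finishes, so the largest value returned by `unit_loads` approximates the time for the whole batch under the stated assumption.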