Turbo codes are a powerful technique for reducing the effects of errors in a noisy communication channel. A turbo encoder at a transmitter inserts redundancy into a communicated signal, and a turbo decoder at a receiver uses the redundancy to correct transmission errors. Turbo decoding is, however, computationally complex. In some communication systems, the computational complexity of the turbo decoder sets an upper limit on the data rate that the system can support. Hence, techniques to improve the speed of turbo decoders are highly desirable.
One approach to increasing the speed of a turbo decoder is to perform parallel decoding. In parallel decoding, a block of data to be decoded is broken into a number of sub-blocks, and a number of sub-decoders operate simultaneously, each sub-decoder decoding a different sub-block of data. By using N sub-decoders to decode each block, the time required to decode a block can be reduced by approximately a factor of N.
Parallel decoding presents a number of difficult challenges. To appreciate the difficulties of parallel decoding, the general encoding and decoding process will first be discussed. Usually, data is turbo coded by using two component encoders and an interleaver. The component encoders convert input data into output data which has redundancy added (that is, for every k input bits, n encoded bits are produced, where generally n > k). For example, a component encoder may implement a block code or a convolutional code. The data encoded by one component encoder is interleaved with respect to the data encoded by the other component encoder, as discussed further below. Interleaving consists of reordering the data in a predefined pattern known to both the transmitter and receiver.
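As an illustration of interleaving as a predefined reordering, the following minimal sketch builds a permutation and applies it in both directions. The function names and the seeded pseudo-random pattern are assumptions made for illustration only; a real system would use whatever interleaver pattern its transmitter and receiver have agreed on.

```python
import random


def make_interleaver(length, seed=0):
    """Return a pseudo-random permutation of range(length).

    The seeded shuffle stands in for a pattern known to both ends
    (an assumption for this sketch, not a standardized interleaver).
    """
    rng = random.Random(seed)
    pi = list(range(length))
    rng.shuffle(pi)
    return pi


def interleave(data, pi):
    # Output position i takes the input symbol at position pi[i].
    return [data[p] for p in pi]


def deinterleave(data, pi):
    # Undo interleave(): the symbol at position i goes back to position pi[i].
    out = [None] * len(data)
    for i, p in enumerate(pi):
        out[p] = data[i]
    return out


block = list("ABCDEFGH")
pi = make_interleaver(len(block), seed=1)
shuffled = interleave(block, pi)
restored = deinterleave(shuffled, pi)
```

Because both ends derive the same permutation, deinterleaving recovers the original order exactly.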
Both parallel and serial turbo code configurations are known in the art. In a parallel configuration, the component encoders operate on the data simultaneously, with a first component encoder operating on the data in sequential order and a second component encoder operating on the data in interleaver order. The outputs of the first and second component encoders are then combined for transmission. In a serial configuration, a first “outer” component encoder operates on the data, the encoded data is interleaved, and a second “inner” component encoder operates on the interleaved encoded data. In either configuration, some of the encoded bits may be deleted (punctured) prior to transmission.
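The parallel configuration can be sketched as follows. This is only an illustrative example under stated assumptions: the component code is assumed to be a memory-2 recursive systematic convolutional code with generators (1, 5/7) in octal, trellis termination is omitted, and the alternating puncturing pattern and all function names are hypothetical choices rather than anything prescribed above.

```python
def rsc_parity(bits):
    """Parity stream of an assumed (1, 5/7) octal recursive systematic encoder."""
    s1 = s2 = 0  # shift-register contents: a[k-1], a[k-2]
    parity = []
    for u in bits:
        a = u ^ s1 ^ s2        # feedback polynomial 1 + D + D^2 (octal 7)
        parity.append(a ^ s2)  # feedforward polynomial 1 + D^2 (octal 5)
        s1, s2 = a, s1
    return parity


def turbo_encode_parallel(bits, pi, puncture=False):
    """Parallel concatenation: systematic bits plus two parity streams.

    The first encoder sees the data in sequential order, the second in
    interleaver order. With puncture=True, parity bits are deleted
    alternately (an assumed pattern), raising the rate from 1/3 to 1/2.
    """
    p1 = rsc_parity(bits)
    p2 = rsc_parity([bits[p] for p in pi])
    out = []
    for k, u in enumerate(bits):
        out.append(u)
        if puncture:
            out.append(p1[k] if k % 2 == 0 else p2[k])
        else:
            out.extend([p1[k], p2[k]])
    return out


data = [1, 0, 1, 1, 0, 0, 1, 0]
pi = [3, 7, 0, 4, 1, 5, 2, 6]  # example interleaver pattern (assumed)
full = turbo_encode_parallel(data, pi)
punctured = turbo_encode_parallel(data, pi, puncture=True)
```

In this sketch the unpunctured output carries three bits per data bit and the punctured output carries two, which shows how puncturing trades redundancy for rate.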
Decoding turbo encoded data is typically performed using an iterative process. Two decoders repeatedly decode the data, each decoder corresponding to one of the component encoders. After the first decoder performs its decoding, the data is deinterleaved and then passed to the second decoder, which performs its decoding. The data is then interleaved and passed back to the first decoder, and the process is repeated. With each iteration, more transmission errors are typically corrected. The number of iterations required to achieve most of the potential error correction capability of the code depends on the particular code used.
A turbo decoder can be implemented by using a single decoder engine, which reads results from and writes results into a memory for each iteration. This memory is sometimes called an interleaver memory, and may be implemented as a ping-pong memory, where previous results are read from one part while new results are written to the other part. The write and read addressing usually differ, corresponding to interleaved or deinterleaved data ordering. Data is read (or written) in sequential order by sequentially incrementing the address into the interleaver memory, and in interleaved order by using permuted addresses. For example, interleaving can be performed by writing results into the memory at a series of discontinuous permuted addresses corresponding to the interleaver pattern. The data can then be read from the interleaver memory using sequential addresses to yield interleaved data for the next decoding iteration. Similarly, deinterleaving can be performed by writing results into the memory at a second series of discontinuous depermuted addresses that place the interleaved data back into sequential order, available for sequential addressing in the next decoding iteration. Interleaving and deinterleaving can also be performed during the read operations, using sequential, permuted, or depermuted addresses, in which case writes can be performed using sequential addresses.
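The write-side addressing described above can be sketched with a small software model. The class name, the four-entry pattern, and the convention that interleaved position i holds sequential element pi[i] are assumptions made for this illustration; under that convention the write addresses used for interleaving are the inverse of pi, and the write addresses used for deinterleaving are pi itself.

```python
class PingPongInterleaverMemory:
    """Toy model of a two-part memory: one part is read, the other written."""

    def __init__(self, size):
        self.halves = [[None] * size, [None] * size]
        self.write_half = 0

    def write(self, addr, value):
        self.halves[self.write_half][addr] = value

    def read(self, addr):
        # Reads always come from the part that is not being written.
        return self.halves[1 - self.write_half][addr]

    def swap(self):
        # Exchange the roles of the two parts between decoding passes.
        self.write_half = 1 - self.write_half


pi = [2, 0, 3, 1]            # assumed interleaver pattern
inv = [0] * len(pi)
for i, p in enumerate(pi):
    inv[p] = i               # inverse pattern: inv[pi[i]] == i

mem = PingPongInterleaverMemory(len(pi))

# Interleave on write: results arrive in sequential order and are stored
# at permuted addresses, so a sequential read returns interleaved data.
results = ["a", "b", "c", "d"]
for k, r in enumerate(results):
    mem.write(inv[k], r)
mem.swap()
interleaved = [mem.read(addr) for addr in range(len(pi))]

# Deinterleave on write: results arrive in interleaved order and are
# stored at depermuted addresses, restoring sequential order.
for i, y in enumerate(interleaved):
    mem.write(pi[i], y)
mem.swap()
restored = [mem.read(addr) for addr in range(len(pi))]
```

In both passes the reads use plain incrementing addresses; only the write addresses jump around the memory.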
Turning to parallel decoding, each sub-decoder accesses the interleaver memory to read and write data. Hence, the interleaver memory is shared among the sub-decoders; the interleaver memory can therefore present a bottleneck in the performance of a parallel decoder. One approach to reduce this bottleneck is to implement the shared interleaver memory as a multiport memory, allowing multiple reads or writes each clock cycle. Multiport memories, however, tend to be more expensive and complex than single-port memories.
Another approach to reduce this bottleneck is to break the interleaver memory into banks, one bank for each sub-block and sub-decoder, where the sub-decoders can access their corresponding banks simultaneously. This approach is well suited to field programmable gate array (FPGA) based decoder designs, as FPGAs have on-chip memory organized into banks, often called block RAMs. This approach, however, can fail for some interleavers. For example, interleaving is typically implemented across the entire block of data, and thus when reading or writing in permuted or depermuted address order, it may be necessary for a sub-decoder to access a bank corresponding to a different sub-decoder. When two sub-decoders attempt to access the same bank at the same time, a conflict results.
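The conflict condition can be made concrete with a short sketch. It assumes that bank j holds the contiguous addresses of sub-block j, that all sub-decoders advance in lockstep with one access per step, and that each bank serves one access per step; the function name and the example patterns are hypothetical.

```python
def count_conflict_steps(pi, n_sub_decoders):
    """Count time steps in which two or more sub-decoders hit the same bank.

    pi[i] is the permuted address accessed for sequential position i.
    Sub-decoder j processes positions j*w .. j*w + w - 1, one per step,
    and address a lives in bank a // w (assumed bank layout).
    """
    k = len(pi)
    if k % n_sub_decoders != 0:
        raise ValueError("block length must be a multiple of the sub-decoder count")
    w = k // n_sub_decoders  # sub-block length, also the bank size
    conflicts = 0
    for t in range(w):
        banks = [pi[j * w + t] // w for j in range(n_sub_decoders)]
        if len(set(banks)) < n_sub_decoders:
            conflicts += 1
    return conflicts


identity = list(range(8))
clashing = [0, 4, 1, 5, 2, 6, 3, 7]

no_conflicts = count_conflict_steps(identity, 2)   # each decoder stays in its own bank
all_conflicts = count_conflict_steps(clashing, 2)  # both decoders target the same bank every step
```

With the second pattern, both sub-decoders are directed to bank 0 on even steps and bank 1 on odd steps, so every one of the four steps is a conflict.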
One prior solution involves reorganizing the interleaver structure as a conflict-free interleaver to ensure that contention for the same bank is avoided. Conflict-free interleavers, however, result in regularities in the interleaving structure that can result in reduced coding gain and susceptibility to jamming. Conflict-free interleavers can be defined for particular degrees of parallelism, but may not be conflict free for all potentially useful degrees of parallelism. Additionally, many communications systems must comply with predefined interleaver structures which are not conflict free.
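To illustrate the dependence on the degree of parallelism, the sketch below tests one interleaver pattern against every degree that divides the block length. It assumes lockstep sub-decoders and one contiguous bank per sub-block; the function names and the eight-entry pattern are hypothetical and chosen only to make the effect visible.

```python
def is_conflict_free(pi, n):
    """True if n lockstep sub-decoders never share a bank in any step."""
    w = len(pi) // n  # sub-block length and bank size (assumed layout)
    return all(
        len({pi[j * w + t] // w for j in range(n)}) == n
        for t in range(w)
    )


def conflict_free_degrees(pi):
    """Map each degree of parallelism dividing len(pi) to a pass/fail flag."""
    k = len(pi)
    return {n: is_conflict_free(pi, n) for n in range(1, k + 1) if k % n == 0}


pi = [0, 2, 1, 3, 4, 6, 5, 7]  # example pattern (assumed)
degrees = conflict_free_degrees(pi)
# degrees == {1: True, 2: True, 4: False, 8: True}
```

The pattern is conflict free with two sub-decoders but not with four, so an interleaver designed for one degree of parallelism cannot be assumed to work at another.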