Conventional multi-processing architectures can process vectors in parallel. Such architectures include vector processors, accelerators and digital signal processors (DSPs). Implementing a turbo decoder (TD) is a demanding task. Turbo decoders typically rely on a high-frequency design, a highly parallel architecture, special address generation units (AGUs) and/or special memory designs. As a result, turbo decoders are often implemented using custom hardware.
The high up-link (UL) bit-rate of LTE-Advanced is enabled by, among other things, the highly parallel architecture that is needed to implement an LTE turbo decoder. This high degree of parallelism has become possible due to the use of quadratic permutation polynomial (QPP) interleavers. With a proper design, a QPP interleaver can access a multibank memory without contention, that is, without memory conflicts that would stall the processing.
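The contention-free property can be checked numerically. The following is a minimal sketch, not a decoder implementation. It assumes the usual QPP form pi(i) = (f1*i + f2*i^2) mod K, a memory layout in which address a is stored in bank a // W with W = K / P, and the coefficients f1 = 263, f2 = 480 for K = 6144, which are believed to match 3GPP TS 36.212 but should be verified against the specification. The function names are illustrative only.

```python
def qpp(i, K, f1, f2):
    """QPP interleaver address: pi(i) = (f1*i + f2*i^2) mod K."""
    return (f1 * i + f2 * i * i) % K


def is_contention_free(K, f1, f2, P):
    """Check that P lock-step processors never share a bank in the same step.

    Assumed layout: the block is cut into P windows of W = K // P entries,
    processor t handles indices t*W .. t*W + W - 1, and address a is stored
    in bank a // W.
    """
    if K % P != 0:
        raise ValueError("P must divide K")
    W = K // P
    for j in range(W):  # step j: every processor works on offset j of its window
        natural = {(j + t * W) // W for t in range(P)}
        interleaved = {qpp(j + t * W, K, f1, f2) // W for t in range(P)}
        # A set smaller than P means two processors hit the same bank.
        if len(natural) != P or len(interleaved) != P:
            return False
    return True


# Assumed coefficients for the largest LTE code block (K = 6144).
K, F1, F2 = 6144, 263, 480
is_permutation = sorted(qpp(i, K, F1, F2) for i in range(K)) == list(range(K))
results = {P: is_contention_free(K, F1, F2, P) for P in (8, 16, 32, 64)}
```

The check passes because pi(j + t*W) leaves the same remainder modulo W for every t when W divides K, so the P distinct interleaved addresses of one step necessarily fall into P different banks.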
The parallel processing architecture, together with parallel access to a contention-free multibank memory, reduces the decoding time and thus increases the bit-rate. In the parallel LTE TD design, P processors concurrently access a multibank (P-bank) memory to read the P systematic information data (s) and to read and write the P a-priori information (λ) data.
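The lock-step access pattern can be illustrated with a toy model. This is a sketch under stated assumptions: a small block (K = 40, P = 4) with QPP coefficients f1 = 3, f2 = 10 assumed for that block size, a bank mapping of address a to bank a // W, and a placeholder update (λ + s) standing in for the real decoder computation. All names are hypothetical.

```python
K, P = 40, 4      # toy block size and degree of parallelism (assumptions)
F1, F2 = 3, 10    # QPP coefficients assumed for K = 40
W = K // P        # entries per bank, and per processor window


def qpp(i):
    """QPP interleaver address for the toy block."""
    return (F1 * i + F2 * i * i) % K


def run_half_iteration(lam_banks, s_banks, interleaved):
    """Run P lock-step processors over a P-bank memory for one half-iteration.

    Returns the largest number of processors that touched one bank in a
    single step (1 means no contention). The lambda update is a placeholder,
    not a real turbo-decoder computation.
    """
    worst = 0
    for j in range(W):                      # one step per window offset
        hits = [0] * P
        for t in range(P):                  # processor t, conceptually in parallel
            idx = t * W + j
            addr = qpp(idx) if interleaved else idx
            bank, offset = divmod(addr, W)
            hits[bank] += 1
            s = s_banks[bank][offset]            # read systematic value
            lam = lam_banks[bank][offset]        # read a-priori value
            lam_banks[bank][offset] = lam + s    # placeholder write-back
        worst = max(worst, max(hits))
    return worst


s_banks = [[1.0] * W for _ in range(P)]
lam_banks = [[0.0] * W for _ in range(P)]
worst_natural = run_half_iteration(lam_banks, s_banks, interleaved=False)
worst_interleaved = run_half_iteration(lam_banks, s_banks, interleaved=True)
```

In this model each bank serves exactly one processor per step in both the natural-order and the interleaved half-iteration, so no access has to be stalled.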
In order to achieve the specified LTE/LTE-Advanced bit-rate, parallel access to a contention-free multibank memory should be designed and implemented for the LTE TD. Moreover, as the degree of parallelism grows (up to 64 for a code block of 6144 bits), the number of AGUs needed for reading and writing s and λ also grows linearly.
There are TD designs in which s is read only in the first half iteration. At the end of each half iteration, the output provided to the next half iteration is s+λ (where s+λ is used for the gamma calculation). Therefore, instead of loading s and λ separately, a single AGU is used to load s+λ.
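The saving in load streams can be sketched as follows. This is an illustrative model only: toy_extrinsic is a placeholder rather than a real MAP computation, interleaving is ignored, and only the first two half iterations are modeled, since the text does not specify how later half iterations form s+λ. All function names are hypothetical.

```python
def toy_extrinsic(gamma_in):
    """Placeholder for the decoder update; NOT a real turbo-decoder computation."""
    return [0.5 * g for g in gamma_in]


def first_half_iteration(s, lam):
    """Load s and lambda separately (two load streams), then store s + new lambda."""
    loads = 0
    gamma_in = []
    for k in range(len(s)):
        gamma_in.append(s[k] + lam[k])   # two loads per symbol: s[k] and lam[k]
        loads += 2
    new_lam = toy_extrinsic(gamma_in)
    # s is still at hand here, so the pre-added value can be written back.
    combined = [s[k] + new_lam[k] for k in range(len(s))]
    return combined, new_lam, loads


def next_half_iteration(combined):
    """Load the pre-added s + lambda through a single load stream."""
    loads = 0
    gamma_in = []
    for value in combined:
        gamma_in.append(value)           # one load per symbol
        loads += 1
    return gamma_in, loads


s = [1.0, -2.0, 0.5, 3.0]
lam0 = [0.0] * len(s)
combined, lam1, loads_first = first_half_iteration(s, lam0)
gamma_in, loads_next = next_half_iteration(combined)
reference = [s[k] + lam1[k] for k in range(len(s))]  # what separate loads would give
```

In this sketch the gamma input obtained from the single combined stream is identical to the one formed from separate s and λ loads, while the number of loads per symbol drops from two to one.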
It would be desirable to provide a multi-processing architecture that implements an LTE turbo decoder (TD).