Turbo code systems employ convolutional codes, which are generated by interleaving data. There are two types of turbo code systems: those that use parallel concatenated convolutional codes and those that use serially concatenated convolutional codes. Data processing systems that employ parallel concatenated convolutional codes decode the codes in several stages. In a first stage, the original data (e.g., a sequence of symbols) are processed; in a second stage, the data obtained by permuting the original sequence of symbols are processed, usually using the same process as in the first stage. The data are processed in parallel, which requires that the data be stored in several memories and accessed in parallel for the respective stage.
However, parallel processing often causes conflicts. If two or more elements or sets of data that are required to be accessed in a given cycle are in the same memory, they are not accessible in parallel. Consequently, the problem becomes one of organizing access to the data so that all required data are in different memories and can be simultaneously accessed in each of the processing stages.
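The conflict condition can be illustrated with a minimal sketch. The function name `has_conflict` and the mapping `mem_of` (global address to memory number) are hypothetical names introduced here for illustration, and the modulo placement of data is an arbitrary assumption, not a scheme taken from this document.

```python
def has_conflict(addresses, mem_of):
    """Return True if two addresses needed in the same cycle share a memory."""
    memories = [mem_of[a] for a in addresses]
    # A single-port memory serves one access per cycle, so any repeated
    # memory number among the requested addresses is a conflict.
    return len(set(memories)) < len(memories)

# Hypothetical placement (assumption): address a is stored in memory a % 4.
mem_of = {a: a % 4 for a in range(16)}

no_clash = has_conflict([0, 1, 2, 3], mem_of)  # four different memories
clash = has_conflict([0, 4, 5, 6], mem_of)     # 0 and 4 both sit in memory 0
print(no_clash, clash)
```

Under this assumed placement the first request can be served in one cycle and the second cannot, which is the situation an organization of the data across memories has to rule out for every row of both tables.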
Consider a one-dimensional array of data, DATA[i]=d_i, where i=0, 1, . . . , NUM−1. Index i is also called a global address. Let two interleaver tables, I_0 and I_1, have the same size, with N rows and p columns. All indices or global addresses 0, 1, . . . , NUM−1 can be written to each of these tables in an order determined by a permutation, one permutation for each table. The process of updating the data is controlled by a processor, whose commands have the form COM=(TABLE, ROW, OPERATION), where TABLE is I_0 or I_1, ROW is a row number, and OPERATION is a read or write operation.
FIG. 1 illustrates an example of interleaver tables I_0 and I_1. A command COM=(I_0, 0, READ) means that row r_0=(25,4,27,41,20) is taken from table I_0, and then data DATA[25], DATA[4], DATA[27], DATA[41], DATA[20] are read from the array DATA. In the case of command COM=(I_1, 3, WRITE), the processor takes the global addresses from row r_3=(12,37,9,32,36) in table I_1 and writes updated data d_new_0, d_new_1, d_new_2, d_new_3, d_new_4 into array DATA at these global addresses. That is, the processor updates (writes) data in the array as follows: DATA[12]=d_new_0, DATA[37]=d_new_1, DATA[9]=d_new_2, DATA[32]=d_new_3, DATA[36]=d_new_4.
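The two commands above can be sketched in a few lines. This is only an illustration: the function name `execute`, the value NUM=45, and the placeholder strings stored in DATA are assumptions, and only the two rows quoted in the text are filled in, since the remaining rows of FIG. 1 are not reproduced here.

```python
NUM = 45  # assumed size of DATA; the text does not state NUM
DATA = [f"d_{i}" for i in range(NUM)]  # placeholder contents, DATA[i] = d_i

# Only the rows quoted in the text are known; other rows are left out.
I_0 = {0: (25, 4, 27, 41, 20)}
I_1 = {3: (12, 37, 9, 32, 36)}

def execute(table, row, operation, new_values=None):
    """Carry out a command COM=(TABLE, ROW, OPERATION) on the array DATA."""
    addresses = table[row]  # global addresses listed in the selected row
    if operation == "READ":
        return [DATA[a] for a in addresses]
    if operation == "WRITE":
        if new_values is None or len(new_values) != len(addresses):
            raise ValueError("WRITE needs one new value per global address")
        for address, value in zip(addresses, new_values):
            DATA[address] = value
        return None
    raise ValueError(f"unknown operation: {operation}")

read_back = execute(I_0, 0, "READ")
execute(I_1, 3, "WRITE",
        ["d_new_0", "d_new_1", "d_new_2", "d_new_3", "d_new_4"])
print(read_back)
print(DATA[12], DATA[37], DATA[9], DATA[32], DATA[36])
```

The sketch treats DATA as one flat array, so it shows what each command means but not how the accesses are made in parallel; that is the role of the memories discussed next.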
During the process of turbo decoding, the processor performs a sequence of commands over data in the array DATA. The aforementioned Andreev et al. application describes a decomposer for parallel decoding using n single port memories MEM_0, . . . , MEM_(n−1), where n is the smallest power of 2 that is greater than or equal to N and N is the number of rows in tables I_0 and I_1. The Andreev et al. technique creates a table F in which each column represents one memory, such as MEM_0, . . . , MEM_7 shown in FIG. 1, and each row represents a memory address addr; each entry holds the global address stored at that address of that memory. Two tables G_0 and G_1, which are the same size as tables I_0 and I_1, contain entries of the form (addr, mem), each of which points to memory MEM_mem and to the address addr within that memory, as depicted in table F.
Consider the processor command COM=(I_0, 0, R). Row number 0, R_0=((0,5), (0,0), (0,3), (0,7), (0,4)), is taken from table G_0, and the processor simultaneously reads

memory MEM_5 at its address 0,
memory MEM_0 at its address 0,
memory MEM_3 at its address 0,
memory MEM_7 at its address 0,
memory MEM_4 at its address 0.     {*}
As shown in table F, MEM_5 (sixth column of table F), address 0 (first row), contains the global index 25; MEM_0, address 0, contains index 4; and so on. Table F thus provides a correspondence between global addresses (array indices) and local addresses (memory addresses). Accordingly, {*} means that the read operation is performed simultaneously on global addresses 25, 4, 27, 41, 20, as required. {*} also shows that, after the memories are read, the global addresses must be matched to the local addresses.
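The lookup through tables G_0 and F can be sketched as follows. The function name `global_addresses` and the list layout chosen for F are assumptions made for illustration. Only the five entries of address 0 quoted in the text are filled in; the other entries of FIG. 1 are left as None rather than guessed.

```python
NUM_MEMS = 8  # MEM_0 .. MEM_7, as in FIG. 1

# F[addr][mem] holds the global address stored in MEM_mem at address addr.
# Known from the text: MEM_0 -> 4, MEM_3 -> 27, MEM_4 -> 20, MEM_5 -> 25,
# MEM_7 -> 41, all at address 0. Unknown entries stay None.
F = {0: [4, None, None, 27, 20, 25, None, 41]}

# Row 0 of table G_0: entries of the form (addr, mem).
G_0 = {0: [(0, 5), (0, 0), (0, 3), (0, 7), (0, 4)]}

def global_addresses(g_row):
    """Translate (addr, mem) entries back to global addresses via table F."""
    return [F[addr][mem] for addr, mem in g_row]

row = G_0[0]
memories_used = [mem for _, mem in row]
# Each memory appears at most once, so the five reads can share one cycle.
conflict_free = len(set(memories_used)) == len(memories_used)

print(global_addresses(row))
print(conflict_free)
```

Translating the row through F recovers the global addresses 25, 4, 27, 41, 20, the same sequence as row r_0 of table I_0, and the five memory numbers are distinct.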
FIG. 2 illustrates a multiplexer 10 that transposes the global addresses or indices from a natural order to a permuted order. In FIG. 2, the index for element 0 of the permutation is read from MEM_5, the index for element 1 is read from MEM_0, and so on. A selection of the required values is thus output from the memories, and multiplexer 10 permutes those values.
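The selection and permutation performed by the multiplexer can be modeled, very loosely, as indexing into the list of memory outputs. This is a behavioral sketch only, not a description of the circuit of FIG. 2; the function name `multiplex` and the placeholder output strings are hypothetical.

```python
def multiplex(memory_outputs, select):
    """Pick the outputs named in `select`, in that (permuted) order."""
    return [memory_outputs[mem] for mem in select]

# Outputs of MEM_0 .. MEM_7 in natural order; placeholder labels stand in
# for the values actually read from each memory.
memory_outputs = [f"out_MEM_{m}" for m in range(8)]

# Memory numbers taken from row 0 of table G_0: element 0 of the permutation
# comes from MEM_5, element 1 from MEM_0, and so on.
select = [5, 0, 3, 7, 4]

permuted = multiplex(memory_outputs, select)
print(permuted)
```

Three of the eight memory outputs are not selected in this cycle, which reflects that a row holds fewer entries than there are memories in this example.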