Turbo codes are a type of forward error correction code with powerful capabilities. These codes are becoming widely used in many applications such as wireless handsets, wireless base stations, hard disk drives, wireless LANs, satellites, and digital television. Turbo codes consist of a concatenation of convolutional codes, connected by an interleaver, with an iterative decoding algorithm. An example of a prior art rate 1/3 parallel-concatenated encoder is shown in FIG. 1. Input data stream 100 (xm) is supplied unmodified to multiplexer 104 at input 106. The two Recursive Systematic Convolutional (RSC) encoders 102 and 103 function in parallel to transform their respective input bit streams. After transformation by RSC encoders 102 and 103, the resulting bit streams are supplied to multiplexer 104 at inputs 107 and 108, respectively. Block 101 is an interleaver (I) which randomly re-arranges the information bits to decorrelate the noise for the decoder. RSC encoders 102 and 103 generate respective p0m and p1m bit streams. Multiplexer 104 reassembles these xm, p0m and p1m bit streams into a resulting output bit stream 105 (x0, p00 and p10 . . . ).
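The structure of FIG. 1 can be illustrated with a minimal sketch. The generator polynomials (feedback 1 + D^2 + D^3, feedforward 1 + D + D^3), the seeded pseudo-random permutation, and the function names are assumptions chosen for illustration and are not taken from the text. Tail bits are omitted for brevity.

```python
import random

def rsc_parity(bits):
    """Eight-state RSC encoder; returns the parity stream only.

    Assumed polynomials (not from the text): feedback 1 + D^2 + D^3,
    feedforward 1 + D + D^3.
    """
    s1 = s2 = s3 = 0  # shift register, starting in the zero state
    parity = []
    for x in bits:
        fb = x ^ s2 ^ s3             # recursive feedback bit
        parity.append(fb ^ s1 ^ s3)  # feedforward taps 1 + D + D^3
        s1, s2, s3 = fb, s1, s2      # shift the register
    return parity

def turbo_encode(bits, permutation):
    """Rate 1/3 parallel concatenation: systematic, parity 0, parity 1."""
    p0 = rsc_parity(bits)                               # encoder 102
    p1 = rsc_parity([bits[j] for j in permutation])     # interleaver 101, encoder 103
    out = []
    for x, a, b in zip(bits, p0, p1):                   # multiplexer 104
        out.extend((x, a, b))
    return out

data = [1, 0, 1, 1, 0, 0, 1, 0]
perm = list(range(len(data)))
random.Random(0).shuffle(perm)  # hypothetical interleaver pattern
codeword = turbo_encode(data, perm)
```

Every third output bit, starting with the first, is the unmodified input bit, which is what makes the code systematic.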
FIG. 2 illustrates a functional block diagram of a prior art turbo decoder 200. Iterative turbo decoder 200 generates soft decisions from a pair of maximum-a-posteriori (MAP) blocks 202 and 203. Each iteration requires the execution of two MAP decodes to generate two sets of extrinsic information. The first MAP decoder 202 uses the non-interleaved data as its input and the second MAP decoder 203 uses the interleaved data from the interleaver block 201 as its input. The MAP decoders 202 and 203 compute the extrinsic information as:
$$W_n = \log \frac{\Pr(x_n = 1 \mid R_1^n)}{\Pr(x_n = 0 \mid R_1^n)} \qquad [1]$$
where: $R_1^n = (R_0, R_1, \ldots, R_n)$, which are the received symbols.
MAP decoders 202 and 203 also compute the a posteriori probabilities:
$$\Pr(x_n = i \mid R_1^n) = \frac{1}{\Pr(R_1^n)} \sum \Pr(x_n = i, S_n = m', S_{n-1} = m) \qquad [2]$$
where: $S_n$ is the state at time n in the trellis of the constituent convolutional code.
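The terms in the summation can be expressed in the form
$$\Pr(x_n = i, S_n = m', S_{n-1} = m) = \alpha_{n-1}(m)\,\gamma_n^i(m, m')\,\beta_n(m') \qquad [3]$$
where the quantity
$$\gamma_n^i(m, m') = \Pr(S_n = m', x_n = i, R_n \mid S_{n-1} = m) \qquad [4]$$
is called the branch metric, the quantity
$$\alpha_n(m') = \Pr(S_n = m', R_1^n) \qquad [5]$$
is called the forward (or alpha) state metric, and the quantity
$$\beta_n(m') = \Pr(R_{n+1}^{N} \mid S_n = m') \qquad [6]$$
is called the backward (or beta) state metric, with N denoting the index of the last received symbol in the frame.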
The branch metric depends upon the systematic, parity, and extrinsic symbols. The extrinsic symbols for each MAP decoder are supplied to the other MAP decoder at inputs 209 and 210. De-interleaver 204 receives the output W1 of MAP decoder 203 and supplies input 209 to MAP decoder 202. Interleaver 205 receives the output W0 of MAP decoder 202 and supplies the input 210 to MAP decoder 203. The alpha and beta state metrics are computed recursively by forward and backward recursions given by:
$$\alpha_n(m') = \sum_{m,\,i} \alpha_{n-1}(m)\,\gamma_n^i(m, m') \qquad [7]$$
and
$$\beta_{n-1}(m) = \sum_{m',\,i} \beta_n(m')\,\gamma_n^i(m, m') \qquad [8]$$
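The recursions of equations [7] and [8], together with the log ratio of equation [1], can be sketched on a toy trellis. This is an illustrative sketch only: the two-state trellis, the branch metric values, the dictionary layout of `gammas`, and the uniform initialization of the final beta metrics are all assumptions, and a practical decoder would work with normalized or log-domain metrics.

```python
import math

def forward_backward(gammas, num_states):
    """gammas[k] maps (m, m_next, i) to the branch metric of trellis step k + 1."""
    n_steps = len(gammas)
    alpha = [[0.0] * num_states for _ in range(n_steps + 1)]
    beta = [[0.0] * num_states for _ in range(n_steps + 1)]
    alpha[0][0] = 1.0                    # the encoder starts in the zero state
    beta[n_steps] = [1.0] * num_states   # assumption: end state unknown

    # Forward recursion, equation [7].
    for k, g in enumerate(gammas):
        for (m, m_next, _i), value in g.items():
            alpha[k + 1][m_next] += alpha[k][m] * value

    # Backward recursion, equation [8].
    for k in range(n_steps - 1, -1, -1):
        for (m, m_next, _i), value in gammas[k].items():
            beta[k][m] += beta[k + 1][m_next] * value

    # Log ratio of equation [1], using the product form of equation [3].
    llr = []
    for k, g in enumerate(gammas):
        num = sum(alpha[k][m] * v * beta[k + 1][m2]
                  for (m, m2, i), v in g.items() if i == 1)
        den = sum(alpha[k][m] * v * beta[k + 1][m2]
                  for (m, m2, i), v in g.items() if i == 0)
        llr.append(math.log(num / den))
    return alpha, beta, llr

# Hypothetical two-state trellis: next state = m XOR i, metric depends on i only.
weights = [(0.2, 0.8), (0.7, 0.3), (0.5, 0.5)]
toy_gammas = [
    {(m, m ^ i, i): w[i] for m in range(2) for i in range(2)}
    for w in weights
]
alpha, beta, llr = forward_backward(toy_gammas, num_states=2)
```

In this toy case the branch metric ignores the state, so each log ratio reduces to the log of the ratio of the two weights for that step. This gives a simple check on the recursions.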
Adder 206 adds the non-interleaved input data, W0 from MAP decoder 202 and input 209 from de-interleaver 204. The slicer 207 receives the output of adder 206 and completes the re-assembling of the output bit stream 208 (x0, x1 . . . xn−1).
FIG. 3 illustrates a block diagram of a prior art MAP decoder. The subscripts r and f represent the direction, reverse and forward, respectively, of the sequence of the data inputs for the recursive blocks beta and alpha. Input bit streams 310 to 312 are labeled as parameters Xn,r, Pn,r and An,r, respectively. Input bit streams 313 to 315 are labeled as parameters Xn,f, Pn,f and An,f, respectively. The feedback stream from alpha state metric block 302 is labeled αn,f. The feedback stream from beta state metric block 303 is labeled βn,r. Both the alpha state metric block 302 and beta state metric block 303 calculate state metrics. Both start at a known location in the trellis, the zero state. The encoder starts the block of n information bits (for example, n=5114, the frame size) at the zero state and after n cycles through the trellis ends at some unknown state.
Without sliding windows, the beta state metric memory would have to hold n×s×d=327,296 bits for the whole frame. For n=5114 this corresponds to s×d=64 bits per trellis step, consistent with s=8 states and d=8 bits per state metric. With sliding windows, the processing involves only r×s×d=8192 bits, where r is 128. The memory requirement is thus reduced by a factor of roughly 40.
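The two figures can be reproduced with a short calculation. The values s=8 and d=8 are inferred from the stated totals rather than given explicitly, and the function name is hypothetical.

```python
def state_metric_bits(symbols, states=8, bits_per_metric=8):
    """Bits needed to store one state metric per state for each symbol.

    states and bits_per_metric default to the assumed s = 8 and d = 8.
    """
    return symbols * states * bits_per_metric

full_frame = state_metric_bits(5114)  # n x s x d, no sliding windows
windowed = state_metric_bits(128)     # r x s x d, one sliding window
```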
A number of tail bits t are appended to the encoder data stream to force the encoder back to the zero state. For a code of constraint length k, t=k−1 systematic tail bits are appended for each RSC encoder. For an eight state code, k=4 and t=3, which is assumed for the remainder of this description. Alpha state metric block 302 will process the received data from 0 to n+2 and beta state metric block 303 will process the data from n+2 to 0.
The beta state metrics are generated first by beta state metric block 303. These beta metrics are generated in reverse order and stored in the beta state metric RAM 304. Next, the alpha state metrics are generated by alpha state metric block 302. The alpha state metrics are not stored because extrinsic block 305 uses this data as soon as it is generated.
The beta state metrics are read in a forward order at the same time as the alpha state metrics are generated. Extrinsic block 305 uses both the alpha and beta state metrics in a forward order to generate the extrinsic outputs 306 Wn,i. This implementation requires a large main memory RAM supplying the a-priori inputs 310 to 315. The main memory size is computed as listed in Table 1.
TABLE 1

Main Memory    Size (Number of Bits)
X0             5120 × 8 = 40,960
P0             5120 × 8 = 40,960
P1             5120 × 8 = 40,960
A0             5120 × 8 = 40,960
A1             5120 × 8 = 40,960
I              5120 × 13 = 66,560
SX             176 × 45 × 4 = 31,680
P2             2560 × 8 = 20,480
P3             2560 × 8 = 20,480
Total          344,000 bits
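The total in Table 1 can be checked by summing the individual entries. The dictionary below simply restates the table, and the variable names are illustrative.

```python
# Main memory entries from Table 1, in bits.
main_memory = {
    "X0": 5120 * 8,
    "P0": 5120 * 8,
    "P1": 5120 * 8,
    "A0": 5120 * 8,
    "A1": 5120 * 8,
    "I": 5120 * 13,
    "SX": 176 * 45 * 4,
    "P2": 2560 * 8,
    "P3": 2560 * 8,
}
total_bits = sum(main_memory.values())
```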
The size of the beta state metric memory can also be reduced by using the sliding block implementation. The block of size n is broken into smaller pieces of size r, as shown in FIG. 4. Each smaller block of size r, called the reliability size, can be processed independently of the others by adding a prolog section of size p to each block of r.
The sliding window block is shown in FIG. 5. The reliability size 501 is r. The prolog size 502 is p and is usually equal to four to six times the constraint length. Upon setting all the state metrics to zero and then executing the prolog, the resulting state metric has a high probability of being in the correct state. This block has a size of r+p. The size of the beta state metric memory will drop to r×8×d. Note that the state metrics for the beta prolog section are not stored.
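The partitioning into sliding windows can be sketched as follows for the beta direction. The function name, the tuple layout, the choice p=48, and the clipping of the prolog at the end of the frame are assumptions made for illustration. Tail bits are ignored.

```python
def beta_windows(n, r, p):
    """Split a frame of n symbols into beta sliding windows.

    Returns (rel_start, rel_end, prolog_end) index triples. The beta
    recursion for a window would run backward from prolog_end - 1 to
    rel_start, storing metrics only for [rel_start, rel_end).
    """
    windows = []
    for rel_start in range(0, n, r):
        rel_end = min(rel_start + r, n)    # reliability section
        prolog_end = min(rel_end + p, n)   # prolog, clipped at frame end
        windows.append((rel_start, rel_end, prolog_end))
    return windows

windows = beta_windows(5114, 128, 48)
```

With r=128 and p=48, a full window spans r+p=176 symbols, of which only the r reliability positions need to be stored.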
The turbo decoder controller is required to process an entire frame of data. If the frame is large, then it must be processed into k smaller pieces or sub-blocks as shown in FIG. 6. Each sub-block such as 600 or 601 consists of four sliding windows in this example. Of course, other groupings of sliding windows could have been used.
The beta sub-block must be processed and stored before the alpha and extrinsic (labeled extr in FIG. 6) sub-blocks can start. Therefore, it takes some amount of time units to process k sub-blocks. Each sub-block consists of four sliding windows that are shown in FIG. 7 and FIG. 8. The arrows represent the processing order. RBx is the abbreviation for the reliability section for beta and PBx is the abbreviation for the prolog section for beta. FIG. 8 illustrates the corresponding labels for alpha metrics.
When both beta and alpha sub-blocks are being processed simultaneously, the data memories must be accessed twice. Unfortunately, the two accesses use different addresses, thus requiring a dual port memory. Other solutions are possible using a single port main memory combined with a set of scratch memory blocks. Such implementations are hampered by the complexity involved in meeting the required addressing order. The scratch memory would include four separate memory blocks, one for each sliding window. Each of the scratch memory blocks would have 176 addressable locations, the sum of the maximum sizes for reliability and prolog. Each one of the four scratch memory blocks would store the data for one of the four alpha sliding windows.
The difficulty with this solution is that the beta data is written to the scratch memories in reverse order while the alpha data is read in forward order. Two memories are therefore required for each sliding window to ensure that the data is processed correctly. During processing of the first sub-block, one of the memories is written. During processing of the second sub-block, that full memory is read for alpha processing while the other memory is written for future alpha processing. During processing of the next sub-block, the roles of the memories are reversed. This technique is called ping-ponging of memories. The memories are ping-ponged until each sub-block has been processed.
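The ping-pong arrangement for one sliding window can be modeled in a few lines. This is a behavioral sketch only. The class and method names are hypothetical, and the model ignores timing, port conflicts, and the prolog section.

```python
class PingPongBuffer:
    """Two banks: one is written in reverse order while the other is read forward."""

    def __init__(self, depth):
        self.banks = [[None] * depth, [None] * depth]
        self.write_bank = 0

    def write_reverse(self, samples):
        """Store samples that arrive in reverse (beta) order at descending addresses."""
        bank = self.banks[self.write_bank]
        for offset, sample in enumerate(samples):
            bank[len(samples) - 1 - offset] = sample

    def swap(self):
        """Exchange the roles of the two banks at a sub-block boundary."""
        self.write_bank ^= 1

    def read_forward(self, count):
        """Read the bank that is not being written, in ascending (alpha) order."""
        return self.banks[self.write_bank ^ 1][:count]

buf = PingPongBuffer(depth=4)
buf.write_reverse([3, 2, 1, 0])   # first sub-block, reverse order
buf.swap()
buf.write_reverse([7, 6, 5, 4])   # second sub-block fills the other bank
first = buf.read_forward(4)       # first sub-block, now in forward order
buf.swap()
second = buf.read_forward(4)      # second sub-block, in forward order
```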
A conventional turbo decoder using the dual port main memory approach is illustrated in FIG. 9. Blocks of data to be decoded 900 come from the digital signal processor (DSP) to the main memory 902. Main memory 902 is a dual-port RAM. Memory control block 901 generates both addresses 911 for main memory 902 and addresses 906 for the beta RAM 907. Data is passed to the alpha metrics block 904 and the beta metrics block 905 from two separate ports of main memory 902. Beta metrics block 905 writes its output to beta RAM 907 and the alpha metrics block 904 passes its output directly to the extrinsic block 909. Because the output of beta metrics block 905 is used in the order described in FIGS. 6 and 7, a ping-pong beta RAM of a full 8-block size must be used. The multiplexer 908 provides the interface between the eight separate portions of beta memory 907 and extrinsic block 909. Extrinsic block 909 completes computation of metric output parameters 910 Wnj.
To avoid loss of processor cycles, the conventional turbo decoder system of FIG. 9 requires a dual port main memory 902 having an array size almost double the size of a single port memory. It also requires an eight-block beta memory 907 because of the order in which beta metrics output is used in comparison to the order in which the alpha metrics output is used in computing output extrinsic data 910 Wnj.