Error correction coding is widely used to increase the reliability of digital information that has been stored or sent across a transmission channel to a receiver. For example, error correction codes are applied where data must be transmitted without error in mobile communication, FAX or other data communication, and where data must be reconstructed without error from a large-capacity storage medium such as a magnetic disk or CD. A commonly used technique is to encode data symbols into a sequence of blocks of data prior to transmission or storage, adding redundant symbols to each block to assist in later data recovery. Examples of powerful error-correcting codes are turbo codes, described for example in C. Berrou, A. Glavieux, and P. Thitimajshima, “Near Shannon limit error-correcting coding and decoding: Turbo-codes”, Proceedings of the IEEE International Conference on Communications, May 1993, pp. 1064-1070, and U.S. Pat. No. 5,446,747; turbo product codes, described for example in J. Hagenauer, E. Offer and L. Papke, “Iterative decoding of binary block and convolutional codes”, IEEE Transactions on Information Theory, Vol. 42, March 1996, pp. 429-445; and low-density parity-check (LDPC) codes, described for example in R. G. Gallager, “Low-density parity-check codes”, IRE Transactions on Information Theory, Vol. 8, January 1962, pp. 21-28. These articles and patents are incorporated herein by reference. In many systems, an error correction code (ECC), for example a turbo code, is supplemented with an error detection code (EDC) such as a cyclic redundancy check (CRC) code that adds additional CRC symbols to each frame prior to the ECC encoding.
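The CRC symbols added by such an EDC are the remainder of a long division over GF(2) by a generator polynomial. The following is a minimal sketch; the polynomial x^3 + x + 1 used in the example is an illustrative toy choice, not any standard CRC generator:

```python
def crc_bits(data_bits, poly_bits):
    """Compute CRC remainder bits by long division over GF(2).

    poly_bits is the generator polynomial, MSB first, including the
    leading 1 (e.g. [1, 0, 1, 1] for x^3 + x + 1 -- an illustrative
    toy polynomial, not any standard CRC generator).
    """
    n = len(poly_bits) - 1            # number of CRC bits
    reg = list(data_bits) + [0] * n   # append n zero bits, then divide
    for i in range(len(data_bits)):
        if reg[i]:                    # subtract (XOR) the divisor
            for j, p in enumerate(poly_bits):
                reg[i + j] ^= p
    return reg[-n:]                   # remainder = CRC symbols

# the encoder appends these bits to the frame before ECC encoding
crc = crc_bits([1, 1, 0, 1], [1, 0, 1, 1])
```

At the receiver, the same division applied to the decoded frame with its appended CRC symbols yields a zero remainder only when no error is detected.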
FIG. 1 illustrates one common turbo-code encoder that uses two recursive systematic convolutional (RSC) encoders 100 and 104 operating in parallel, with the RSC encoder 104 preceded by an interleaver 102, and a puncturing unit 106. The same data bits di=d(i), i.e. information bits plus possible overhead bits such as error detection and trellis termination bits, with the index i indicating the bit location in the data bit sequence, are fed into the two RSC encoders 100, 104; the interleaver 102, however, permutes, i.e. re-orders, the data bits according to a predetermined interleaving rule before passing them to the RSC encoder 104. Encoders 100 and 104 generate parity symbols p1(i) and p2(i), which are provided to the puncturing unit 106. The puncturing unit 106 punctures the parity symbols generated by the RSC encoders and, optionally, some of the source data symbols. The source symbols di and the corresponding punctured parity symbols p1(i) and p2(i) generated by the encoders 100 and 104 form encoded codewords, which in a data transmission system are provided to a modulator (not shown). The turbo code codewords consist of one set of data bits and the two sets of parity bits generated by the two RSC encoders 100, 104. Assuming rate ½ RSC constituent codes, the nominal overall code rate, without any puncturing, is ⅓. The puncturing unit 106 is used to achieve higher code rates by removing some of the data and/or parity symbols from the codewords.
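The encoder structure of FIG. 1 can be sketched as follows. The 4-state constituent code with octal generators (7, 5) and the alternating puncture pattern are illustrative assumptions only, not those of any particular standard:

```python
def rsc_parity(bits):
    """Parity stream of a rate-1/2, 4-state RSC encoder with feedback
    polynomial 1+D+D^2 and feedforward polynomial 1+D^2 (octal 7, 5);
    the systematic stream is the input itself."""
    s1 = s2 = 0
    parity = []
    for d in bits:
        a = d ^ s1 ^ s2        # feedback bit
        parity.append(a ^ s2)  # feedforward (parity) bit
        s1, s2 = a, s1         # shift the registers
    return parity

def turbo_encode(data, interleaver):
    """Rate-1/3 parallel-concatenated codeword: the data bits plus two
    parity streams, the second computed on the interleaved data."""
    p1 = rsc_parity(data)
    p2 = rsc_parity([data[j] for j in interleaver])
    return data, p1, p2

def puncture_to_half_rate(p1, p2):
    """Alternate parity bits from the two encoders to reach rate 1/2."""
    return [p1[i] if i % 2 == 0 else p2[i] for i in range(len(p1))]
```

With the identity interleaver both constituent encoders see the same sequence and produce identical parity streams; a real interleaver is designed precisely so that this does not happen.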
The decoding process for such codes at a receiver or a storage data reader is usually performed in an iterative manner by exchanging soft information, often called extrinsic information, between constituent decoders. Each constituent decoder uses a soft-in/soft-out (SISO) algorithm such as the Bahl, Cocke, Jelinek and Raviv (BCJR) algorithm described in L. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear codes for minimizing symbol error rate”, IEEE Transactions on Information Theory, Vol. 20, March 1974, pp. 284-287. Versions of this algorithm are also referred to as the maximum a posteriori probability (MAP) algorithm or, in the context of soft iterative decoding, as the a posteriori probability (APP) algorithm. These algorithms and their log-domain variations, often referred to as the log-MAP, log-APP, max-log-MAP and max-log-APP algorithms, are reviewed for example in U.S. Pat. No. 7,203,893, which is incorporated herein by reference, and described in more detail for example in the following articles, which are incorporated herein by reference: P. Robertson, E. Villebrun, and P. Hoeher, “A Comparison of Optimal and Suboptimal MAP Decoding Algorithms Operating in the Log Domain”, Proceedings of the IEEE International Conference on Communications, June 1995, pp. 1009-1013; P. Robertson, P. Hoeher, and E. Villebrun, “Optimal and Sub-Optimal Maximum a Posteriori Algorithms Suitable for Turbo Decoding”, European Transactions on Telecommunications, Vol. 8, No. 2, pp. 119-125, March-April 1997; and, S. Crozier, K. Gracie and A. Hunt, “Efficient Turbo Decoding Techniques”, Proceedings of the 11th International Conference on Wireless Communications (Wireless'99), Calgary, Alberta, Canada, pp. 187-195, July 12-14, 1999.
An APP decoder finds the probability of each data symbol at each symbol time given the entire received signal. Thus it also inherently provides the most likely symbol value at each symbol time given the entire received signal. The max-log-APP algorithm approximates the log-APP algorithm by replacing the more complex max* operation, max*(a,b) = max(a,b) + log(1 + exp(-|a-b|)), with a simple “max” operation. This simplifies the algorithm but also degrades performance. It has been found that most of this degradation can be recovered, typically to within about 0.1 to 0.2 dB depending on the code, by simply scaling back the extrinsic information exchanged between the constituent decoders. This method is sometimes called enhanced max-log-APP decoding and is described, for example, in K. Gracie, S. Crozier, and A. Hunt, “Performance of a low-complexity Turbo decoder with a simple early stopping criterion implemented on a SHARC processor,” in Proceedings of the 1999 International Mobile Satellite Conference (IMSC '99), Ottawa, ON, Canada, Jun. 16-19, 1999, pp. 281-286, which is incorporated herein by reference.
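A minimal sketch of the exact max* operation, its max-log approximation, and the extrinsic scaling used by enhanced max-log-APP decoding follows. The scale factor 0.7 is a commonly used illustrative value; the best factor is code dependent:

```python
import math

def max_star(a, b):
    """Exact Jacobian logarithm used by the log-APP algorithm:
    max*(a, b) = max(a, b) + log(1 + exp(-|a - b|))."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def max_log(a, b):
    """Max-log-APP approximation: drop the correction term."""
    return max(a, b)

def scale_extrinsic(extrinsic_llrs, factor=0.7):
    """Enhanced max-log-APP: scale back the extrinsic information
    before passing it to the other constituent decoder."""
    return [factor * llr for llr in extrinsic_llrs]
```

The correction term is bounded by log 2 and vanishes as |a - b| grows, which is why the plain max is a reasonable, cheap approximation.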
Another SISO algorithm is the soft output Viterbi algorithm (SOVA), derived from the original soft-in/hard-out Viterbi algorithm and described, for example, in J. Hagenauer and P. Hoeher, “A Viterbi Algorithm with Soft-Decision Output and its Applications”, IEEE GLOBECOM '89, Dallas, Tex., Vol. 3, paper 47.1, November 1989. The SOVA algorithm was more commonly used in concatenated coding schemes prior to the advent of iterative turbo decoding. The SOVA algorithm is related to, and has some properties similar to, the max-log-APP algorithm, but is generally considered inferior, partly due to its use of a finite-length history update window. In contrast with the APP decoder, which is symbol-based, the SOVA and max-log-APP decoders are sequence-based: they search for the “most likely” sequence, i.e. the sequence that was most likely transmitted given the received signal.
The error rate performance of an error correction code in a data transmission system with noise is typically characterized by the dependence of an error rate parameter, such as the bit error rate (BER), on the signal-to-noise ratio (SNR). Typically, the BER vs. SNR curve has two important regions, namely the waterfall region and the error flare region. The waterfall region is associated with low to moderate SNR values. In this region, the error rate drops rapidly as the SNR increases, giving the curve a steep slope.
The error flare region is associated with moderate to high SNR values. In this region, the error-rate curve suffers from flattening or flaring, making it difficult to further improve the error rate performance without a significant increase in the SNR. In the error flare region, where the error rate performance is mainly determined by the distance properties of the code, the natural way to lower the error flare is to increase the minimum distance dmin of the code and/or reduce the number of codewords (multiplicities) near dmin. A high codeword distance, typically measured as the Hamming distance, is desirable both for lowering the error flare and for making the flare region of the BER vs. SNR curve as steep as possible. The Hamming distance between two codewords is the number of symbol positions in which they differ, i.e. the minimum number of symbols that must be changed in a first codeword for it to become a second codeword. The further apart two codewords are, the more a signal can be corrupted while the decoder retains the ability to properly decode the message. It is also important to reduce the number of codewords at or near the minimum distance. A practical and common way to improve the distance properties is to use a well-designed interleaver. However, the design of such interleavers is not a simple task and there are theoretical and practical limits to the dmin and multiplicity values, and thus the improvements in flare performance, that can be achieved.
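These distance measures can be illustrated with a brute-force sketch, suitable only for tiny codebooks:

```python
from itertools import combinations

def hamming_distance(a, b):
    """Number of symbol positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def min_distance(codewords):
    """Brute-force d_min over a small codebook; for a linear code this
    equals the minimum Hamming weight of the nonzero codewords."""
    return min(hamming_distance(a, b) for a, b in combinations(codewords, 2))

# e.g. the length-3 repetition code {000, 111} has d_min = 3, so it
# can correct any single symbol error
```

For codes of practical size, exhaustive enumeration is infeasible, which is why distance spectrum analysis and interleaver design techniques are used instead.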
It has been observed that in the flare region the number of information bit errors per packet error that remain after turbo decoding is usually small. Based on this observation, several authors have proposed the serial concatenation of a turbo code and a high rate algebraic outer code to improve the flare performance. The overhead and corresponding reduction in code rate is usually small for large blocks, but can still be quite significant for small or even moderate block sizes of a few thousand bits. Costello et al. in an article entitled “Concatenated turbo codes”, IEEE International Symposium on Information Theory and its Applications, September 1996, pp. 571-574, proposed the use of a Reed-Solomon (RS) outer code, whereas Andersen in “Turbo codes extended with outer BCH code”, Electronics Letters, Vol. 32, October 1996, pp. 2059-2060, proposed the use of a Bose, Chaudhuri, and Hocquenghem (BCH) outer code. In contrast to Andersen's method, where the BCH outer code protects the entire packet of information bits, Narayanan et al. in an article “Selective Serial Concatenation of Turbo Codes”, IEEE Communications Letters, Vol. 1, September 1997, pp. 136-139, proposed to use the BCH outer code to protect only a few error prone positions in the information packet. These positions are typically the ones associated with the lowest distance codewords. However, the better the interleaver design and/or the more powerful the desired cleanup code, the less effective this method becomes, and the more closely it must approach Andersen's method, in which the entire data block is protected by the BCH outer code. Additional BCH cleanup results for turbo codes with data puncturing were reported by R. Kerr, K. Gracie, and S. Crozier, in “Performance of a 4-state turbo code with data puncturing and a BCH outer code,” in 23rd Biennial Symposium on Communications, Kingston, Canada, May 29-Jun. 1, 2006, Queen's University.
Motivated by the observation that in the flare region the distance between the estimated codeword and the transmitted codeword is usually close to dmin for most packet errors, especially when random interleavers are used, Öberg introduced a method for lowering the error flare based on distance spectrum analysis, see M. Öberg and P. H. Siegel, “Application of distance spectrum analysis to turbo code performance improvement”, In Proceedings 35th Annual Allerton Conference on Communication, Control, and Computing, September-October 1997, pp. 701-710. Öberg's method identifies positions in the data block associated with the lowest distances, and then a modified turbo code encoder inserts dummy bits in these positions. Consequently, the turbo decoder knows the positions and the values of these dummy bits. The insertion of these dummy bits lowers the code rate, but it also removes the contribution of the lowest distances to the error rate performance. Again, this method is not suitable when well-designed interleavers are used and/or significant cleanup is desired, because too many bits need special protection and the loss in code rate is not acceptable.
Seshadri and Sundberg introduced two list Viterbi algorithm (LVA) decoding methods capable of producing an ordered list of the Z globally best candidate paths through a trellis, as described in N. Seshadri and C.-E. W. Sundberg, “List Viterbi decoding algorithms with applications”, IEEE Transactions on Communications, Vol. 42, February 1994, pp. 313-323. The first algorithm is referred to as the parallel LVA (PLVA) and produces the Z best candidate paths simultaneously. The second algorithm is referred to as the serial LVA (SLVA) and produces the next best candidate path based on the knowledge of the previously found best candidate paths. The LVA approaches are applicable to any concatenated communication system with an outer error detecting code and an inner code that can be represented and decoded using a single trellis. Since the SLVA produces the next best candidate path only if errors are detected in all previous best candidate paths, it tends to have lower average computational complexity than the PLVA. Furthermore, the PLVA requires more memory than the SLVA. Thus, the SLVA is usually recommended. In either case, both LVA approaches require modifications to the original Viterbi algorithm (VA) that increase both its memory requirements and its complexity. When applied to an inner code that can be represented by a single trellis, the LVA approach is optimal in the sense that it always finds the Z best candidate paths through the trellis. The complexity is reasonable for small lists, but the peak complexity can become very high for large lists.
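The control flow that makes the SLVA attractive can be sketched at a high level. The candidate iterator and the EDC check below are hypothetical stand-ins: producing trellis paths in decreasing order of path metric is the job of the SLVA proper, and `edc_ok` stands in for the outer CRC:

```python
def serial_list_decode(candidates, edc_ok, max_list):
    """Serial list decoding: request the next-best candidate sequence
    only when every previous one has failed the error detecting code.

    `candidates` is assumed to yield sequences in decreasing order of
    likelihood; `edc_ok` is the outer EDC check.  Both are hypothetical
    stand-ins, not part of any particular decoder API."""
    for rank, seq in enumerate(candidates):
        if rank >= max_list:
            break                # give up after the Z best candidates
        if edc_ok(seq):
            return seq           # first candidate that passes the EDC
    return None                  # decoding failure
```

Because later candidates are only requested on demand, the average work stays close to one pass at high SNR, while the peak work grows with the list size Z.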
Narayanan and Stüber applied the LVA to the decoding of turbo codes where a cyclic redundancy check (CRC) code was used as the outer error detecting code and a turbo code was used as the inner code; see K. R. Narayanan and G. L. Stuber, “List decoding of turbo codes”, IEEE Transactions on Communications, Vol. 46, June 1998, pp. 754-762. The turbo code was implemented in the usual manner with two parallel RSC constituent codes and an interleaver to permute the data bits. The turbo decoder was also implemented in the usual manner using SISO iterative decoding. If errors were detected after the turbo decoding was complete then the LVA was invoked. The LVA was applied to one of the two constituent RSC code trellises using the last set of extrinsic information from the other RSC code decoder as a priori information. While this approach does work well, at least for small lists, it is not a globally optimum approach because the LVA can only be applied to one of the constituent RSC code trellises at a time, so that it may not find the Z globally best paths for the entire turbo code structure. Again, the peak complexity of the method can be very high for large lists.
Another method to improve the performance of turbo codes was also disclosed by Narayanan and Stuber in the same article. This method, referred to as the “2^k method”, is a simple form of “bit flipping” in which all 2^k combinations of the k weakest bits are checked to see if the EDC can be satisfied. The k weakest bits are determined from the magnitudes of the soft output values. The operating scenario was the same as that used for the LVA method mentioned above. That is, if errors were detected after the turbo decoding was complete, then the “2^k method” was applied to one of the two data sequences at the output of the two constituent decoders. This 2^k bit-flipping method was proposed and used earlier for cleaning up conventional convolutional codes decoded using the SOVA algorithm by C. Nill and C. W. Sundberg, “List and soft symbol output Viterbi algorithms: Extensions and comparisons”, IEEE Trans. Commun., Vol. 43, No. 2-4, pp. 277-287, February-April 1995.
This article also teaches using channel interleaving to break up the correlated weak bits so that a simple form of bit flipping and/or LVA processing can be used.
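The bit-flipping check described above can be sketched as follows; `edc_ok` is a hypothetical stand-in for the outer CRC check:

```python
from itertools import product

def flip_2k(hard_bits, soft_magnitudes, k, edc_ok):
    """Try all 2^k flip patterns over the k least-reliable bit
    positions until the error detecting code is satisfied."""
    # the k weakest bits have the smallest soft-output magnitudes
    weak = sorted(range(len(soft_magnitudes)),
                  key=lambda i: soft_magnitudes[i])[:k]
    for pattern in product((0, 1), repeat=k):
        trial = list(hard_bits)
        for idx, flip in zip(weak, pattern):
            trial[idx] ^= flip
        if edc_ok(trial):
            return trial
    return None  # none of the 2^k patterns satisfied the EDC
```

The cost grows exponentially in k, which is why the method is only practical for a small number of weak bits, and why correlated weak bits (addressed by channel interleaving, as noted above) limit its effectiveness.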
A number of other list-sequence (LS) decoding approaches have been proposed by Leanderson and Sundberg and also applied to the decoding of turbo codes. LS maximum a posteriori probability (LS-MAP) decoding is disclosed by C. F. Leanderson and C.-E. W. Sundberg, in articles “Performance evaluation of list sequence MAP decoding”, IEEE Transactions on Communications, Vol. 53, March 2005, pp. 422-432, and “On List Sequence Turbo Decoding”, IEEE Transactions on Communications, Vol. 53, May 2005, pp. 760-763. This approach combines the operations of soft MAP decoding and hard LVA decoding into a single integrated algorithm. The LS-MAP algorithm may be applied during each MAP decoding step of the iterative turbo decoder. Similar optimum and sub-optimum max-log list algorithm (MLLA) methods based on max-log-MAP decoding are presented by the same authors in C. F. Leanderson and C.-E. W. Sundberg, “The max-log list algorithm (MLLA)—A list sequence decoding algorithm that provides soft symbol output”, IEEE Transactions on Communications, Vol. 53, March 2005, pp. 433-444. Again, these approaches have a very high peak complexity for large lists and are not optimal when applied to turbo codes, because the LS decoding can only be applied to one constituent RSC code trellis at a time.
Yet another approach to improving the error performance of turbo codes, which is referred to as the correction impulse method (CIM) or the forced symbol method (FSM), has been disclosed in Y. Ould-Cheikh-Mouhamedou, S. Crozier, K. Gracie, P. Guinand, and P. Kabal, “A method for lowering Turbo code error flare using correction impulses and repeated decoding,” in 4th International Symposium on Turbo Codes and Related Topics, Munich, Germany, April 3-7, 2006; K. Gracie and S. Crozier, “Improving the performance of 4-state turbo codes with the correction impulse method and data puncturing,” in Proceedings of the 23rd Biennial Symposium on Communications, Kingston, Canada, Queen's University, May 29-Jun. 1, 2006; and Y. Ould-Cheikh-Mouhamedou and S. Crozier, “Improving the error rate performance of turbo codes using the forced symbol method,” IEEE Commun. Letters, pp. 616-618, July 2007. This method uses an EDC, such as a CRC code, and repeated decoding. In this method, the entire iterative decoding process, or a significant portion of it, is repeated with one or more of the weakest data symbols forced to change. This process is repeated using different weak data symbols until the EDC is satisfied or until a maximum number of decodings is reached. The improvement in error rate performance provided by this method is significant, especially in the error flare region, and the average processing load is also typically quite low, at least in the error flare region. Similar methods have also been applied to the iterative decoding of LDPC codes and serial-concatenated turbo codes (SCTC). The main drawback of these methods is that the peak processing can be very high due to the repeated iterative decoding, which makes them unsuitable for delay-sensitive and/or memory-efficient applications due to excessive buffering when the maximum number of decodings is high.
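At a high level, the control flow of the correction impulse/forced symbol method can be sketched as follows, with `decode`, `edc_ok` and `weak_positions` as hypothetical stand-ins for the full iterative turbo decoder, the outer EDC check, and the soft-output reliability ranking:

```python
def forced_symbol_decode(received, decode, edc_ok, weak_positions, max_decodings):
    """Repeat the (expensive) iterative decoding with one weak data
    symbol at a time forced to the opposite value, until the EDC is
    satisfied or the decoding budget is exhausted."""
    bits = decode(received, forced={})
    if edc_ok(bits):
        return bits                      # normal decoding succeeded
    for attempt, pos in enumerate(weak_positions(bits)):
        if attempt + 2 > max_decodings:  # the first decoding counted too
            break
        retry = decode(received, forced={pos: bits[pos] ^ 1})
        if edc_ok(retry):
            return retry
    return None                          # decoding failure
```

The sketch makes the drawback noted above concrete: each failed attempt costs a full iterative decoding, so the peak work grows linearly with the decoding budget even though the average work stays low in the flare region.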
An object of the present invention is to provide a method for improving the performance of sequence-based decoders that overcomes at least some of the deficiencies of the prior art by reducing the error rate in the flare region without significantly raising the decoder complexity, and an apparatus implementing such a method.