Channel coding is widely used to increase the reliability of digital information that has been stored or sent across a transmission channel to a receiver. In digital communications, a commonly used technique is to encode data symbols into a number of messages in block format prior to transmission, adding redundant symbols to each block to assist in further data recovery.
If each code block has n symbols of which k symbols are the original data and (n−k) symbols are the redundant parity symbols, the code is called a block code and is characterized by a duplet (n, k). A valid sequence of n symbols for a block code (n,k) is called a code word, and n and k are hereafter referred to as respectively a length and dimension of the block code. Since there can be many more possible combinations of n symbols in a block of length n than possible datasets of length k, not all combinations of n symbols can be a valid code word, which assists in decoding.
A block code (n,k) is called a linear block code if the sum of any two code words is also a code word. For binary codes, binary addition is assumed to be an exclusive ‘OR’ (XOR) operation. A parity check matrix, P, of a linear block code (n,k) is any (n−k)×n matrix of rank (n−k) for which

$$v P^{T} = 0 \qquad (1a)$$

for any code word v of the linear block code (n, k).
The matrix equation (1a) is equivalent to (n−k) row equations corresponding to (n−k) rows of matrix P; these equations hereafter are referred to as parity equations. A symbol location is referred to hereafter as being present in a row of matrix P, if it is present in the corresponding parity equation. A systematic parity check matrix can be represented in the form

$$P = [\, I \;\; P' \,],$$

where I denotes the (n−k)×(n−k) identity matrix.
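As an illustration of equation (1a) and the systematic form, the following minimal sketch builds a small binary code and checks its parity equations. The (7,4) Hamming code, the particular matrix `P_PRIME`, and the helper names `encode`, `syndrome` and `xor` are assumptions chosen for the example; they do not come from the text above.

```python
from itertools import product

# Hypothetical example: a (7,4) Hamming code with P = [I | P'].
P_PRIME = [
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 1, 1, 1],
]
N_MINUS_K = len(P_PRIME)   # number of parity equations
K = len(P_PRIME[0])        # number of data symbols

# Prepend the (n-k)x(n-k) identity to obtain the full parity check matrix.
P = [[int(r == c) for c in range(N_MINUS_K)] + row
     for r, row in enumerate(P_PRIME)]

def encode(data):
    """Return the code word [parity | data] for k data bits."""
    parity = [sum(a * b for a, b in zip(row, data)) % 2 for row in P_PRIME]
    return parity + list(data)

def syndrome(word):
    """Evaluate the (n-k) parity equations v P^T over GF(2)."""
    return [sum(a * b for a, b in zip(row, word)) % 2 for row in P]

def xor(u, v):
    """Binary addition of two words."""
    return [a ^ b for a, b in zip(u, v)]

codewords = [encode(d) for d in product([0, 1], repeat=K)]
# Every code word satisfies all parity equations.
all_valid = all(syndrome(c) == [0] * N_MINUS_K for c in codewords)
# Linearity: the XOR of any two code words is again a code word.
closed = all(xor(a, b) in codewords for a in codewords for b in codewords)
```

Only 16 of the 128 possible 7-bit words are code words here, which is the redundancy a decoder exploits.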
At a receiver, a block decoder is used to estimate the original message based on the received data samples. An input information vector y of length n received by a decoder is said to be related to a code word v of a linear block code (n,k) if it represents the code word v received after a transmission through a noisy channel. The information vector y is also referred to hereafter as a soft information vector, and its elements are referred to as soft values related to code word symbols, or received samples.
A hard decision is said to be taken on an element of a soft information vector if the element is assigned the value of the nearest symbol. A hard decision vector d related to a soft information vector y is a vector composed of code symbols, chosen in accordance with a certain rule so as to approximate the code word v to which vector y is related.
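A hard decision can be sketched in a few lines. The example assumes a bipolar symbol alphabet of +1 and −1, breaks the tie at zero in favour of +1, and uses the sample magnitude as a reliability measure; the function names are hypothetical.

```python
def hard_decision(soft):
    """Assign each soft value the nearest symbol from the assumed alphabet {+1, -1}."""
    return [1 if y >= 0 else -1 for y in soft]

def reliabilities(soft):
    """Use the magnitude of each soft value as a simple confidence measure."""
    return [abs(y) for y in soft]

y = [0.9, -1.2, 0.1, -0.3, 1.4]
d = hard_decision(y)
# Index of the sample whose hard decision is least trustworthy.
weakest = min(range(len(y)), key=lambda i: reliabilities(y)[i])
```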
Known decoding approaches can be divided into two categories according to how they utilize an incoming analogue information stream: hard-decision decoding and soft-decision decoding. Hard-decision decoders start with input information in a digitized form of code symbols, or “hard decisions”, and use decoding algorithms to attempt to correct any errors that have occurred. Soft-decision decoding (SDD), on the other hand, utilizes additional information present in the received data stream. SDD starts with soft decision data that may include hard information indicating which value each received symbol is assigned (e.g. a “1” or a “0” for binary symbols) and an associated value that indicates a reliability or confidence that the value assigned to a particular received symbol is correct. This is generally referred to as “soft input” information. A decoder then utilizes the soft input information to decode the received information so as to produce a code word most likely to represent the original transmitted data.
Maximum likelihood (ML) decoding is a soft decision decoding which seeks to minimize the probability of word error. For a channel with additive white Gaussian noise (AWGN), an ML code word is a code word that minimizes the Euclidean distance to the soft input vector y or, equivalently for bipolar code symbols of equal energy, maximizes the metric

$$\mathrm{metric}(j) = \frac{1}{2} \sum_{m=1}^{n} d_{m,j}\, y_m , \qquad (1b)$$

where $d_j$ is the j-th code word, $y_m$ is the m-th element of the soft information vector y, and $d_{m,j}$ is the m-th element of the j-th code word.
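For a very small code, the ML code word can be found by exhaustive search, which makes the role of metric (1b) concrete. The sketch below assumes a (4,3) single parity check code and a bipolar mapping of bit 0 to +1 and bit 1 to −1, under which the code word at the smallest Euclidean distance from y is the one with the largest correlation. All function names are hypothetical.

```python
from itertools import product

def spc_codewords(n):
    """All even-weight binary words of length n (single parity check code)."""
    return [w for w in product([0, 1], repeat=n) if sum(w) % 2 == 0]

def bipolar(word):
    """Map bit 0 to +1 and bit 1 to -1 (assumed mapping)."""
    return [1 - 2 * b for b in word]

def metric(word, y):
    """Correlation metric (1b): half the inner product of d_j and y."""
    return 0.5 * sum(d * ym for d, ym in zip(bipolar(word), y))

def sq_distance(word, y):
    """Squared Euclidean distance between y and the bipolar code word."""
    return sum((ym - d) ** 2 for d, ym in zip(bipolar(word), y))

y = [0.8, -0.2, 0.3, 1.1]
codewords = spc_codewords(4)
best = max(codewords, key=lambda w: metric(w, y))          # largest correlation
closest = min(codewords, key=lambda w: sq_distance(w, y))  # smallest distance
```

The hard decisions on this y give the word 0100, which has odd weight and is not a code word; both criteria select 0000 instead. The search visits every one of the 2^k code words, which is why it does not scale to codes of practical size.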
Finding a most-likely code word for given soft input information can be a very complicated task; constructing an efficient decoder is thus a matter of great importance.
The value of any coding technique increases if the decoder output includes not only an accurate estimate of the original symbols but also reliability information or a confidence measure that the decoded symbols are correct. This is generally referred to herein as “soft output” information. Soft output information as to the reliability associated with each decoded bit can be useful, for example, with iterative decoding techniques.
There are very well known techniques for hard decision decoding of linear block codes. It is also well known that soft-decision decoding of a code provides a fundamental gain in performance. There are trellis-based techniques for specific codes that allow soft-decision decoding; however, for many codes the trellis representation of the code is computationally intractable due to the exceedingly large number of states required. It is important to have a decoder of a reasonable complexity that can take advantage of soft decision decoding.
A method of iteratively decoding a product code made up of two systematic block codes was proposed in U.S. Pat. No. 5,563,897, “Method for detecting information bits processed by concatenated block codes”, by R. Pyndiah, A. Glavieux, and C. Berrou.
In the method presented, Pyndiah et al. determine a number, p, of least reliable positions in the received code word. The process then constructs a number, q, of binary words to be decoded from the p locations and a decision vector. The process then generates a number of code words by algebraic (hard decision) decoding of the q binary words. The algorithm then generates a metric for each code word based on the Euclidean distance of the code word from the input soft information and selects the code word with the smallest metric. The method then updates the decision vector based on the selected code word and calculates a correction vector. The correction vector is multiplied by a confidence coefficient and then added to the input vector (received samples plus previous updates). The method is limited to product codes that are formed by systematic linear binary block codes.
Another method was proposed by W. Thesling in U.S. Pat. No. 5,930,272 entitled “Block decoding with soft output information”. The method taught in '272 forms a hard decision vector, b, on the received signal samples of length n. The method then performs a hard decision decoding on the hard decisions in b to produce an error pattern, e. The result of the hard decoding is used to form a “centre” code word and the algorithm finds p nearby code words including the “centre” code word. For each of these code words, a difference metric is formed by taking the Hamming distance between the code word and the hard decision vector b. The code word that has the minimum difference metric among the ‘nearby’ code words forms the hard decoding output. A confidence measure for each bit is formed as the difference between the difference metric of the minimum-metric code word with a ‘0’ in that position and the difference metric of the minimum-metric code word with a ‘1’ in that position.
F. Buda and J. Fang disclose a method of “Product code iterative decoding” in U.S. Pat. No. 6,460,162. The decoder receives from a transmission channel a code word of length n that is determined by an (n,k) linear block code. The decoder inputs soft samples of the code word received from the channel and finds the k most reliable signal samples. It uses the k most reliable signal samples of the code word to generate m least reliable bits (where m is less than k) and makes hard decisions based on the most reliable k components of the code word. If the k most reliable signal samples cannot generate the other n−k components, the selection of the k bits is changed and the process is attempted again. Once the m bits are generated, hard decisions on the k−r remaining signal samples are made. This method generates a list of code words that are close to the received code word by changing the values of the m bits. The soft output is then calculated for each bit as differences between the metrics of the selected code words.
The decoding methods of the aforementioned patents for soft-in, soft-out decoding are essentially approximate implementations of an a posteriori probability (APP) decoder. An APP decoder finds a probability of each data symbol at each symbol time given the entire received signal. Thus it also inherently provides a most likely symbol value at each symbol time given the entire received signal. This is in contrast to the well-known Viterbi algorithm, which performs maximum likelihood sequence estimation (MLSE) as discussed in A. Viterbi, “Error Bounds for Convolutional Codes and an Asymptotically Optimum Decoding Algorithm”, IEEE Trans. Inform. Theory, Vol. IT-13, pp. 260–269, April 1967; and G. Forney, “The Viterbi Algorithm”, Proc. IEEE, Vol. 61, No. 3, pp. 268–278, March 1973. That is, the Viterbi algorithm finds the entire sequence that was most likely transmitted given the received signal. Both algorithms are optimum for their respective criteria, but the APP decoding scheme more naturally provides the soft information required for iterative decoding.
Log-APP is a form of APP processing where the quantities manipulated are not probabilities, but rather “log-probability quantities” derived from probabilities. The term “log-probability quantity,” herein refers to log-probabilities, log-probabilities with offsets, sums of log-probabilities, differences of log-probabilities, and combinations of these. Note that a “log-probability” is simply a logarithm of a probability; the base of the logarithm is arbitrary.
Manipulating log-probability quantities, rather than working with the probabilities themselves, is generally preferred due to computational issues such as a finite-precision representation of numbers, and since the log-probability quantities represent information as it is defined in the field of Information Theory.
A “log-likelihood ratio” (llr) is a logarithm of a probability ratio, that is, a difference between two log-probabilities; it is a common log-probability quantity used in log-APP processing. For a binary case, the log-likelihood ratio for a received “soft” i-th sample $y_i$ related to a code symbol $v_i$ being a 0 bit is defined as:

$$\mathrm{llr}_i = \log\!\left(\frac{\Pr\{v_i = \text{‘1’}\}}{\Pr\{v_i = \text{‘0’}\}}\right),$$

where $\Pr\{v_i = \text{‘0’}\}$ is the probability of the bit $v_i$ being a 0 bit.
For a channel with additive white Gaussian noise (AWGN), where soft input samples $y_i$ are related to original code symbols $v_i$ as

$$y_i = v_i + n_i,$$

where $n_i$ is a Gaussian noise sample with zero mean, a log-likelihood ratio for a received bit is proportional to the soft input value for the bit. For example, for a Gaussian channel and a BPSK modulation format the following expression holds:

$$\mathrm{llr}_i = \left(\frac{4 E_s}{N_0}\right) y_i .$$

For techniques that maximize or minimize correlative “metrics”, the proportionality constant can be ignored.
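The BPSK relation amounts to a single scaling of the received samples, as the short sketch below shows. It assumes that `es_over_n0` is given as a linear ratio rather than in dB and that the samples are normalized so the noiseless symbols are ±1; the function name is hypothetical.

```python
def channel_llrs(y, es_over_n0):
    """Scale soft samples into log-likelihood ratios for BPSK over AWGN."""
    scale = 4.0 * es_over_n0   # proportionality constant 4*Es/N0
    return [scale * yi for yi in y]

y = [0.5, -1.0, 0.25]
llrs = channel_llrs(y, es_over_n0=2.0)
# The positive scale changes neither signs nor the ordering of magnitudes.
same_signs = all((a >= 0) == (b >= 0) for a, b in zip(y, llrs))
```

Because the scale factor is positive and common to all samples, a decoder that only compares correlation metrics reaches the same decisions whether it works on y or on the scaled values.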
The concept of log-likelihood ratios is not restricted to a binary case and can be applied to m-ary symbols, states, and so forth. When the entities being considered are any of “m” choices, at most m−1 log-likelihood ratios are needed to fully describe the likelihoods associated with any particular entity. In the most common case of m-ary modulation m is a power of 2, i.e. $m = 2^N$ where N is the number of bits in each m-ary symbol, and log-likelihood ratios can be calculated for each bit separately, so that only N llr's are required. For example, with an 8-ary constellation each symbol represents 3 bits, and the llrs can be calculated for each of the first, second and third bits.
Generally, log-APP processing amounts to adding extra information, called extrinsic information, to the input information.
The term “extrinsic information” is generally used to refer to a difference between output values and input values of a log-APP process including a max-log-APP process. For a binary code, the term extrinsic information refers to a log-likelihood ratio (or an approximation to it) for a given bit based on the log-likelihood ratios of all the other bits (excluding the given bit) and the known structure of the error correcting code.
Max-log-APP is a form of log-APP processing where some or all calculations of expressions of the form $\log_b(b^x + b^y)$ are approximated as max(x,y). The letter “b” is used to denote the base of the logarithm, which is arbitrary. The letters “x” and “y” represent the quantities being “combined”, which are typically log-probability quantities having the same base “b”. Introducing this approximation into the log-APP calculations generally results in a degradation of the results of an overall process of which the max-log-APP process is a part, but using the approximation can provide a significant reduction in computational complexity and thereby improve speed of processing. Max-log-APP processing is not, in mathematical terms, equivalent to standard log-APP processing, but is an approximation thereto.
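The size of the max-log approximation error is easy to examine numerically. The sketch below assumes the natural logarithm (base e) and uses hypothetical function names; the exact value is written as the maximum plus a correction term, so the error is that correction term.

```python
import math

def log_sum(x, y):
    """Exact log(e^x + e^y), computed as the max plus a correction term."""
    return max(x, y) + math.log1p(math.exp(-abs(x - y)))

def max_log(x, y):
    """Max-log approximation: drop the correction term."""
    return max(x, y)

def approximation_error(x, y):
    """Amount by which max(x, y) underestimates the exact value."""
    return log_sum(x, y) - max_log(x, y)

worst = approximation_error(2.0, 2.0)    # equal arguments: error is log(2)
small = approximation_error(10.0, 0.0)   # well-separated arguments: tiny error
```

The error is largest, log 2, when the two quantities are equal, and decays exponentially as they move apart.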
A detailed description of APP decoding algorithms is provided in, for example, L. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal Decoding of Linear Codes for Minimizing Symbol Error Rate”, IEEE Trans. on Inform. Theory, Vol. IT-20, pp. 284–287, March 1974; P. Robertson, E. Villebrun, and P. Hoeher, “A Comparison of Optimal and Sub-Optimal MAP Decoding Algorithms Operating in the Log Domain”, Proceedings of ICC'95, Seattle, pp. 1009–1013, June 1995; P. Robertson, P. Hoeher, and E. Villebrun, “Optimal and Sub-Optimal Maximum a Posteriori Algorithms Suitable for Turbo Decoding”, European Transactions on Tele. Vol. 8, No. 2, pp. 119–125, March–April 1997; S. Pietrobon, “Implementation and Performance of a Turbo/MAP Decoder”, submitted to the International Journal of Satellite Communications, Vol. 15, No. 1, pp. 23–46, January/February 1998; J. Hagenauer, E. Offer, and L. Papke, “Iterative Decoding of Binary Block and Convolutional Codes”, IEEE Trans. on Inform Theory, Vol. 42, No. 2, pp. 429–445, March 1996; J. Erfanian, S. Pasupathy, G. Gulak, “Reduced Complexity Symbol Detectors with Parallel Structures for ISI Channels”, IEEE Trans. on Communications, Vol. 42, No. 2/3/4, pp. 1661–1671, February/March/April 1994; and U.S. Pat. No. 6,145,114 in the names of Crozier, et al. The prior art max-log-APP decoding algorithm is now briefly described in the context of binary codes and an AWGN channel model; the algorithm can, however, also be used in other systems with more complicated signaling constellations and channels.
The log-APP decoding determines for each bit position a logarithm of a ratio of likelihood that the bit is a “1” to a likelihood that the bit is a “0” given a known value of the received sample and a known code structure.
Denote a sequence of coded bits representing an entire transmitted code word as {$v_l$}, and a corresponding sequence of noisy received samples as {$y_l$}, where a symbol location index l varies from 1 to n. Let further $d_{l,j}$ represent a bit at time index l for a j-th code word. In vector/matrix notation, denote a j-th code word as $d_j$ and the vector of received samples as y.
A bipolar mapping of the binary one-bit symbols of the code is assumed, so that logical “0” and “1” are presented at the input of the decoding process as 1 and −1, respectively.
Denote further a maximum likelihood (ML) code word under a constraint that $v_l = 1$ as a code word j, and an ML code word under a constraint that $v_l = -1$ as a code word j′. Such code words are hereafter referred to as complementary ML code words for a bit location l.
If the ML code words j and j′ can be efficiently determined, the log-likelihood ratio for the l-th bit given the whole received sequence is estimated in max-log-APP approximation as a difference of the metrics (1b):
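$$
\begin{aligned}
\frac{1}{2}\sum_{\substack{m=1 \\ \text{code word } j}}^{n} d_{m,j}\, y_m \;-\; \frac{1}{2}\sum_{\substack{m=1 \\ \text{code word } j'}}^{n} d_{m,j'}\, y_m
&= y_l + \sum_{\substack{m \neq l \\ d_{m,j} \neq d_{m,j'}}} d_{m,j}\, y_m \\
&= \mathrm{llr}_l^{\,i} + \mathrm{llr}_l^{\,\hat{e}}
\end{aligned}
\qquad (2)
$$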
The right-hand side of the equation (2) is composite information for an l-th bit; it only involves the bit positions for which the two code words differ. This composite information vector constitutes an output of an APP algorithm.
The first term $\mathrm{llr}_l^{\,i}$ of the composite information is the intrinsic information, or a log-likelihood ratio for the symbol (i.e., the noisy channel sample), which is an APP algorithm input.
The second term $\mathrm{llr}_l^{\,\hat{e}}$ provides an approximation to the extrinsic information that would be obtained using true APP processing. The extrinsic information for a symbol refers to the log-likelihood ratio, or an approximation to it, for the symbol based on the log-likelihood ratios of all other symbols in the code word excluding the given symbol, and the known structure of the error correcting code.
Equation (2) provides a direct way to generate a max-log-APP decoding output from soft input information and known ML code words. For codes that can be characterized by a trellis with a fairly small number of states, a number of algorithms, e.g., the Viterbi algorithm, are available to find the constrained ML code words. However, for more complicated codes, such as reasonably powerful block codes, finding these code words is usually prohibitively difficult. Consequently, while the max-log-APP approach is simpler than one based upon true APP, it can still be impracticably complex because of the requirement to find the ML code words.
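For a code small enough to enumerate, equation (2) can be evaluated directly, which shows both what the max-log-APP output looks like and why the approach does not scale. The sketch below assumes a (4,3) single parity check code, the bipolar mapping of bit 0 to +1 and bit 1 to −1, and channel samples already scaled so that the intrinsic term equals the sample itself. All names are hypothetical, and the exhaustive search stands in for whatever method finds the constrained ML code words.

```python
from itertools import product

def spc_codewords(n):
    """All even-weight binary words of length n (single parity check code)."""
    return [w for w in product([0, 1], repeat=n) if sum(w) % 2 == 0]

def bipolar(word):
    """Map bit 0 to +1 and bit 1 to -1 (assumed mapping)."""
    return [1 - 2 * b for b in word]

def metric(d, y):
    """Correlation metric (1b) for a bipolar code word d."""
    return 0.5 * sum(dm * ym for dm, ym in zip(d, y))

def max_log_app(y, codewords):
    """Return (composite, extrinsic, middle) per equation (2), by exhaustive search."""
    words = [bipolar(w) for w in codewords]
    composite, extrinsic, middle = [], [], []
    for l in range(len(y)):
        # Constrained ML code words: best word with +1, and with -1, at position l.
        d_j = max((d for d in words if d[l] == 1), key=lambda d: metric(d, y))
        d_jp = max((d for d in words if d[l] == -1), key=lambda d: metric(d, y))
        total = metric(d_j, y) - metric(d_jp, y)
        composite.append(total)
        extrinsic.append(total - y[l])   # composite minus intrinsic term
        # Middle expression of (2): y_l plus the other positions where the words differ.
        middle.append(y[l] + sum(d_j[m] * y[m] for m in range(len(y))
                                 if m != l and d_j[m] != d_jp[m]))
    return composite, extrinsic, middle

y = [0.8, -0.2, 0.3, 1.1]
composite, extrinsic, middle = max_log_app(y, spc_codewords(4))
```

For this input the composite values are approximately 0.6, 0.1, 0.1 and 0.9, all positive, so the soft output favours the all-zero code word even though the second received sample is negative. The search cost grows as 2^k per decoded block.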
The object of this invention is to provide an efficient soft input decoding method based on an approximate max-log-a-posteriori probability decoding approach for linear block codes that is capable of outputting soft or hard decisions on the symbols.
The method hereafter disclosed does not generate a list of ‘nearby’ code words and does not calculate the metrics using the list, as is done in U.S. Pat. No. 5,930,272. The method does not generate metrics for ‘candidate’ code words, and does not require a search over the list to calculate the extrinsic value for the bits in the code word, as in U.S. Pat. No. 5,563,897 and U.S. Pat. No. 6,460,162. The method of the present invention uses the input soft values and extrinsic information from the parity equations in a pseudo-systematic form to generate a composite information vector for the ‘most reliable’ bits. If there is a sign difference between the composite information and the current ‘best’ hard decision vector for the ‘most reliable’ bits, then the hard decision vector is updated and the parity equations are again ‘pseudo-systematically’ processed to form a new set of parity equations. The new parity equations are used to re-code the symbol values in the ‘systematic’, ‘least reliable’ portion of the parity matrix to form a new code word. In this way, the algorithm adjusts the decision vector until it converges to a code word that does not have a sign difference between the composite information and the decision vector (a property of the maximum likelihood code word). Thus, neither is a finite list of candidate code words generated, nor are metrics computed for each code word. The extrinsic information is calculated using the input information and the final set of parity equations. Also, the parity equations generated by this processing will always be full rank, and therefore the n−k least-reliable symbols can always be computed from the k most reliable symbols.
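The re-coding idea in the paragraph above can be illustrated with a minimal sketch. It covers only one ingredient, namely reducing the parity equations over GF(2) so that each equation solves for one of the least reliable symbols, and then recomputing those symbols from hard decisions on the rest. It does not implement the composite-information test or the iterative update, and it is not presented as the disclosed method. The (7,4) matrix `H`, the sample vector, the bit mapping (non-negative sample means bit 0) and all function names are assumptions.

```python
def reduce_to_least_reliable(H, y):
    """Row-reduce H over GF(2) so the pivots fall on the least reliable columns."""
    rows = [row[:] for row in H]
    order = sorted(range(len(y)), key=lambda i: abs(y[i]))  # least reliable first
    pivots = []   # pivots[r] is the column solved for by row r
    r = 0
    for col in order:
        if r == len(rows):
            break
        src = next((i for i in range(r, len(rows)) if rows[i][col]), None)
        if src is None:
            continue  # column depends on earlier pivots; move to the next one
        rows[r], rows[src] = rows[src], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    return rows, pivots

def recode(H, y):
    """Recompute the pivot symbols from hard decisions on the remaining symbols."""
    rows, pivots = reduce_to_least_reliable(H, y)
    word = [0 if v >= 0 else 1 for v in y]
    for row, col in zip(rows, pivots):
        word[col] = sum(row[m] * word[m] for m in range(len(y)) if m != col) % 2
    return word, pivots

H = [
    [1, 0, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 1],
]
y = [0.9, -1.1, 0.2, 1.3, 0.1, 0.8, -0.4]
word, pivots = recode(H, y)
syndrome = [sum(a * b for a, b in zip(row, word)) % 2 for row in H]
```

In this example the hard decisions on y do not form a code word; re-coding changes only the least reliable position (index 4) and yields a word that satisfies every parity equation.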
This method is easily vectorized. The operations are easily implemented with vectors and matrices, which is beneficial for certain implementations. The computations can be performed on processors in parallel.