1. Field of the Invention
Embodiments of the present invention relate, in general, to iterative decoding and particularly to enabling an additional iterative decoding path so as to achieve convergence.
2. Relevant Background
Data communication systems have been under continual development for many years. One such type of communication system and area of development employs LDPC (Low Density Parity Check) codes. A primary directive in this area has been to lower the error floor within a communication system. The ultimate goal has been to reach Shannon's limit in a communication channel. Shannon's limit may be viewed as the highest data rate that can be used in a communication channel having a particular SNR (Signal to Noise Ratio) while still achieving error-free transmission through the communication channel. In other words, the Shannon limit is the theoretical bound for channel capacity for a given modulation and code rate.
LDPC codes have been shown to provide excellent decoding performance that can approach the Shannon limit in some cases. For example, some LDPC decoders have been shown to come within 0.3 dB (decibels) of the theoretical Shannon limit. While this example was achieved using an irregular LDPC code with a length of one million, it nevertheless demonstrates the very promising application of LDPC codes within communication systems and data storage systems.
According to the LDPC approach, a relatively sparse code matrix is defined, such that the product of this matrix with each valid codeword (information and parity bits) equals the zero vector. Decoding of an LDPC coded message to which channel noise has been added in transmission amounts to finding the sparsest vector whose product with the sparse code matrix matches the product of that matrix with the received sequence. This sparsest vector is thus equal to the channel noise (because the matrix multiplied by the true codeword is zero), and can be subtracted from the received sequence to recover the true codeword.
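By way of illustration only, the parity check relationship described above may be sketched as follows. The 4-by-8 matrix H, the codeword, and the function names are hypothetical and chosen solely for this example; the exhaustive search for the sparsest error vector is practical only for very short codes and is not the iterative method discussed below.

```python
from itertools import combinations

# Hypothetical 4x8 parity-check matrix (4 checks, 8 codeword bits).
# It is an illustrative assumption, not a matrix taken from the figures.
H = [
    [0, 1, 0, 1, 1, 0, 0, 1],
    [1, 1, 1, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 1, 1, 1],
    [1, 0, 0, 1, 1, 0, 1, 0],
]

def syndrome(H, word):
    """Product of H with a binary word, computed modulo 2."""
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in H]

def sparsest_error(H, target):
    """Brute-force search for the lowest-weight vector e with H*e == target."""
    n = len(H[0])
    for weight in range(n + 1):
        for positions in combinations(range(n), weight):
            e = [1 if i in positions else 0 for i in range(n)]
            if syndrome(H, e) == target:
                return e
    return None

codeword = [1, 0, 0, 1, 0, 1, 0, 1]   # valid: H * codeword == 0 (mod 2)
received = codeword[:]
received[1] ^= 1                       # channel noise flips one bit

s = syndrome(H, received)              # equals H * noise, since H * codeword == 0
error = sparsest_error(H, s)
# For binary words, subtracting the noise is a bitwise XOR.
decoded = [r ^ e for r, e in zip(received, error)]
```

In this sketch the search is over the product of the matrix with the received sequence (often called the syndrome), which depends only on the noise and not on the transmitted codeword.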
It has become well known in the art that iterative decoding of LDPC codes provides excellent decoding performance, from the standpoint of latency and accuracy, with relatively low hardware or software complexity. Iterative approaches are also quite compatible with turbo codes, LDPC codes, and many other forward error correction (FEC) codes known in the art.
Typically, iterative decoding involves the communicating, or “passing”, of reliability, or “soft output”, values of the codeword bits over several iterations of a relatively simple decoding process. Soft output information includes, for each bit, a suspected value of the bit (“0” or “1”), and an indication of the probability that the suspected value is actually correct. For the initial decoding of an incoming input, the a-priori probabilities are simply initialized either to a neutral value (i.e., no knowledge of their likelihood) or to the values from the channel detector (e.g., a soft-output Viterbi algorithm, or SOVA, detector).
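As a minimal sketch of how a single soft output value can carry both pieces of information, the following assumes that the soft value is a log-likelihood ratio defined as ln(P(bit = 0) / P(bit = 1)). That sign convention and the function name soft_to_hard are assumptions made for this example; the opposite convention is equally possible.

```python
import math

def soft_to_hard(llr):
    """Split an LLR into (suspected bit, probability that the suspicion is correct).

    Assumes llr = ln(P(bit = 0) / P(bit = 1)), so a positive value favors "0".
    """
    bit = 0 if llr >= 0 else 1
    confidence = 1.0 / (1.0 + math.exp(-abs(llr)))
    return bit, confidence

neutral = soft_to_hard(0.0)    # neutral initialization: no knowledge, confidence 0.5
leaning = soft_to_hard(-2.0)   # suspected "1", confidence of roughly 0.88
```

Under this convention, the sign of the value gives the suspected bit and the magnitude gives the confidence, so a neutral initialization is simply the value zero.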
The decoding continues for a number of iterations, until some termination or convergence criterion is reached. Termination of the iterations may be based on a data-dependent convergence criterion. For example, the iterations may continue until there are no bit changes from one iteration to the next, at which point convergence may be assumed because the bits will then tend to reinforce their probabilities. More typically, however, conventional communications equipment performs a pre-selected number of iterations without regard to the results, with the number of iterations selected by way of experimentation or characterization.
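The two termination policies described above may be combined in a simple driver loop, sketched below. The callable step is a hypothetical placeholder for one decoding iteration that maps a list of LLRs to an updated list, and the positive-favors-"0" sign convention is likewise an assumption of this sketch.

```python
def run_until_stable(step, llrs, max_iterations=20):
    """Iterate until the hard decisions stop changing or the iteration cap is hit.

    `step` is a hypothetical callable performing one decoding iteration.
    Returns (bits, iterations_used, converged).
    """
    previous_bits = [0 if v >= 0 else 1 for v in llrs]
    for iteration in range(1, max_iterations + 1):
        llrs = step(llrs)
        bits = [0 if v >= 0 else 1 for v in llrs]
        if bits == previous_bits:          # data-dependent criterion: no bit changes
            return bits, iteration, True
        previous_bits = bits
    return previous_bits, max_iterations, False   # pre-selected cap reached

# Toy stand-ins for a real decoding iteration, used only to exercise the loop.
stable = run_until_stable(lambda l: [v + 1.0 for v in l], [-1.5, 2.0, -0.5])
capped = run_until_stable(lambda l: [-v for v in l], [1.0], max_iterations=4)
```

Setting the cap alone, and ignoring the comparison of hard decisions, would reproduce the fixed-iteration behavior of conventional equipment.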
One process well known in the art is the iterative operation of a “belief propagation” approach to LDPC decoding. In its conventional implementation, the belief propagation algorithm uses two value arrays. A first array stores the log-likelihood-ratios (LLRs) for each of j input nodes corresponding to the bits in the codeword; this array is also referred to in the art as the array of “variable” nodes. A second array stores the results of m parity check node updates; this array is also referred to as the array of “checksum” nodes. A graphical representation of these two arrays can be seen in FIG. 1 in what is commonly referred to in the prior art as a Tanner graph. In FIG. 1 the checksum nodes, or c-nodes 110 (collectively), are represented by 4 blocks f_0 through f_3 (110_0 through 110_3). Variable nodes, or v-nodes 120 (collectively), are represented by 8 circles c_0 through c_7 (120_0 through 120_7). The values m and j typically differ from one another, with many more codeword bits j than checksum equations m.
As shown by the lines in FIG. 1, information is communicated back and forth between the variable nodes 120 and the checksum nodes 110 in each iteration of this LDPC belief propagation approach (also referred to as “message passing”). In its general operation, in a first decoding step, each of the variable nodes 120 communicates the current LLR value for its codeword bit to each of the checksum nodes 110 in which it participates. Each of the checksum nodes 110 then derives a check node update for each LLR value that it receives, using the LLRs for each of the other variable nodes 120 participating in its equation. As mentioned above, the parity check equation for LDPC codes requires that the product of the parity matrix with a valid codeword is zero. Accordingly, for each variable node 120, the checksum node 110 determines the likelihood of the value of that input that will produce a zero-valued product; for example, if the five other inputs to a checksum node that receives six inputs are strongly likely to be a “1”, it is highly likely that the variable node under analysis is also a “1” (to produce a zero value for that matrix row). The result of this operation is then communicated from each checksum node 110 to its participating variable nodes 120. In the second decoding step, each variable node 120 updates its LLR probability value by combining, for its codeword bit, the results for that variable node 120 from each of the checksum nodes 110 in which it participates. This two-step iterative approach is repeated until a convergence criterion is reached, or until a terminal number of iterations has been executed.
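The two-step message passing described above may be sketched as follows. This is a simplified illustration under stated assumptions, not a definitive implementation: it uses the min-sum approximation of the check node update rather than the exact belief propagation rule, it excludes each checksum node's own previous message from the value sent to it (a common "extrinsic" formulation), it assumes a positive LLR favors "0", and the matrix H and channel LLR values are hypothetical.

```python
def min_sum_decode(H, channel_llrs, max_iterations=20):
    """Flooding-schedule min-sum decoder; returns (bits, iterations, converged).

    Assumes every check involves at least two variable nodes.
    """
    m, n = len(H), len(H[0])
    # neighbours[i]: variable nodes participating in checksum node i
    neighbours = [[j for j in range(n) if H[i][j]] for i in range(m)]
    # check_to_var[i][j]: latest message from check i to variable j
    check_to_var = [{j: 0.0 for j in neighbours[i]} for i in range(m)]
    posterior = list(channel_llrs)
    bits = [0 if v >= 0 else 1 for v in posterior]
    for iteration in range(1, max_iterations + 1):
        # Step 1: each check derives an update for each variable from the others.
        for i in range(m):
            incoming = {j: posterior[j] - check_to_var[i][j] for j in neighbours[i]}
            for j in neighbours[i]:
                others = [incoming[k] for k in neighbours[i] if k != j]
                sign = -1.0 if sum(v < 0 for v in others) % 2 else 1.0
                check_to_var[i][j] = sign * min(abs(v) for v in others)
        # Step 2: each variable combines its channel LLR with all check results.
        posterior = list(channel_llrs)
        for i in range(m):
            for j in neighbours[i]:
                posterior[j] += check_to_var[i][j]
        bits = [0 if v >= 0 else 1 for v in posterior]
        # Convergence criterion used here: every parity check is satisfied.
        if all(sum(bits[j] for j in neighbours[i]) % 2 == 0 for i in range(m)):
            return bits, iteration, True
    return bits, max_iterations, False

# Hypothetical 4-check, 8-bit code and channel values (illustration only).
H = [
    [0, 1, 0, 1, 1, 0, 0, 1],
    [1, 1, 1, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 1, 1, 1],
    [1, 0, 0, 1, 1, 0, 1, 0],
]
# The transmitted codeword is 1 0 0 1 0 1 0 1; bit 1 arrives weakly wrong (-0.5).
llrs = [-2.0, -0.5, 2.0, -2.0, 2.0, -2.0, 2.0, -2.0]
bits, iterations, converged = min_sum_decode(H, llrs)
```

In this small example the two checksum nodes connected to the weakly received bit both indicate that it should be a "0", and their combined result outweighs the weak channel value, so the hard decisions satisfy every parity check after a single iteration.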
Other iterative coding and decoding approaches are also known in the art. In general, each of these iterative decoding approaches generates an output that indicates the likely data value of each codeword bit, and also indicates a measure of confidence in that value for that bit (i.e., probability).
FIG. 2 is a prior art histogram 200 showing the number of iterations required to achieve convergence of 50,000 blocks of encoded data. In FIG. 2 the horizontal axis 210 represents the number of iterations to achieve convergence and the vertical axis 220 represents the number of blocks that achieved convergence at any particular number of iterations. As shown, the majority of the blocks converged during 4-6 iterations 230. The histogram also demonstrates that only a small portion of the total number of blocks require more than 15 iterations 250 to converge. While convergence performance continues to improve as the maximum number of iterations increases, raising this maximum comes at the cost of increased decoding complexity.
FIG. 3 is a logarithmic line graph 300 showing the relationship of block error rate and iterations to convergence, as is known in the prior art. As one would expect, the block error rate, represented on the vertical axis 320 of the graph, decreases as the number of iterations, represented on the horizontal axis 310, increases. One should note the logarithmic nature of the vertical axis, block error rate. Correlating this data with that of FIG. 2 shows that the rate of decrease in the error rate changes at a point 330 between approximately 6 and 7 iterations. This point is representative of the diminishing returns of excessive iterations. More iterations deliver a lower error rate but at a higher cost.
As mentioned above, iterative decoders can provide excellent performance at reasonable complexity from a circuit or software standpoint. However, the decoding delay, or latency, depends strongly on the number of decoding iterations that are performed. It is known, particularly for parallel concatenated convolutional codes (PCCCs), that this latency may be reduced by parallelizing the decoding functions. For example, for a two-stage decoder requiring five iterations, it is possible to implement ten actual binary convolutional code decoders, each of which operates on one-tenth of the Viterbi trellis for the decoding. It has been observed that such parallelization can provide essentially no performance loss, while greatly reducing the decoding latency in the system. However, the hardware required for such parallelization is substantial (e.g., 10× for this five-iteration example).
Accordingly, the architects of decoding systems are faced with optimizing a tradeoff among the factors of decoding performance (bit error rate), decoding latency or delay, and decoder complexity (cost). The number of iterations is typically determined by the desired decoder performance, following which one may trade off decoding delay against circuit complexity, for example by selecting a parallelization factor.