A fundamental problem in the field of data storage and communication is the evaluation of error-correcting codes (ECC). The general framework for the problem that our invention addresses is shown in FIG. 1. A source 110 intends to transmit a block of k bits, denoted by a vector u 111, to a destination 150. The source 110 adds redundancy bits to the source symbols by passing them through an encoder 120. The output of the encoder is a block of N bits, denoted by a vector x 121. The block of N bits passes through a channel 130, subject to noise 135, where the block is possibly corrupted into another block of N output symbols 131, denoted by a vector y. The output of the channel is then decoded 140 into a received block, denoted by vector v of k bits 141, for the destination 150.
Ideally, the received block 141 will match the transmitted block 111. However, under practical conditions, decoding failures will sometimes occur. A decoding failure called a block error occurs when at least one bit in the received block disagrees with a bit in the transmitted block. The block error rate is the probability that at least one bit of the transmitted block will be received in error, averaged over the probability distribution of transmitted blocks. In many circumstances, a better figure of merit is the bit error rate, which is the probability that any given bit will be received in error. A main objective of evaluating error-correcting codes is to determine their bit error rates.
Error-correcting codes described by sparse generalized parity-check matrices have recently been the subject of intense theoretical interest. These types of codes were first described by R. G. Gallager, in “Low-density parity check codes,” Vol. 21, Research Monograph Series, MIT Press, 1963, but were not properly appreciated until recently. In the last decade, however, a variety of improved codes defined by sparse generalized parity check matrices have been described, such as turbo-codes, irregular low-density parity check (LDPC) codes, Kanter-Saad codes, repeat-accumulate codes, and irregular repeat-accumulate codes.
These improved codes have three particularly noteworthy advantages. First, the codes can be decoded efficiently using message-passing iterative decoding methods, which are sometimes called “belief propagation” (BP) methods. Second, the performance of these codes can often be theoretically evaluated using a density evolution method, at least in the infinite block-length limit. Third, by using the density evolution method, it can be demonstrated that these codes are nearly optimal when decoded using BP. In particular, in the infinite block-length limit, the density evolution method typically demonstrates that BP decoding correctly recovers all data blocks transmitted at noise levels below some threshold level, and that threshold level is often not far from the Shannon limit. For a collection of reports on such codes and their associated BP decoding method, see the Special Issue on Codes on Graphs and Iterative Algorithms, IEEE Transactions on Information Theory, February, 2001.
The density evolution method dates back to R. G. Gallager's work, already cited above. The density evolution method was re-introduced in the context of the memory-less binary erasure channel (BEC) by M. Luby, M. Mitzenmacher, M. A. Shokrollahi, D. Spielman, and V. Stemann, in “Practical Loss-Resilient Codes,” Proceedings 29th Annual ACM Symposium on the Theory of Computing, 1997, pp. 150-159. The name “density evolution” was actually introduced when that method was generalized to other channels, including the memory-less binary symmetric channel (BSC), by T. Richardson and R. Urbanke in “The Capacity of Low-Density Parity Check Codes Under Message-Passing Decoding,” IEEE Trans. Inform. Theory, vol. 47, pp. 599-618, February 2001.
An important drawback of the density evolution method is that it only becomes exact when the graphical representation of the code, known as the Tanner graph or bipartite graph, has no cycles. Fortunately, it has been shown for a variety of codes that, in the limit where the block-length N of the code approaches infinity, the presence of cycles in the bipartite graph representation can be ignored, and the results of the density evolution method become exact. Because all the best-performing codes do have cycles in their bipartite graph representations, this means that, in practice, application of the density evolution method is restricted to some codes in the limit where their block-length N approaches infinity.
Given a method to measure the performance of a code, one can design error-correcting codes that optimize that performance. A preferred prior art way of designing improved codes that will be decoded with BP decoding has been to optimize certain classes of codes for the infinite block-length limit using the density evolution method, and hope that a scaled-down version still results in a near-optimal code. See, for example, T. Richardson and R. Urbanke, “Design of Capacity-Approaching Irregular Low-Density Parity-Check Codes,” IEEE Trans. Inform. Theory, vol. 47, pp. 619-637, February 2001.
The problem with this method is that even for very large block-lengths, such as blocks of length N≈10^4, one is still noticeably far from the infinite block-length limit. In particular, many decoding failures occur at noise levels far below the threshold level predicted by infinite block-length calculations. Furthermore, there may not necessarily even exist a way to scale down the codes derived from the density evolution method. For example, the best known irregular LDPC codes, at a given rate in the N→∞ limit, often have bits that participate in hundreds or even thousands of parity checks, which makes no sense when the overall number of parity checks is 100 or less.
Codes with intermediate block-length, for example block-length less than 10^4, are important for many applications. Therefore, there is a need for a practical method to directly evaluate the performance of arbitrary intermediate block-length error-correcting codes as decoded by BP decoding methods. Aside from its utility as a part of a code design method, such a method could be used for code verification. The performance of BP decoding of parity-check codes is currently normally judged by “Monte Carlo” simulations, which randomly generate thousands or millions of noisy blocks.
Unfortunately, such simulations become impractical as a code-verification technique when the decoding failure rate is required to be extraordinarily small, as in, for example, magnetic disk drive or fiber-optical channel applications. This is a serious problem in evaluating turbo-codes or LDPC codes, which often suffer from an “error-floor” phenomenon, which is hard to detect if the error-floor is at a sufficiently low decoding failure rate.
Furthermore, it is desired that the method can be used when the channel is a binary erasure channel or a binary symmetric channel. Therefore, those channels, parity-check codes, iterative decoding methods, and the density evolution method are described in greater detail.
The Binary Erasure Channel (BEC) and Binary Symmetric Channel (BSC)
A binary erasure channel (BEC) is a binary input channel with two input symbols, 0 and 1, and with three output symbols: 0, 1, and an erasure, which can be represented by a question mark “?.” A bit that passes through the channel will be received correctly with probability 1−x, and will be received as an erasure with probability x.
A binary symmetric channel (BSC) is a binary input channel with two input symbols, 0 and 1, and with two output symbols: 0, and 1. A bit will pass through the channel and be correctly received in its transmitted state with probability 1−x, and will be incorrectly inverted into the other state with probability x.
The method should be applicable to a memory-less version of the BEC and BSC. In a memory-less channel, each bit is erased or inverted independently of every other bit. Many practical channels are memory-less to a good approximation. In any case, the memory-less BEC and memory-less BSC are excellent test-beds for evaluating and designing new error-correcting codes, even when they are not always realistic practical models.
One could assume that the probability of erasure for the BEC or the inversion probability for the BSC is identical for every bit, because that is the normal realistic situation. However, it will be convenient to let the erasure probability or inversion probability depend explicitly on the bit position within the block. Thus, the bits in a block are indexed with the letter i, and the erasure probability in the BEC, or the inversion probability in the BSC, of the ith bit is taken to be x_i. The probability x_i for all the bits can ultimately be set equal at the end of the analysis, if so desired.
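For concreteness, the two channels can be sketched in a few lines of Python. This is only an illustration of the channel models described above; the function names, the list-based block representation, and the use of '?' for erasures are our own choices.

```python
import random

def bec(bits, x, rng=random):
    """Memory-less binary erasure channel: each bit is independently
    replaced by the erasure symbol '?' with probability x, and is
    otherwise received correctly."""
    return ['?' if rng.random() < x else b for b in bits]

def bsc(bits, x, rng=random):
    """Memory-less binary symmetric channel: each bit is independently
    inverted with probability x, and is otherwise received correctly."""
    return [b ^ 1 if rng.random() < x else b for b in bits]
```

Because each bit is treated independently, these models are memory-less; a per-bit probability x_i could be supported by passing a list of probabilities instead of a single x.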
Parity Check Codes
Linear block binary error-correcting codes can be defined in terms of a parity check matrix. In a parity check matrix A, the columns represent transmitted variable bits, and the rows define linear constraints, or checks, between the variable bits. More specifically, the matrix A defines a set of valid vectors, or codewords, z, such that each component of z is either 0 or 1, and

    Az = 0,  (1)

where all multiplications and additions are modulo 2.
If a parity check matrix has N columns and N−k rows, then the matrix usually defines an error-correcting code of block-length N, and transmission rate k/N. If some of the rows are linearly dependent, then some of the parity checks will be redundant and the code will actually have a higher transmission rate.
As shown in FIG. 2, there is a corresponding bipartite graph for each parity check matrix, see R. M. Tanner, “A recursive approach to low complexity codes,” IEEE Trans. Info. Theory, IT-27, pages 533-547, 1981. A Tanner graph is a bipartite graph with two types of nodes: variable nodes i (1-6) denoted by circles 201, and check nodes A, B, and C denoted by squares 202. In a bipartite graph, each check node is connected to all the variable nodes participating in the check. For example, the parity check matrix

    A = ( 1 1 0 1 0 0
          1 0 1 0 1 0    (2)
          0 1 1 0 0 1 )

is represented by the bipartite graph shown in FIG. 2.
In practical applications, the graphs representing codes typically include thousands of nodes connected in any number of different ways, and contain many loops (cycles). Evaluating codes defined by such graphs, or designing codes that perform optimally, is very difficult.
Error-correcting codes defined by parity check matrices are linear. This means that the modulo-2 sum of any two codewords is also a codeword. A code with k message bits has 2^k possible codewords, each of length N. For the example given above, with k=3, the eight codewords are 000000, 001011, 010101, 011110, 100110, 101101, 110011, 111000. Because of the linearity property, any of the codewords is representative, given that the channel is symmetric between inputs 0 and 1, as in the BEC or BSC. For the purposes of evaluating a code, it is normally assumed that the all-zeros codeword is transmitted.
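The codeword set of a small code can be recovered by brute force, directly from the defining condition Az=0 (mod 2). A minimal sketch (the function name and data layout are ours):

```python
from itertools import product

# Parity check matrix (2) from the text; rows are the checks A, B, C.
A = [
    [1, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 1, 0],
    [0, 1, 1, 0, 0, 1],
]

def is_codeword(A, z):
    """z is a codeword when every parity check sums to 0 modulo 2."""
    return all(sum(h * b for h, b in zip(row, z)) % 2 == 0 for row in A)

# Try all 2^N binary vectors; exactly 2^k of them satisfy the checks.
codewords = [z for z in product([0, 1], repeat=6) if is_codeword(A, z)]
```

For matrix (2), N=6 and the three checks are independent, so the brute-force search yields 2^3 = 8 codewords.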
Generalized Parity Check Matrices
Generalized parity check matrices define many of the modern error-correcting codes, such as turbo-codes, Kanter-Saad codes, and repeat-accumulate codes. In a generalized parity check matrix, additional columns are added to a parity check matrix to represent “hidden” variable nodes. Hidden variable nodes participate in parity checks and help constrain the possible codewords of a code, but they are not sent through the channel. Thus, the receiver of a block must decode the bit values without any direct information; in a sense, one can consider all hidden nodes to arrive “erased.” The advantage of hiding variable nodes is that one improves the transmission rate of the code. A good notation for the hidden state variables is a horizontal line above the corresponding columns in the parity-check matrix, e.g., one can write

    A = ( 1̄ 1 0 1 0 0
          1 0 1 0 1 0    (47)
          0 1 1 0 0 1 )

to indicate a code where the first variable node (marked by the overbar) is a hidden node.
To indicate that a variable node is a hidden node, an open circle is used, rather than a filled-in circle. Such a graph, which generalizes bipartite graphs, is called a “Wiberg graph,” see N. Wiberg, “Codes and decoding on general graphs,” Ph.D. Thesis, University of Linköping, 1996, and N. Wiberg et al., “Codes and iterative decoding on general graphs,” Euro. Trans. Telecomm., Vol. 6, pages 513-525, 1995.

Iterative Message-Passing Decoding
It is desired to provide a method for evaluating the performance of an iterative message-passing decoder for a code where each possible state of each message can take on a finite number of discrete values. Good examples of such decoding methods are the belief propagation (BP) decoding method for the BEC, and the “Gallager A” decoding method for the BSC, described in greater detail below. Other examples of such decoders are the quantized belief propagation decoders described in detail by T. Richardson and R. Urbanke in “The Capacity of Low-Density Parity Check Codes Under Message-Passing Decoding,” IEEE Trans. Inform. Theory, vol. 47, pp. 599-618, February 2001.
Belief Propagation Decoding in the BEC
It is important to note that the BEC never inverts bits from 0 to 1, or vice versa. If the all-zeros codeword is transmitted, the received word must therefore consist entirely of zeros and erasures. For the case of the BEC, BP decoding works by passing discrete messages between the nodes of the bipartite graph. Each variable node i sends a message m_ia to each connected check node a. The message represents the state of the variable node i. In general, the message can be in one of three states: 1, 0, or ?, but because the all-zeros codeword is always transmitted, the possibility that m_ia has a bit value of one can be ignored.
Similarly, there is a message m_ai sent from each check node a to all the variable nodes i connected to the check node. These messages are interpreted as directives from the check node a to the variable node i about what state the variable node should be in. This message is based on the states of the other variable nodes connected to the check node. The check-to-bit messages can, in principle, take on the bit values 0, 1, or ?, but again only the two messages 0 and ? are relevant when the all-zeros codeword is transmitted.
In the BP decoding process for the BEC, a message m_ia from a variable node i to a check node a is equal to a non-erasure received message, because such messages are always correct in the BEC, or to an erasure when all incoming messages are erasures. A message m_ai from a check node a to a variable node i is an erasure when any incoming message from another node participating in the check is an erasure. Otherwise, the message takes the value of the modulo-2 sum of all incoming messages from other nodes participating in the check.
BP decoding is iterative. The iterations are indexed by an integer t, which must be greater than or equal to one. At the first iteration, when t=1, the variable-to-check node messages are initialized so that all variable nodes that are not erased by the channel send out messages equal to the corresponding received bit. Then, the check-to-variable messages are determined by the standard rule mentioned above. At the end of the first iteration, a variable node can be considered decoded if any of its incoming messages is a non-erasure. Such messages must always be correct, so the bit is decoded to the value indicated by the message.
At each subsequent iteration, one first updates all the messages from variable nodes to check nodes, and then one updates all the messages from check nodes to variable nodes, and then checks each bit to see whether it has been decoded. One stops iterating when some criterion is reached, for example, after a fixed number of iterations, or after the messages converge to stationary states. For the particularly simple BEC, messages can only change under the BP decoding process from erasure messages to non-erasure messages, so the iterative decoding process must eventually converge.
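On the BEC, the message-passing rules above are equivalent to a simpler “peeling” procedure: whenever a check has exactly one erased neighbor, the non-erased neighbors determine that bit as their modulo-2 sum. A minimal sketch of BP decoding in that equivalent formulation (the function name and data layout are our own illustration):

```python
def bp_decode_bec(checks, received, max_iter=20):
    """BP (peeling) decoding on the BEC.

    checks:   list of checks, each a list of variable indices.
    received: list of channel outputs, each 0, 1, or '?' (erasure).
    Returns the decoded block, with '?' left for undecidable bits.
    """
    bits = list(received)
    for _ in range(max_iter):
        progress = False
        for var_idxs in checks:
            erased = [i for i in var_idxs if bits[i] == '?']
            if len(erased) == 1:
                # All other bits in this check are known, so the erased
                # bit must equal the mod-2 sum of the others.
                i = erased[0]
                bits[i] = sum(bits[j] for j in var_idxs if j != i) % 2
                progress = True
        if not progress:
            break  # messages have converged to stationary states
    return bits
```

As the text notes, messages on the BEC can only change from erasure to non-erasure, which is why the loop above is guaranteed to terminate.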
The “Gallager A” Decoding Method for the BSC
The “Gallager A” decoding method for the BSC was first described by R. G. Gallager, in “Low-density parity check codes,” Vol. 21, Research Monograph Series, MIT Press, 1963. It works as follows. As in BP decoding for the BEC, there are two classes of messages: messages from variable nodes to check nodes, and messages from check nodes to variable nodes. However, the meaning of the messages is slightly different.
The decoding method is initialized by each variable node sending a message to every connected check node. The message is 0 or 1, depending on the bit value that was received through the channel. In turn, each check node sends a message to its connected variable nodes. This message is 0 or 1, and is interpreted as a directive about the state that the variable node should be in. In particular, the message is the modulo-2 sum of the messages that the check node receives from the other variable nodes to which it is connected.
In further iterations of the Gallager A decoding method, each variable node continues to send the received bit value to the connected check nodes, unless the variable node receives sufficient contradictory messages. In particular, if all the other connected check nodes, aside from the check node that is the recipient of the message, send the variable node a message that contradicts the bit value it received from the channel, then the variable node instead sends the contradicting bit value indicated by those check nodes.
The Gallager A decoding method is iterated until some criterion, like a fixed number of iterations, is reached. At every iteration, each bit is decoded to the bit value that the variable node receives from the channel, unless all the incoming messages from the connected check nodes agree on the other bit value.
Density Evolution
Density evolution is a method for evaluating a parity check code that uses iterative message-passing decoding as described above. Specifically, density evolution can determine the average bit error rate of a code. Density evolution is now described for the case of BP decoding in the BEC. Similar density evolution methods have been derived for other iterative message-passing decoders in which each message can only be in a finite number of discrete states, for example, the Gallager A decoding method, as described above, or the quantized belief propagation decoders. In general, the density evolution methods are represented as sets of rules relating the probabilities that each of the messages used in the decoder is in each of its states.
For the case of the density evolution method for BP decoding in the BEC, one considers the probability, averaged over all possible received blocks, that each message is an erasure. The iterations are indexed by an integer t. A real number p_ia(t), which represents the probability that a message m_ia is an erasure at iteration t, is associated with each message m_ia from variable nodes to check nodes. Similarly, a real number q_ai(t), which represents the probability that the message m_ai is an erasure at iteration t, is associated with each message m_ai from check nodes to variable nodes. In the density evolution method, the probabilities p_ia(t) and q_ai(t) are determined in a way that is exact, as long as the bipartite graph representing the error-correcting code has no loops.
A “rule” that determines the probability p_ia(t) is

    p_ia(t+1) = x_i ∏_{b∈N(i)\a} q_bi(t),  (3)

where b∈N(i)\a represents all check nodes directly connected to the variable node i, except for the check node a. Note that in density evolution, this rule includes operands x and q, and a multiplication operator.
This rule can be derived from the fact that for a message m_ia to be an erasure, the variable node i must be erased during transmission, and all incoming messages from other check nodes must be erasures as well. Of course, if the incoming messages q_bi(t) are statistically dependent, then the rule is not correct. However, in the density evolution method, such dependencies are systematically ignored. In a bipartite graph with no loops, each incoming message is in fact independent of all other messages, so the density evolution method is exact.
Similarly, the rule

    q_ai(t) = 1 − ∏_{j∈N(a)\i} (1 − p_ja(t))  (4)

can be derived from the fact that a message m_ai can only be in a non-erasure state when all incoming messages are in a non-erasure state, again ignoring statistical dependencies between the incoming messages p_ja(t).
The density evolution rules (3) and (4) are evaluated by iteration. The appropriate initialization is p_ia(t=1) = x_i for all messages from variable nodes to check nodes. At each iteration t, it is possible to determine b_i(t), which is the probability of a failure to decode at variable node i, from the rule

    b_i(t) = x_i ∏_{a∈N(i)} q_ai(t).  (5)

In other words, the rules (3, 4, and 5) enable one to evaluate the code in terms of its bit error rate.

Exact Solution of a Small Code
As stated above, the density evolution rules (3, 4, and 5) are exact when the code has a bipartite graph representation without loops. It is very important to understand that the density evolution rules are not exact when a bipartite graph represents a practical code that does have loops, because in that case, the BP messages are not independent, in contradiction with the assumptions underlying rules (3, 4, and 5).
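The rules (3), (4), and (5) can be iterated directly on any bipartite graph. A minimal sketch (the function name and adjacency-list layout are ours), exact only when the graph is loop-free:

```python
from math import prod

def density_evolution(var_nbrs, chk_nbrs, x, n_iter):
    """Iterate the density evolution rules (3), (4), and (5).

    var_nbrs[i]: checks connected to variable i.
    chk_nbrs[a]: variables connected to check a.
    x[i]:        erasure probability of bit i.
    Returns b[i], the decoding-failure probability of each bit,
    exact when the bipartite graph has no loops.
    """
    # Initialization: p_ia(t=1) = x_i for every edge (i, a).
    p = {(i, a): x[i] for i, nbrs in enumerate(var_nbrs) for a in nbrs}
    q = {}
    for _ in range(n_iter):
        # Rule (4): q_ai = 1 - prod over j in N(a)\i of (1 - p_ja).
        for a, nbrs in enumerate(chk_nbrs):
            for i in nbrs:
                q[(a, i)] = 1.0 - prod(1.0 - p[(j, a)] for j in nbrs if j != i)
        # Rule (3): p_ia = x_i * prod over b in N(i)\a of q_bi.
        for i, nbrs in enumerate(var_nbrs):
            for a in nbrs:
                p[(i, a)] = x[i] * prod(q[(b, i)] for b in nbrs if b != a)
    # Rule (5): b_i = x_i * prod over a in N(i) of q_ai.
    return [x[i] * prod(q[(a, i)] for a in nbrs)
            for i, nbrs in enumerate(var_nbrs)]
```

Applied to the loop-free code of matrix (6) with all x_i = x, two iterations of this sketch reproduce the exact bit error rates derived in the text.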
Consider, as an example of a bipartite graph with no loops, the error-correcting code defined by the parity check matrix

    A = ( 1 1 0 0
          0 1 1 1 )    (6)

and represented by a corresponding bipartite graph shown in FIG. 3. This code has four codewords: 0000, 0011, 1101, and 1110. If the 0000 message is transmitted, then there are sixteen possible received messages: 0000, 000?, 00?0, 00??, 0?00, and so on. The probability of receiving a message with n_e erasures is x^(n_e) (1−x)^(4−n_e), where we have taken all the x_i to be equal to the same value x.
It is easy to determine the exact probability that a given bit remains an erasure after t iterations of decoding have completed, by summing over the decoding results for all the sixteen possible received messages, weighted by their probabilities. For example, after decoding to convergence, the first bit will only fail to decode to a 0 when one of the following messages is received: ???0, ??0?, or ????, so the exact probability that the first bit will not decode to a 0 is 2x^3(1−x) + x^4 = 2x^3 − x^4.
If the focus is on the last bit, then the bit will ultimately be correctly decoded, unless one of the following messages is received: 00??, 0???, ?0??, ??0?, or ????. Therefore, the overall probability that the fourth bit is not correctly decoded is x^2(1−x)^2 + 3x^3(1−x) + x^4 = x^2 + x^3 − x^4.
In the density evolution method, applied to this code, the values for the variables p_11(t), p_21(t), p_22(t), p_32(t), p_42(t), q_11(t), q_12(t), q_22(t), q_23(t), q_24(t), b_1(t), b_2(t), b_3(t), b_4(t) are determined by

    p_11(t) = x  (7)
    p_21(t+1) = x q_22(t)  (8)
    p_22(t+1) = x q_12(t)  (9)
    p_32(t) = x  (10)
    p_42(t) = x  (11)
    q_11(t) = p_21(t)  (12)
    q_12(t) = p_11(t)  (13)
    q_22(t) = 1 − (1 − p_32(t))(1 − p_42(t))  (14)
    q_23(t) = 1 − (1 − p_22(t))(1 − p_42(t))  (15)
    q_24(t) = 1 − (1 − p_22(t))(1 − p_32(t))  (16)

and

    b_1(t) = x q_11(t)  (17)
    b_2(t) = x q_12(t) q_22(t)  (18)
    b_3(t) = x q_23(t)  (19)
    b_4(t) = x q_24(t)  (20)

with the initial conditions that

    p_11(t=1) = p_21(t=1) = p_22(t=1) = p_32(t=1) = p_42(t=1) = x.  (21)
Solving these rules yields the exact bit error rates at every iteration. For example, one can find that b_4(t=1) = 2x^2 − x^3 and b_4(t≥2) = x^2 + x^3 − x^4. These results correspond to the fact that the 00??, 0?0?, 0???, ?0??, ??0? and ???? messages will not be decoded to a zero at the fourth bit after the first iteration, so that the probability of decoding failure at the fourth bit is 2x^2(1−x)^2 + 3x^3(1−x) + x^4 = 2x^2 − x^3; but after two or more iterations, the 0?0? message is decoded correctly, so the probability of decoding failure at the fourth bit is x^2(1−x)^2 + 3x^3(1−x) + x^4 = x^2 + x^3 − x^4.
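The closed-form results for the fourth bit can be checked by iterating the rules for this small code numerically. A minimal sketch (the helper name b4 is ours); only the messages that feed the fourth bit need to be tracked, because the other variable-to-check messages stay fixed at x:

```python
def b4(x, n_iter):
    """Decoding-failure probability of the fourth bit after n_iter
    iterations of density evolution on the small code (6)."""
    p22 = x                              # initial condition: all messages start at x
    for _ in range(n_iter):
        q12 = x                          # check 1 -> var 2, since p11(t) = x
        q24 = 1 - (1 - p22) * (1 - x)    # check 2 -> var 4; the other incoming
                                         # message comes from a degree-1 variable
        p22 = x * q12                    # var 2 -> check 2
    return x * q24                       # failure probability of bit 4
```

At x = 0.5 this reproduces b_4(t=1) = 2x^2 − x^3 and b_4(t≥2) = x^2 + x^3 − x^4, matching the exhaustive enumeration over the sixteen received messages.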
The Large Block-Length Limit
If all local neighborhoods in the bipartite graph are identical, the density evolution rules can be simplified. For example, consider a regular Gallager code, which is represented by a sparse random parity check matrix characterized by the restriction that each row contains exactly d_c ones, and each column contains exactly d_v ones. In that case, it can be assumed that all the p_ia(t) are equal to the same value p(t), all the q_ai(t) are equal to the same value q(t), and all the b_i(t) are equal to the same value b(t). Then,

    p(t+1) = x q(t)^(d_v−1)  (22)
    q(t) = 1 − (1 − p(t))^(d_c−1)  (23)

and

    b(t) = x q(t)^(d_v),  (24)

which are the density evolution rules for (d_v, d_c) regular Gallager codes, valid in the N→∞ limit.
The intuitive reason that these rules are valid in the infinite block-length limit is that, as N→∞, the size of typical loops in the bipartite graph representation of a regular Gallager code goes to infinity. As a result, all incoming messages to a node are independent, and a regular Gallager code behaves like a code defined on a graph without loops. Solving rules (22, 23, and 24) for specific values of d_v and d_c yields the solution p(t→∞) = q(t→∞) = b(t→∞) = 0 below a critical erasure probability x_c, known as the “threshold.” This means that decoding is perfect for erasure probabilities below the threshold. Above x_c, b(t→∞) has a non-zero limit, which corresponds to decoding failures. The value x_c is easy to determine numerically. For example, if d_v=3 and d_c=5, then x_c≈0.51757.
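The threshold can be located numerically by iterating rules (22) and (23) at a trial value of x and bisecting on x. The sketch below is our own illustration; the iteration count, convergence tolerance, and bisection depth are arbitrary choices:

```python
def decodes_to_zero(x, dv, dc, n_iter=2000, tol=1e-12):
    """Iterate rules (22) and (23) for a (dv, dc) regular Gallager code
    and report whether the erasure probability p(t) is driven to zero."""
    p = x                                # initialization p(t=1) = x
    for _ in range(n_iter):
        q = 1 - (1 - p) ** (dc - 1)      # rule (23)
        p = x * q ** (dv - 1)            # rule (22)
        if p < tol:
            return True
    return False

def threshold(dv, dc, n_steps=40):
    """Bisect for the largest channel parameter x at which density
    evolution still drives the erasure probability to zero."""
    lo, hi = 0.0, 1.0
    for _ in range(n_steps):
        mid = (lo + hi) / 2
        if decodes_to_zero(mid, dv, dc):
            lo = mid
        else:
            hi = mid
    return lo
```

Bisection is valid here because decodes_to_zero is monotone in x: raising the erasure probability can only make every iterate p(t) larger. For d_v=3 and d_c=5, this procedure recovers x_c≈0.51757, as quoted above.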
These determinations of the threshold at infinite block-length can be generalized to irregular Gallager codes, or to other codes, like irregular repeat-accumulate codes, that have a finite number of different classes of nodes with different neighborhoods. In this generalization, one can derive a system of rules, typically with one rule for the messages leaving each class of node. By solving the system of rules, one can again find a critical threshold x_c, below which decoding is perfect. Such codes can thus be optimized, in the N→∞ limit, by finding the code that has the maximal noise threshold x_c. This is the way the prior-art density evolution method has been utilized to evaluate error-correcting codes.
Drawbacks of the Density Evolution Method
Unfortunately, the conventional density evolution method is erroneous for codes with finite block-lengths whose graphical representation has loops. One might think that it is possible to solve rules (3, 4 and 5) for any finite code, and hope that ignoring the presence of loops is not too important a mistake. However, this does not work out, as can be seen by considering regular Gallager codes. Rules (3, 4, and 5) for a finite block-length regular Gallager code have exactly the same solutions as one would find in the infinite-block-length limit, so one would not predict any finite-size effects. However, it is known that the real performance of finite-block-length regular Gallager codes is considerably worse than that predicted by such a naive method. In practice, the application of the density evolution method is limited to codes whose block-length is very large, so that the magnitude of the error is not too great.
Therefore, there is a need for a method that correctly evaluates finite block-length error-correcting codes and does not suffer from the problems of the prior art evaluation methods.