The present invention relates to iterative decoding and, more particularly, to a fast method of detecting when to terminate the decoding, either because the iterations have converged to a valid codeword or because the iterations are unlikely to converge to a valid codeword, and to a decoder for implementing the method.
Flash memory has become increasingly popular in recent years. Flash memory is used in numerous applications, including mobile phones, digital cameras and MP3 players. A major emerging application is the use of flash memory as a Solid State Disc (SSD). In order to be cost efficient, it is desirable to implement such memories using high density Multi-Level Cell (MLC) memories, and to minimize the redundancy that is needed for ensuring data reliability and integrity. This requires the use of advanced Error Correction Coding (ECC) schemes, such as state of the art iterative coding schemes based on Low-Density Parity-Check (LDPC) or Turbo codes.
Error correction codes are commonly used in memories in order to ensure data reliability and integrity, by dealing with errors that are introduced by the physical medium during its programming or reading or during the storage time. An error correction code is a set of codewords that satisfy a given set of constraints. One commonly used class of error correction codes is the class of binary linear block codes, in which the code is defined through a set of parity-check constraints on the codeword bits. In other words, a binary linear block code is defined by a set of linear equations over the two-element field GF(2) that a valid codeword should satisfy. The set of linear equations can be conveniently described via a parity-check matrix H of M rows, such that each row of the matrix defines one parity-check constraint and a word C constitutes a valid codeword if and only if H·C=0 (over GF(2)). The vector S=H·C is commonly known as the syndrome vector associated with the word C. In the appended claims, this syndrome is called the “error correction” syndrome to distinguish it from a different syndrome, the “CRC” or “checksum” syndrome, that is defined below. Each element of the syndrome vector is associated with one of the parity check equations, and the value of the element is 0 for an equation that is satisfied by C and 1 for an equation that is not satisfied by C. The elements of the syndrome vector also are called “bits” of the syndrome vector herein. The syndrome weight (WS) is the number of unsatisfied equations represented by the syndrome vector S. So, for a word to be a valid codeword the syndrome vector associated with the word must be all zeros and its syndrome weight must be 0.
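The syndrome vector and syndrome weight defined above can be illustrated with a minimal Python sketch. The matrix H below is a parity-check matrix of a (7,4) Hamming code, chosen only for illustration; it and the function names are assumptions, not part of the description above.

```python
def syndrome(H, c):
    """Return S = H·c over GF(2). H is a list of rows, c a list of bits."""
    return [sum(h & b for h, b in zip(row, c)) % 2 for row in H]

def syndrome_weight(s):
    """WS: the number of unsatisfied parity-check equations."""
    return sum(s)

# Illustrative parity-check matrix with M = 3 rows (a (7,4) Hamming code).
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

valid_word = [1, 0, 1, 1, 0, 1, 0]      # satisfies all three equations
corrupted_word = [0, 0, 1, 1, 0, 1, 0]  # first bit flipped

s_valid = syndrome(H, valid_word)          # all zeros, so WS = 0
s_corrupted = syndrome(H, corrupted_word)  # equations 1 and 2 fail, so WS = 2
```

The flipped bit participates in the first two equations only, so exactly those two syndrome bits become 1.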
State of the art error correction codes are based on iterative coding schemes, such as LDPC and Turbo codes. In iterative coding schemes, decoding is performed using an iterative algorithm that iteratively updates its estimates of the codeword bits until the algorithm converges to a valid codeword. The iteratively updated estimates can be either “hard” estimates (1 vs. 0) or “soft” estimates, which are composed of an estimate of the bit's value (1 or 0), together with some reliability measure of the estimate indicating the probability that the estimated value is correct. The most commonly used soft estimate is the Log Likelihood Ratio (LLR), the logarithm of the ratio of the probability of the bit being 0 to the probability of the bit being 1. A positive LLR means that the bit is estimated to be more likely to be 0 than 1. A negative LLR means that the bit is estimated to be more likely to be 1 than 0. The absolute value of the LLR is an indication of the certainty of the estimate. In the appended claims, that an estimate of a bit “flips” means that the value of the bit estimate changes: for example, a hard estimate changes from 0 to 1 or from 1 to 0, or the sign of a LLR changes from positive to negative or from negative to positive. (Similarly, in the appended claims, “flipping” a bit of a syndrome vector means changing the bit from 1 to 0 or from 0 to 1.) The decoder is initialized with initial a-priori (possibly “soft”) estimates of the bits. These estimates are then processed and updated iteratively. The decoding can terminate after a fixed number of iterations. Alternatively, a convergence detection mechanism can terminate the decoding once all the parity check constraints are satisfied by the current bit estimates.
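As an illustration of soft estimates and of what it means for an estimate to “flip”, the following Python sketch computes an LLR from the probability that a bit is 0 and derives the hard estimate from the sign of the LLR. The function names are hypothetical, and mapping an LLR of exactly zero to the hard estimate 0 is an assumption made only to break the tie.

```python
import math

def llr_from_probability(p0):
    """LLR = log(P(bit = 0) / P(bit = 1)), for 0 < p0 < 1."""
    return math.log(p0 / (1.0 - p0))

def hard_estimate(llr):
    """Positive LLR: bit more likely 0. Negative LLR: bit more likely 1.
    An LLR of exactly 0 is mapped to 0 here (assumption)."""
    return 0 if llr >= 0 else 1

def has_flipped(old_llr, new_llr):
    """A bit estimate flips when its hard value (the LLR sign) changes."""
    return hard_estimate(old_llr) != hard_estimate(new_llr)

confident_zero = llr_from_probability(0.9)  # positive LLR
confident_one = llr_from_probability(0.1)   # negative LLR
undecided = llr_from_probability(0.5)       # LLR of zero: no information
```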
Another option for early decoding termination is by a “divergence” detection mechanism, which detects that the probability for decoder convergence is low and hence it is more efficient to terminate the current decoding attempt and retry decoding after updating the decoder initialization values. One option for performing such divergence detection is based on the current number of unsatisfied parity-check constraints being too high. Another option for divergence detection is based on the evolution of the number of unsatisfied parity-checks during decoding. In case of such early termination, the decoding may be repeated with updated initialization values, after changing certain parameters, such as the memory reading thresholds or reading resolution, such that the probability of successful decoding convergence in the repeated attempt is increased.
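The two divergence detection options mentioned above might be sketched as follows. The text does not specify concrete criteria, so the threshold `max_weight`, the window length `window` and the "no improvement" rule are all hypothetical choices made for illustration.

```python
def diverging_by_weight(syndrome_weight, max_weight):
    """Option 1: the current number of unsatisfied parity checks is too high.
    max_weight is a hypothetical tuning parameter."""
    return syndrome_weight > max_weight

def diverging_by_evolution(weight_history, window):
    """Option 2: look at how the syndrome weight evolves during decoding.
    Here (assumption) divergence is declared when the weight has not dropped
    below its earlier value during the last `window` iterations."""
    if len(weight_history) <= window:
        return False  # not enough history yet
    reference = weight_history[-window - 1]
    return min(weight_history[-window:]) >= reference

stalled = diverging_by_evolution([40, 30, 31, 30, 32], window=3)
improving = diverging_by_evolution([40, 30, 20, 10], window=3)
```

In the first history the weight never falls below 30 over the last three iterations, so the sketch reports divergence; in the second it keeps decreasing, so decoding would continue.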
Referring now to the drawings, FIG. 1 shows a block diagram of an exemplary iterative decoder 10. The initial bit estimates are stored in a bit estimates RAM 12. A ROM 14 is used for storing the code description. For example, ROM 14 may store which bits participate in each parity check constraint (i.e. ROM 14 stores the parity check matrix H that defines the code). The bit estimates are read from bit estimates RAM 12 through a routing layer 16 into several processing units 18. Code description ROM 14 controls the routing of the bit estimates into processing units 18. Processing units 18 update the bit estimates based on the parity-check constraints that the bits should satisfy. A scratchpad RAM 20 may be used by processing units 18 for storing temporary data required for updating the bit estimates. The updating of the bit estimates is done iteratively, one or more bit estimates at a time, where an iteration may involve updating the bit estimates based on all the parity-check constraints that the bit estimates should satisfy (i.e. “traversing” code description ROM 14 once). Decoding can terminate after a predetermined number of iterations or according to a convergence signal generated by a convergence detection block 22, once convergence detection block 22 detects that all the parity check constraints are satisfied by the current bit estimates (for example, by testing whether the syndrome weight is zero).
More formally, a decoding “iteration” is defined herein as considering each of the parity-check equations that define the code, and updating the estimates of the codeword bits that are associated with each parity-check equation, according to a certain schedule, until all the parity check equations have been considered. For example, LDPC decoding usually is formulated as message passing among the nodes of a “Tanner graph” whose edges connect nodes that represent the codeword bits with nodes that represent parity-checks that the codeword bits should satisfy. Examples of message-passing schedules for LDPC decoding on a Tanner graph include the following:
1. Traverse all the parity-check nodes, passing messages from each parity-check node to the codeword bit nodes to which that parity-check node is connected by edges of the graph. Update the codeword bit estimates according to the messages received at the codeword bit nodes. Then traverse all the codeword bit nodes, passing messages from each codeword bit node to the parity-check nodes to which that codeword bit node is connected by edges of the graph. Update the parity-check bit estimates according to the messages received at the parity-check nodes.
2. Traverse all the codeword bit nodes, passing messages from each codeword bit node to the parity-check nodes to which that codeword bit node is connected by edges of the graph. Update the parity-check bit estimates according to the messages received at the parity-check nodes. Then traverse all the parity-check nodes, passing messages from each parity-check node to the codeword bit nodes to which that parity-check node is connected by edges of the graph. Update the codeword bit estimates according to the messages received at the codeword bit nodes.
3. Traverse all the parity-check nodes. At each parity-check node, pass messages to the parity-check node from the codeword bit nodes that are connected to that parity-check node by edges of the graph, update the parity-check bit estimate according to the messages received at the parity-check node, send messages back from the parity-check node to those codeword bit nodes, and update the codeword bit estimates at those codeword bit nodes according to the messages received from the parity-check node.
4. Traverse all the codeword bit nodes. At each codeword bit node, pass messages to the codeword bit node from the parity-check nodes that are connected to that codeword bit node by edges of the graph, update the codeword bit estimate according to the messages received at the codeword bit node, send messages back from the codeword bit node to those parity-check nodes, and update the parity-check bit estimates at those parity-check nodes according to the messages received from the codeword bit node.
As defined herein, an “iteration” is not over until its associated schedule has been completed.
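As a rough illustration of one iteration under schedule 3, the following Python sketch visits the parity-check nodes one after another and updates the LLRs of the connected bits at each node. The min-sum update rule, the data layout and all names are assumptions made for the example; the schedules above do not prescribe a particular message computation. Every parity check is assumed to involve at least two bits.

```python
def serial_check_iteration(checks, llr, check_msgs):
    """One iteration: visit every parity-check node once, in order.

    checks[m]     : list of bit indices participating in parity check m
    llr[n]        : current soft estimate of bit n (positive: more likely 0)
    check_msgs[m] : last message sent from check m to each of its bits
    """
    for m, bits in enumerate(checks):
        # Bit-to-check messages: current estimate minus this check's old message.
        q = [llr[n] - check_msgs[m][i] for i, n in enumerate(bits)]
        for i, n in enumerate(bits):
            others = q[:i] + q[i + 1:]
            # Min-sum rule (assumption): sign from the parity of the other
            # bits, magnitude from the least reliable of them.
            sign = -1.0 if sum(x < 0 for x in others) % 2 else 1.0
            r = sign * min(abs(x) for x in others)
            check_msgs[m][i] = r
            llr[n] = q[i] + r  # updated codeword bit estimate
    return llr

# Toy code: three bits that must all be equal (b0 = b1 and b1 = b2).
checks = [[0, 1], [1, 2]]
llr = [2.0, -1.0, 2.0]  # the middle bit was read, weakly, as 1
check_msgs = [[0.0] * len(bits) for bits in checks]

serial_check_iteration(checks, llr, check_msgs)
hard = [0 if x >= 0 else 1 for x in llr]
```

In this toy case a single iteration flips the weak middle estimate, and all three hard estimates agree afterwards.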
Flash memories intended for applications such as SSD and mobile devices require very high random I/O performance. During reading, this implies the use of very fast ECC decoders. In order to achieve fast decoding in iterative coding schemes, a fast convergence detection apparatus is needed. The advantage of using convergence detection block 22 is that early termination shortens the decoding time and lowers the energy consumption of decoder 10.
One common method for convergence detection in iterative decoders is to compute the syndrome vector S=H·Ĉ at the end of each decoding iteration (where Ĉ is the vector of bit estimates at the end of the iteration) and check whether all the parity-checks are satisfied (i.e. whether the syndrome weight is zero). The disadvantage of this approach is that dedicated processing is done at the end of each iteration in order to compute the syndrome vector. This prolongs the decoding time and consumes time that could otherwise be spent on decoding iterations.
Another approach, commonly used in iterative decoders that are based on serial schedules in which the parity-check equations of the code are processed one after another, is to perform semi-on-the-fly convergence detection. According to this approach, a counter holding the number of consecutive satisfied parity-checks is maintained. At the beginning of decoding this counter is set to zero. During decoding the code's parity-checks are traversed serially and iteratively and the bit estimates are updated based on each of the parity-checks. As part of this decoding process, the syndrome bit of each parity-check is computed when the parity-check is traversed. If the syndrome bit is zero (i.e. the parity-check is satisfied) then the counter is incremented; otherwise, the counter is reset to zero. The counter is also reset to zero each time one of the codeword bit estimates changes value, because the previously computed syndrome bits are then no longer valid. Once the counter reaches M (recall that M is the number of parity-check equations that the codeword should satisfy, i.e. the number of rows of H), convergence is detected and decoding is terminated.
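The counter logic described above can be sketched in Python as follows. The class and method names are hypothetical, and the decoder itself is abstracted into a sequence of events, one per traversed parity check. Treating a bit flip that occurs while a check is processed as resetting the counter without counting that check is a conservative assumption, since the text does not fix the order of the two operations.

```python
class SemiOnTheFlyDetector:
    """Counts consecutive satisfied parity checks with no bit flip in between."""

    def __init__(self, num_checks):
        self.num_checks = num_checks  # M, the number of parity-check equations
        self.counter = 0              # set to zero at the start of decoding

    def on_check_processed(self, syndrome_bit, bit_flipped):
        """Call once per traversed parity check; returns True on convergence."""
        if syndrome_bit != 0 or bit_flipped:
            # Unsatisfied check, or earlier syndrome bits are no longer valid.
            self.counter = 0
        else:
            self.counter += 1
        return self.counter >= self.num_checks

# Hypothetical trace for M = 3: (syndrome bit, did any bit estimate flip?)
events = [(1, False), (0, True), (0, False), (0, False), (0, False)]
detector = SemiOnTheFlyDetector(num_checks=3)
results = [detector.on_check_processed(s, f) for s, f in events]
first_detection = results.index(True)
```

In the trace the last bit flip happens at the second event, yet convergence is reported only at the fifth event, after M = 3 further parity checks have been traversed.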
This semi-on-the-fly convergence detection mechanism is very simple. However, its drawback is that it provides delayed convergence detection, as it detects the convergence a full iteration after the decoder has converged to a valid codeword. The reason is that every bit flip resets the counter, so M consecutive satisfied parity-checks must still be counted after the last bit flips. In a high-error scenario, such as decoding data read from a flash memory long after the data were stored and/or after the flash memory has endured many write/erase cycles, several iterations (e.g. ten or more iterations) normally are required for convergence, so adding one more iteration after the last bit flips adds no more than about 10% to the convergence time. However, in low-error environments such as a fresh flash memory, one or two iterations normally suffice for decoding a codeword, so that adding a full iteration after the last bit flip can add 50% to 100% to the decoding time.
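The percentages quoted above follow from dividing the one extra iteration by the number of iterations actually needed for convergence. A minimal Python sketch of that arithmetic, with a hypothetical function name and assuming all iterations take equal time:

```python
def extra_iteration_overhead(iterations_to_converge):
    """Relative decoding-time increase caused by one additional full iteration."""
    return 1.0 / iterations_to_converge

high_error = extra_iteration_overhead(10)  # 0.1, i.e. 10%
two_iterations = extra_iteration_overhead(2)  # 0.5, i.e. 50%
one_iteration = extra_iteration_overhead(1)   # 1.0, i.e. 100%
```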