The present invention relates to encoding and, more particularly, to methods and devices for encoding data using an innovative parity-check matrix structure.
Consider a linear block code defined by a generator matrix Γ. To encode an information vector b, b is right-multiplied by Γ to produce a codeword vector c:
c = bΓ  (1)
Associated with Γ are one or more parity-check matrices H that satisfy the matrix equation
Hc = 0  (2)
for all the codeword vectors c of the code, i.e. a vector c belongs to the code if and only if the vector satisfies equation (2). Typically, Γ, H and c are defined over the field GF(2), i.e. the elements of Γ, H and c are 0 or 1, and the addition of field elements is done as integer addition modulo 2.
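As an illustration of equations (1) and (2), the following minimal sketch encodes an information vector over GF(2) and checks the result against a parity-check matrix. The matrices `G_EXAMPLE` and `H_EXAMPLE` (a systematic (7,4) Hamming code) and the function names are assumptions chosen for the example; they do not appear in the text above.

```python
# Illustrative (7,4) Hamming code in systematic form (an assumption for this sketch).
G_EXAMPLE = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
H_EXAMPLE = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def encode(b, G):
    """Equation (1): c = b * G, with all arithmetic done modulo 2."""
    K, N = len(G), len(G[0])
    if len(b) != K:
        raise ValueError("information vector length must equal the number of rows of G")
    return [sum(b[k] * G[k][n] for k in range(K)) % 2 for n in range(N)]

def syndrome(H, c):
    """Left-hand side of equation (2): H * c modulo 2 (all zeros for a codeword)."""
    return [sum(h * x for h, x in zip(row, c)) % 2 for row in H]

codeword = encode([1, 0, 1, 1], G_EXAMPLE)
is_codeword = not any(syndrome(H_EXAMPLE, codeword))
```

A vector with a single flipped bit no longer satisfies equation (2): its syndrome equals the column of H at the flipped position.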
An LDPC code is a linear binary block code whose parity-check matrix or matrices H is/are sparse. As shown in FIG. 1, an LDPC parity-check matrix H is equivalent to a sparse bipartite “Tanner graph” G=(V,C,E) with a set V of N bit nodes (N=13 in FIG. 1), a set C of M check nodes (M=10 in FIG. 1) and a set E of edges (|E|=38 in FIG. 1) connecting bit nodes to check nodes. The bit nodes correspond to the codeword bits and the check nodes correspond to parity-check constraints on the bits. A bit node is connected by edges to the check nodes in which the bit node participates. In the matrix representation (matrix H of equation (2)) of the code on the left side of FIG. 1, an edge connecting bit node i with check node j is depicted by a non-zero matrix element at the intersection of row j and column i.
Next to the first and last check nodes of FIG. 1 are shown the equivalent rows of equation (2). The symbol “⊕” means “XOR”.
A node degree is the number of edges emanating from the node. A variable node (i.e., bit node) degree is equal to the number of 1's in the corresponding column of H (it is also called the column degree). A check node degree is equal to the number of 1's in the corresponding row of H (it is also called the row degree). We denote by dv the average variable node degree (or the average number of 1's in a column) and by dc the average check node degree (or the average number of 1's in a row).
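The correspondence between H and its Tanner graph can be sketched as follows. The matrix `H_SMALL` is a small illustrative assumption (it is not the matrix of FIG. 1), and `tanner_graph_stats` is a hypothetical helper name. Since every edge has exactly one bit-node end and one check-node end, the edge count satisfies |E| = dv·N = dc·M.

```python
# Small illustrative parity-check matrix (an assumption; not the matrix of FIG. 1).
H_SMALL = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def tanner_graph_stats(H):
    """Return the edge list and node degrees of the Tanner graph of H."""
    M, N = len(H), len(H[0])
    # One edge (bit node i, check node j) per non-zero entry at row j, column i.
    edges = [(i, j) for j in range(M) for i in range(N) if H[j][i]]
    column_degrees = [sum(H[j][i] for j in range(M)) for i in range(N)]
    row_degrees = [sum(row) for row in H]
    return {
        "edges": edges,
        "column_degrees": column_degrees,   # variable (bit) node degrees
        "row_degrees": row_degrees,         # check node degrees
        "dv": sum(column_degrees) / N,      # average variable node degree
        "dc": sum(row_degrees) / M,         # average check node degree
    }

stats = tanner_graph_stats(H_SMALL)
```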
LDPC codes can be decoded using iterative message passing decoding algorithms. These algorithms operate by exchanging messages between bit nodes and check nodes along the edges of the underlying bipartite graph that represents the code.
The decoder is provided with initial estimates of the codeword bits (based on the communication channel output or based on the read memory content). These initial estimates are refined and improved by imposing the parity-check constraints that the bits should satisfy as a valid codeword (according to equation (2)). This is done by exchanging information between the bit nodes representing the codeword bits and the check nodes representing parity-check constraints on the codeword bits, using the messages that are passed along the graph edges.
In iterative decoding algorithms, it is common to utilize “soft” bit estimations, which convey both the bit estimations and the reliabilities of the bit estimations.
The bit estimations conveyed by the messages passed along the graph edges can be expressed in various forms. A common measure for expressing a “soft” estimation of a bit v is as a Log-Likelihood Ratio (LLR)
$$\log\frac{\Pr(v=0 \mid \text{current constraints and observations})}{\Pr(v=1 \mid \text{current constraints and observations})},$$
where the “current constraints and observations” are the various parity-check constraints taken into account in computing the message at hand and the observations, such as the sequence of symbols received from a communication channel, corresponding to the bits participating in these parity checks. The sign of the LLR provides the bit estimation (i.e., positive LLR corresponds to v=0 and negative LLR corresponds to v=1). The magnitude of the LLR provides the reliability of the estimation (i.e., |LLR|=0 means that the estimation is completely unreliable and |LLR|=∞ means that the estimation is completely reliable and the bit value is known).
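The following is a minimal sketch of iterative message passing on LLRs, assuming the min-sum approximation with a flooding schedule. The text above does not name a specific decoding algorithm, so min-sum is used here only as one common example. All names (`llr_from_probability`, `hard_decision`, `min_sum_decode`, `H_DEMO`) are hypothetical, and the sketch assumes every check node has degree at least 2.

```python
import math

def llr_from_probability(p0):
    """LLR of a bit v given Pr(v = 0) = p0, with 0 < p0 < 1."""
    return math.log(p0 / (1.0 - p0))

def hard_decision(llr):
    """The sign of the LLR gives the bit estimate: non-negative -> 0, negative -> 1."""
    return 0 if llr >= 0 else 1

def min_sum_decode(H, channel_llr, max_iters=20):
    """Min-sum message passing; returns the hard decisions after the last iteration."""
    M, N = len(H), len(H[0])
    rows = [[i for i in range(N) if H[j][i]] for j in range(M)]
    # Check-to-bit messages, one per edge, initialised to "no information".
    c2b = {(j, i): 0.0 for j in range(M) for i in rows[j]}
    hard = [hard_decision(x) for x in channel_llr]
    for _ in range(max_iters):
        # Total belief per bit: channel LLR plus all incoming check messages.
        total = list(channel_llr)
        for (j, i), msg in c2b.items():
            total[i] += msg
        hard = [hard_decision(t) for t in total]
        # Stop as soon as all parity checks of equation (2) are satisfied.
        if all(sum(hard[i] for i in rows[j]) % 2 == 0 for j in range(M)):
            break
        for j in range(M):
            # Bit-to-check messages exclude what check j itself sent earlier.
            b2c = {i: total[i] - c2b[(j, i)] for i in rows[j]}
            for i in rows[j]:
                others = [b2c[k] for k in rows[j] if k != i]
                sign = -1.0 if sum(m < 0 for m in others) % 2 else 1.0
                c2b[(j, i)] = sign * min(abs(m) for m in others)
    return hard

# Illustrative parity-check matrix (an assumption for this sketch).
H_DEMO = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
# All-zero codeword observed with one weakly unreliable, wrong-signed bit.
decoded = min_sum_decode(H_DEMO, [-0.5, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0])
```

In this example the first bit starts with a negative LLR (estimate 1), and the two parity checks it participates in push its total LLR back to a positive value after one round of message exchange.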
The standard method for encoding linear block codes is based on the code's generator matrix Γ, which is composed of a set of basis vectors spanning the code's linear subspace. The code's generator matrix Γ is related to its parity-check matrix H through the following equation:
ΓHᵀ = 0  (3)
Hence, knowing one matrix determines the other. Note that for LDPC codes, even though the parity-check matrix H is sparse, the generator matrix Γ is not sparse (i.e. Γ has around 50% non-zero elements). As noted above, encoding a sequence b of information bits into a codeword c is done as shown in equation (1). If Γ is K×N and not sparse, the complexity of this encoding procedure is approximately (K/2)·N = O(N²) operations, which is quite high. Moreover, the storage complexity of the matrix Γ is likewise approximately (K/2)·N = O(N²).
Fortunately, for LDPC codes a much simpler encoding procedure can be used by taking advantage of the sparse nature of the code's parity-check matrix. Indeed, for LDPC codes encoding is performed based on equation (2) and not based on equation (1). Assume that the code is systematic, i.e. that the first K bits in the codeword c are equal to the information bit sequence b and that the last M bits are the redundant parity bits, denoted as p. Then, the encoding procedure reduces to finding the bit sequence p such that the following equations hold:
$$Hc = H\cdot\begin{bmatrix} b \\ p \end{bmatrix} = 0 \qquad (4)$$
This problem is easy to solve if we constrain the last M columns of the parity-check matrix H to form a lower triangular matrix, as shown in FIG. 2. Based on this matrix structure, the parity bits can be recovered one after another by applying a Gaussian elimination procedure to the set of equations described by equation (4). The procedure is based on the following observation: whenever we have a parity-check in which only a single bit is unknown, this bit can be recovered as the XOR of the rest of the bits in the parity-check. Hence, due to the lower triangular form of the parity-check matrix, if we pass over its parity-checks one by one from top to bottom, we can recover all of the parity bits. Each parity bit is computed by XORing an average of dc−1 already known bits. Hence, the encoding complexity is reduced to M·(dc−1)=N·(1−R)·(dc−1)=O(N), where R=K/N is the code rate. Note that dc is a small constant (independent of N) due to the fact that the parity-check matrix is sparse. Furthermore, since encoding is performed using the sparse parity-check matrix H, there is no need to store the dense generator matrix Γ.
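The top-to-bottom recovery of the parity bits can be sketched as follows, assuming a systematic code whose last M columns of H are lower triangular with 1's on the diagonal. The matrix `H_TRI` and the function name `encode_lower_triangular` are hypothetical, chosen only for illustration (the matrix is not the one of FIG. 2).

```python
def encode_lower_triangular(H, b):
    """Solve equation (4) for the parity bits p, one parity check at a time."""
    M, N = len(H), len(H[0])
    K = N - M
    if len(b) != K:
        raise ValueError("information vector must have N - M bits")
    c = list(b) + [0] * M                  # systematic codeword [b | p]
    for j in range(M):
        diag = K + j                       # column of the parity bit solved by row j
        # Non-zero positions of row j; a practical encoder would store these
        # index lists directly so that each row costs about dc - 1 XORs.
        ones = [i for i, h in enumerate(H[j]) if h]
        if not ones or ones[-1] != diag:
            raise ValueError("parity part of H is not lower triangular with unit diagonal")
        parity = 0
        for i in ones[:-1]:                # all other bits in this check are already known
            parity ^= c[i]
        c[diag] = parity
    return c

# Illustrative H = [H_info | H_parity] with K = 4 and M = 3 (an assumption).
H_TRI = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 1, 1, 0],
    [0, 1, 1, 0, 0, 1, 1],
]
codeword_tri = encode_lower_triangular(H_TRI, [1, 0, 1, 1])
```

Because row j only references information bits and parity bits with index below j, each parity check contains exactly one unknown at the time it is processed, which is the observation the procedure relies on.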
A basic property of LDPC codes (and error correction codes in general) is that the error correction capability for the same code rate improves as the code length N increases. Moreover, the error floor effect in LDPC codes diminishes as the code length increases. Unfortunately, the code's encoder/decoder complexity is proportional to the number of edges |E| in the bipartite graph representing the code, which in turn is proportional to the code length: |E|=dv·N.
It would be highly advantageous to have a code structure amenable to efficient, low-complexity and low-power encoding and decoding algorithms. Such a structure would allow for approaching the theoretical capacity limit of a memory such as a flash memory using a low complexity controller.
Definitions
The methodology described herein is applicable to encoding and decoding in at least two different circumstances. One circumstance is the storing of data in a storage medium, followed by the retrieving of the data from the storage medium. The other circumstance is the transmitting of data to a transmission medium, followed by the receiving of the data from the transmission medium. Therefore, the concepts of “storing” and “transmitting” of data are generalized herein to the concept of “exporting” data. Both “storing” data and “transmitting” data thus are special cases of “exporting” data.
The usual instantiation of a storage medium is as a memory such as a flash memory. The usual instantiation of a transmission medium is as a communication channel.
“Encoding” is understood herein to mean turning an information vector (“b” in equation (1)) into a codeword (“c” in equation (1)).