Error correcting codes are ubiquitous in communications and data storage systems. They compensate for the intrinsic unreliability of information transfer in these systems by introducing redundancy into the data stream. Recently, considerable interest has grown in a class of codes known as low-density parity-check (LDPC) codes. LDPC codes are provably good codes. On various channels, LDPC codes have been demonstrated to perform close to the channel capacity, the upper limit on the rate of reliable transmission established by Claude Shannon.
LDPC codes are often represented by bipartite graphs, called Tanner graphs, in which one set of nodes, the variable nodes, corresponds to bits of the codeword and the other set of nodes, the constraint nodes, sometimes called check nodes, corresponds to the set of parity-check constraints which define the code. Edges in the graph connect variable nodes to constraint nodes. A variable node and a constraint node are said to be neighbors if they are connected by an edge in the graph.
A bit sequence associated one-to-one with the variable nodes is a codeword of the code if and only if, for each constraint node, the bits neighboring the constraint (via their association with variable nodes) sum to zero modulo two, i.e., they comprise an even number of ones.
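By way of illustration only, the codeword condition stated above can be sketched in a few lines of Python. The graph used here is a hypothetical toy graph with three constraint nodes and six variable nodes; it is not the graph of FIG. 1, and the names `check_neighbors` and `is_codeword` are assumptions introduced for this sketch.

```python
def is_codeword(check_neighbors, bits):
    """Return True if, for every constraint node, the neighboring bits
    sum to zero modulo two (i.e., contain an even number of ones)."""
    return all(
        sum(bits[v] for v in neighbors) % 2 == 0
        for neighbors in check_neighbors
    )


# Hypothetical toy Tanner graph (NOT the graph of FIG. 1).
# Entry c lists the 0-based indices of the variable nodes
# that neighbor constraint node c.
check_neighbors = [
    [0, 1, 3],
    [1, 2, 4],
    [0, 2, 5],
]

candidate = [1, 0, 1, 1, 1, 0]      # every constraint sees two ones
corrupted = [1, 0, 0, 0, 0, 0]      # constraints 0 and 2 see one 1 each

print(is_codeword(check_neighbors, candidate))
print(is_codeword(check_neighbors, corrupted))
```

In this sketch each constraint is evaluated independently of the others, which mirrors the node-local character of the Tanner graph description.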
An exemplary bipartite graph 100 determining an exemplary (3,6) regular LDPC code of length ten and rate one-half is shown in FIG. 1. Length ten indicates that there are ten variable nodes V1-V10, each identified with one bit of the codeword X1-X10. The set of variable nodes V1-V10 is identified in FIG. 1 by reference numeral 102. Rate one half indicates that there are half as many check nodes as variable nodes, i.e., there are five check nodes C1-C5 identified by reference numeral 106. Rate one half further indicates that the five constraints are linearly independent. Exemplary bipartite graph 100 includes edges 104, wherein the exemplary (3,6) regular LDPC code has 3 edges connected to each variable node and 6 edges connected to each constraint node and at most one edge between any two nodes.
While FIG. 1 illustrates the graph associated with a code of length 10, it can be appreciated that representing the graph for a codeword of length 1000 would be 100 times more complicated.
An alternative to the Tanner graph representation of LDPC codes is the parity check matrix representation such as that shown in drawing 200 of FIG. 2. In this representation of a code, the matrix H 202, commonly referred to as the parity check matrix, includes the relevant edge connection, variable node and constraint node information. In the matrix H 202, each column corresponds to one of the variable nodes while each row corresponds to one of the constraint nodes. Since there are 10 variable nodes and 5 constraint nodes in the exemplary code, the matrix H 202 includes 10 columns and 5 rows. The entry of the matrix 202 corresponding to a particular variable node and a particular constraint node is set to 1 if an edge is present in the graph, i.e., if the two nodes are neighbors, otherwise it is set to 0. For example, since variable node V1 is connected to constraint node C1 by an edge, a one is located in the uppermost left-hand corner of the matrix 202. However, variable node V5 is not connected to constraint node C1 so a 0 is positioned in the fifth position of the first row of matrix 202 indicating that the corresponding variable and constraint nodes are not connected. We say that the constraints are linearly independent if the rows of H 202 are linearly independent vectors over GF[2], where GF[2] is the binary Galois Field.
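As a non-limiting sketch, the correspondence between graph edges and entries of the parity check matrix, and the test for linear independence of the constraints over GF[2], may be expressed as follows. The edge list is a hypothetical toy example rather than the edges 104 of FIG. 1, and the function names `parity_check_matrix` and `gf2_rank` are assumptions made for illustration.

```python
def parity_check_matrix(num_checks, num_vars, edges):
    """Build H with one row per constraint node and one column per
    variable node; H[c][v] is 1 exactly when an edge joins c and v."""
    H = [[0] * num_vars for _ in range(num_checks)]
    for c, v in edges:
        H[c][v] = 1
    return H


def gf2_rank(matrix):
    """Rank of a 0/1 matrix over GF[2], by Gaussian elimination
    using XOR as row addition."""
    rows = [row[:] for row in matrix]
    rank = 0
    num_cols = len(rows[0]) if rows else 0
    for col in range(num_cols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


# Hypothetical (constraint, variable) edges, 0-based; NOT taken from FIG. 1.
edges = [(0, 0), (0, 1), (0, 3),
         (1, 1), (1, 2), (1, 4),
         (2, 0), (2, 2), (2, 5)]

H = parity_check_matrix(3, 6, edges)
# The constraints are linearly independent when the rank equals the row count.
independent = gf2_rank(H) == len(H)
```

For the toy edge list the three rows are linearly independent, so the three constraints leave three of the six bits free to carry information.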
In the case of a matrix representation, the codeword X which is to be transmitted can be represented as a vector 204 which includes the bits X1-Xn of the codeword to be processed. A bit sequence X1-Xn is a codeword if and only if the product of the matrix H 202 and the vector 204, computed modulo two, is equal to zero, that is: HX=0.
Encoding LDPC codes refers to the procedure that produces a codeword from a set of information bits. By preprocessing the LDPC graph representation or the matrix representation, the set of variable nodes corresponding to information bits can be determined prior to actual encoding.
To build an encoder for a general LDPC code, the first step is to find a permutation of the rows and columns of H so that, up to reordering, we can divide the m×n matrix H into the following sub-matrices
$$H = \begin{bmatrix} T & A & B \\ E & C & D \end{bmatrix}$$

where $T$ is a $t \times t$ upper triangular sub-matrix, i.e., all entries below the main diagonal are zero, $E$ is a $g \times t$ sub-matrix, $A$ is $t \times g$, $C$ is $g \times g$, $B$ is $t \times (n-m)$, $D$ is $g \times (n-m)$ and $t+g=m$. Moreover, the $g \times g$ matrix $\phi := ET^{-1}A + C$ is invertible (we assume here that $H$ is full row rank).

Encoding then proceeds as follows. To encode codeword $x = [x_{p1}\ x_{p2}\ x_s]$ given information bits $x_s$, we first solve

$$[T\ A\ B]\,[y\ 0\ x_s]^T = 0$$

for $y$ using back-substitution. Next we solve

$$\phi\, x_{p2} = [E\ C\ D]\,[y\ 0\ x_s]^T$$

for $x_{p2}$. For this step the matrix $\phi^{-1}$ is pre-computed. Finally, one solves

$$[T\ A\ B]\,[x_{p1}\ x_{p2}\ x_s]^T = 0$$

for $x_{p1}$ using back-substitution. The vector $[x_{p1}\ x_{p2}\ x_s]^T$ constitutes the codeword.
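The three encoding steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive encoder: the sub-matrices are small hypothetical values (t=2, g=1, n−m=3) chosen only so the arithmetic can be followed by hand, T is assumed to have ones on its main diagonal so that back-substitution is well defined, and all function names are introduced here for illustration.

```python
def mat_vec(M, v):
    """Matrix-vector product over GF(2)."""
    return [sum(a & b for a, b in zip(row, v)) % 2 for row in M]


def vec_add(u, v):
    """Vector addition over GF(2), i.e., element-wise XOR."""
    return [a ^ b for a, b in zip(u, v)]


def back_substitute(T, b):
    """Solve T x = b over GF(2) for upper triangular T with unit diagonal."""
    n = len(T)
    x = [0] * n
    for i in range(n - 1, -1, -1):
        acc = b[i]
        for j in range(i + 1, n):
            acc ^= T[i][j] & x[j]
        x[i] = acc
    return x


def gf2_inverse(M):
    """Invert a square 0/1 matrix over GF(2) by Gauss-Jordan elimination."""
    n = len(M)
    aug = [M[i][:] + [int(i == j) for j in range(n)] for i in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            raise ValueError("matrix is singular over GF(2)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col and aug[r][col]:
                aug[r] = [a ^ b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def compute_phi(T, A, E, C):
    """phi = E T^{-1} A + C, built one column at a time."""
    g = len(C)
    cols = [mat_vec(E, back_substitute(T, [row[j] for row in A]))
            for j in range(g)]
    return [[cols[j][i] ^ C[i][j] for j in range(g)] for i in range(g)]


def encode(T, A, B, E, C, D, phi_inv, xs):
    """Return the codeword [xp1 xp2 xs] for information bits xs."""
    Bxs = mat_vec(B, xs)
    y = back_substitute(T, Bxs)                       # [T A B][y 0 xs]^T = 0
    rhs = vec_add(mat_vec(E, y), mat_vec(D, xs))      # [E C D][y 0 xs]^T
    xp2 = mat_vec(phi_inv, rhs)                       # phi xp2 = rhs
    xp1 = back_substitute(T, vec_add(mat_vec(A, xp2), Bxs))
    return xp1 + xp2 + xs


# Hypothetical sub-matrices, for illustration only.
T = [[1, 1], [0, 1]]
A = [[1], [0]]
B = [[1, 0, 1], [0, 1, 1]]
E = [[1, 1]]
C = [[0]]
D = [[1, 1, 0]]

phi_inv = gf2_inverse(compute_phi(T, A, E, C))        # pre-computed once
H = [T[i] + A[i] + B[i] for i in range(2)] + [E[0] + C[0] + D[0]]

codeword = encode(T, A, B, E, C, D, phi_inv, [1, 0, 1])
syndrome = mat_vec(H, codeword)                       # all zeros for a codeword
```

In this sketch the inverse of φ is computed once, outside `encode`, reflecting the pre-computation noted above; only the two back-substitutions and a small g×g multiplication are performed per codeword.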
While encoding efficiency and high data rates are important, for an encoding system to be practical for use in a wide range of devices, e.g., consumer devices, it is important that an encoder be capable of being implemented at reasonable cost. Accordingly, the ability to efficiently implement encoding schemes used for error correction and/or detection purposes, e.g., in terms of hardware costs, can be an important consideration.
In view of the above discussion it should be appreciated that there is a need for encoder apparatus and methods directed to efficient architecture structures for implementing LDPC codes. Apparatus and methods that allow the reuse of the same hardware to encode codewords of different lengths would be beneficial and desirable. This is because it would allow for greater flexibility during encoder use and allow different sets of data to be encoded using codewords of different sizes thereby allowing the codeword size to be selected for a particular encoding application, e.g., communications session or data storage application, without the need for multiple encoders to support such flexibility.