1. Field of the Invention
The present invention relates generally to a Low-Density Parity Check (LDPC) decoder, and in particular, to an efficient LDPC decoding apparatus and method based on a node memory structure.
2. Description of the Related Art
The LDPC code, first proposed by Gallager and later restudied by MacKay, has recently attracted attention as a code that can show superior performance approaching the Shannon capacity limit under belief propagation decoding based on the sum-product algorithm.
Thereafter, Richardson et al. proposed a density evolution technique that tracks the change in the probability distribution of the messages generated and updated during the decoding process in the factor graph constituting an LDPC code. In addition, assuming infinite iterations on a cycle-free factor graph, Richardson et al. proposed degree distributions of the variable and check nodes in the factor graph that maximize the channel parameter (for example, the threshold) for which the error probability converges to 0. Further, Richardson et al. have theoretically shown that this result can be applied even to a finite-length LDPC code having cycles. Richardson et al. have also shown that the theoretical channel capacity of an irregular LDPC code designed with the density evolution technique can approach the Shannon capacity limit to within 0.0045 dB.
Because the performance of the LDPC code improves as the packet size increases, the complexity of its coding/decoding process may also increase. Recently, Flarion has proposed a multi-edge type vector LDPC code that can be implemented with low hardware complexity even for a large packet size. The vector LDPC code, for example, the Block LDPC (BLDPC) code, is a code for which an efficient coder/decoder can be implemented through parallel implementation of a vectorized structure, even with a small-sized base H matrix, for example, a check matrix. Because this parallel implementation, enabled by the parallel factors of the BLDPC code, allows a higher-throughput decoder than other codes, for example, the turbo code and the convolutional code, the BLDPC code is being discussed as a possible alternative to the base codes of the next generation mobile communication system, in which the data rate increases sharply.
The LDPC code is a special case of the general linear block codes, and is defined by a parity check matrix.
FIG. 1 is a diagram illustrating a Tanner graph and a parity check matrix of a general LDPC code. In FIG. 1, there are shown a Tanner graph and a parity check matrix H for 2 information bits and 3 parity bits.
Referring to FIG. 1, in the Tanner graph, each variable node indicates one bit of a codeword, and each check node indicates one check equation. Therefore, the number N of the variable nodes indicates the length of the codeword, i.e. the column size of the H matrix, and the number R of the check nodes indicates the number of parity bits, i.e. the row size of the H matrix. Accordingly, FIG. 1 shows an example of a rate-⅖ LDPC code. An edge connecting two nodes in the Tanner graph is indicated by a ‘1’ in the H matrix.
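For illustration, the relationship between the H matrix dimensions and the Tanner graph can be sketched as follows. The particular matrix entries below are hypothetical and are not assumed to match FIG. 1; only the dimensions (R=3 check nodes, N=5 variable nodes) follow the description above.

```python
# Hypothetical 3 x 5 parity check matrix H with the dimensions described above:
# R = 3 check nodes (rows, i.e. parity bits) and N = 5 variable nodes
# (columns, i.e. codeword bits). Each '1' corresponds to one Tanner-graph edge.
H = [
    [1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1],
    [1, 0, 1, 1, 0],
]

R, N = len(H), len(H[0])
rate = (N - R) / N      # information bits / codeword length = 2/5

# Tanner-graph edges: (check node i, variable node j) pairs where H[i][j] == 1
edges = [(i, j) for i in range(R) for j in range(N) if H[i][j] == 1]
```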
FIG. 2 is a diagram illustrating an operation of a general message-passing decoder.
Referring to FIG. 2, a decoder for a general LDPC code can be described with a message-passing concept in a bipartite graph, and the decoder operates in an iterative calculation method through exchange of messages, i.e. extrinsic information, between variable nodes and check nodes as shown in FIG. 2.
As shown in FIG. 2, m_{j,i}^{VC} denotes a message delivered from the jth variable node to the ith check node, and is generated through an operation of a variable node processor. In addition, m_{i,j}^{CV} denotes a message delivered from the ith check node to the jth variable node, and is generated through an operation of a check node processor.
In the known sum-product algorithm, generation of each message can be expressed as set forth in Equation (1) and Equation (2):
m_{j,i}^{VC} = y_j + \sum_{k \in C(j),\, k \neq i} m_{k,j}^{CV}   (1)

p_{i,j}^{CV} = \prod_{k \in V(i),\, k \neq j} p_{k,i}^{VC}, \quad q_{i,j}^{CV} = F^{-1}\left( \sum_{k \in V(i),\, k \neq j} F\left( q_{k,i}^{VC} \right) \right)   (2)
Equation (1) indicates an expression for message generation of a variable node processor, and Equation (2) indicates an expression for message generation of a check node processor.
In Equation (1) and Equation (2), y_j denotes the Log-Likelihood Ratio (LLR) corresponding to the jth variable node, i.e. the jth received bit, C(j) denotes the index set of the check nodes connected to the jth variable node, and V(i) denotes the index set of the variable nodes connected to the ith check node.
As shown in Equation (2), p denotes the sign, for example, +1 or −1, of a message m, and q denotes the amplitude of a message m, where F(x) = −log(tanh(x/2)) for x ≥ 0, and F(x) = +log(tanh(−x/2)) for x < 0.
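As a concrete illustration, the two update rules of Equations (1) and (2) can be sketched in Python as follows. This is a minimal sketch, not the structure of any decoder described herein; the function names and the dictionary-based message storage are assumptions, messages are assumed nonzero, and every node is assumed to have degree of at least 2.

```python
import math

# F(x) = -log(tanh(x/2)), applied to message amplitudes; F is its own
# inverse for x > 0, so the same function also serves as F^{-1} below.
def F(x):
    return -math.log(math.tanh(x / 2.0))

# Variable node update, Equation (1): the outgoing V2C message toward check
# node i is the channel LLR y_j plus every incoming C2V message except the
# one that came from check node i itself.
def variable_node_messages(y_j, incoming_c2v):
    total = y_j + sum(incoming_c2v.values())
    return {i: total - m for i, m in incoming_c2v.items()}

# Check node update, Equation (2): the sign p of the outgoing C2V message is
# the product of the incoming signs, and its amplitude q is F^{-1} of the
# sum of the F-transformed incoming amplitudes, again excluding the target edge.
def check_node_messages(incoming_v2c):
    outgoing = {}
    for j in incoming_v2c:
        sign, amp_sum = 1.0, 0.0
        for k, m in incoming_v2c.items():
            if k == j:
                continue
            sign *= 1.0 if m >= 0 else -1.0
            amp_sum += F(abs(m))
        outgoing[j] = sign * F(amp_sum)
    return outgoing
```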
The general LDPC decoder operating with the message-passing algorithm decodes information bits through the iterative calculation process.
The block LDPC code can be found through expansion of the base H matrix, and a description thereof will be made below with reference to FIG. 3.
FIG. 3 is a diagram illustrating a graph and a parity check matrix of a general block LDPC code.
Referring to FIG. 3, there are shown a graph and a parity check matrix of an H matrix obtained by expanding the H matrix used in FIG. 1 by the number, Z=2, of parallel factors. For Z=2, the length of the codeword in FIG. 3 is doubled. In this case, the expanded H matrix can be expressed as a matrix in which each element of the existing base H matrix is replaced with a Z×Z matrix.
As shown in the left-hand side of FIG. 3, when the H matrix shown in FIG. 1 is simply expanded, an identity matrix (I) is located in a position of ‘1’. In this case, however, because there is no information exchange between layers in the graph, it is not possible to obtain a coding gain corresponding to an increase in the codeword length. Therefore, as shown in the right-hand side of FIG. 3, connection of the edges is arbitrarily made between layers through edge permutation. This is equivalent to using a permutation matrix instead of the identity matrix in the place corresponding to the position of ‘1’ in the base H matrix. As described above, in the block LDPC, a decoder can increase the throughput because it can perform parallel processing in units of Z, thereby contributing to a reduction in the control overhead.
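The replacement of each ‘1’ in the base H matrix with a permutation matrix can be sketched as follows, using circulant permutations (identity matrices with cyclically shifted columns), a common choice for block LDPC codes. The base matrix, the shift values, and the function name below are illustrative assumptions, not the actual permutations of FIG. 3.

```python
# Expand a base H matrix by parallel factor Z: each '1' in the base matrix
# becomes a Z x Z circulant permutation block (identity with columns shifted
# cyclically by the given amount), and each '0' becomes a Z x Z zero block.
def expand_base_matrix(base_H, shifts, Z):
    rows, cols = len(base_H), len(base_H[0])
    H = [[0] * (cols * Z) for _ in range(rows * Z)]
    for r in range(rows):
        for c in range(cols):
            if base_H[r][c] == 1:
                for z in range(Z):
                    # row z of the block has its '1' at column (z + shift) mod Z;
                    # shift 0 yields the identity, nonzero shifts permute edges
                    H[r * Z + z][c * Z + (z + shifts[r][c]) % Z] = 1
    return H

base_H = [[1, 1, 0],
          [0, 1, 1]]
shifts = [[0, 1, 0],
          [0, 0, 1]]     # illustrative shift values
H = expand_base_matrix(base_H, shifts, Z=2)
```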
Meanwhile, when exchanging node messages, the decoder generates V2C messages of edges, i.e. outgoing messages of variable nodes, and generates C2V messages, i.e. outgoing messages of check nodes.
A general message-passing decoder can operate regardless of an update order of edge messages. A scheme of determining the update order of edge messages is called a scheduling scheme, and this affects a convergence rate based on iteration of the decoder.
The generally used scheduling scheme is classified into a flooding scheduling scheme and a serial scheduling scheme. The flooding scheduling scheme updates all C2V messages after updating all V2C messages, and the serial scheduling scheme performs a node-based operation.
In addition, the serial scheduling scheme can be divided into a variable node-based serial scheduling scheme and a check node-based serial scheduling scheme. The check node-based serial scheduling scheme first updates the incoming messages (i.e. V2C messages) necessary for updating the outgoing messages (i.e. C2V messages) connected to one check node. After updating all of the outgoing messages connected to that check node, the check node-based serial scheduling scheme shifts to the next check node and then performs the same update, repeating this operation. The variable node-based serial scheduling scheme performs a similar operation with the roles of the variable and check nodes exchanged.
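The check node-based serial schedule described above can be sketched as follows; the neighbor lists and the two update callbacks are hypothetical placeholders standing in for the message calculations of Equations (1) and (2).

```python
# Check node-based serial scheduling sketch: visit each check node in turn,
# first refresh the incoming V2C messages it needs, then update all of its
# outgoing C2V messages, and only then move on to the next check node.
def serial_schedule(check_neighbors, update_v2c, update_c2v):
    order = []                          # record of the update order, for illustration
    for i, neighbors in enumerate(check_neighbors):
        for j in neighbors:             # refresh incoming V2C messages for check node i
            update_v2c(j, i)
            order.append(("V2C", j, i))
        for j in neighbors:             # then update the outgoing C2V messages of check node i
            update_c2v(i, j)
            order.append(("C2V", i, j))
    return order
```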
FIG. 4 is a diagram illustrating a structure of a conventional decoder.
Referring to FIG. 4, there is shown a conventional decoder implemented with one Vector Node Processor (VNP) and one Vector Edge Message Memory (VMM).
The VNP includes at least one node processor unit, and each node processor unit performs Z calculations. In addition, the VNP can perform either variable node or check node calculations according to a control signal. The decoder must store both the V2C and C2V messages in the single VMM. Therefore, only the flooding scheduling scheme can be applied to this structure.
For example, when the VNP first starts an operation as a variable node processor, the C2V messages have already been stored in the VMM. Therefore, in order to update the outgoing messages (i.e. V2C messages) of the first variable nodes, the VNP reads the C2V messages stored in the VMM, calculates the outgoing messages, and then stores the V2C messages at the memory addresses from which the read operation was performed.
After the calculation on all the variable nodes is performed, all messages in the VMM are updated as the V2C messages. Thereafter, the VNP performs an operation of a check node processor according to a control signal. That is, in order to store and process both of the V2C and C2V messages with one memory per edge, only the flooding scheduling technique is possible.
In addition, a switching device existing between the VNP and the VMM performs, for example, edge permutation between layers as described in FIG. 3.
In this case, because it performs calculations sequentially with one VNP, the conventional decoder has a small memory size and simple hardware, but low decoder throughput. A detailed description thereof will be made later with reference to FIGS. 6A and 6B.
FIG. 5 is a diagram illustrating another structure of a conventional decoder. In particular, FIG. 5 shows the structure of a decoder with 2 VNPs.
As illustrated in FIG. 5, each VNP serves as only one of a variable node processor and a check node processor, and the conventional decoder can improve decoding throughput because the two processors operate simultaneously. Similarly, the switching device existing between the VNPs and the VMMs performs, for example, the edge permutation between layers described in FIG. 3. However, the decoder having the structure shown in FIG. 5 needs separate memories for storing the C2V messages and the V2C messages. Therefore, the memory is doubled compared with that of the structure shown in FIG. 4, i.e. the hardware is doubled, causing an increase in the physical size.
Therefore, there is a need for a decoder capable of achieving higher throughput with lower complexity, and for a design and implementation method thereof.