Low-density parity-check (LDPC) codes are a family of linear codes characterized by a sparse parity-check matrix (PCM). They are usually decoded by means of a sum-product algorithm. Nowadays, LDPC codes belong to the most efficient known classes of error-correcting codes and find an ever-growing number of applications.
F. R. Kschischang et al. offer a unified presentation of the sum-product algorithm in the paper “Factor graphs and the sum-product algorithm,” published in 2001 in the IEEE Transactions on Information Theory, vol. 47, no. 2, pages 498-519. An LDPC code can be represented by a factor graph whose factor nodes and variable nodes correspond to the check equations and to the code variables, respectively. Decoding can be realized by iteratively passing sum-product messages along the edges of the factor graph.
The sum-product algorithm usually follows a two-phase scheduling. Within each iteration, first all messages from the variable nodes to the check (i.e. factor) nodes (the “variable-to-check” messages), and then all messages from the check nodes to the variable nodes (the “check-to-variable” messages) are computed and propagated.
Engling Yeo et al., in the paper “High throughput low-density parity-check decoder architectures,” published in 2001 in the Proceedings of the Global Telecommunications Conference, 2001 (GLOBECOM '01), vol. 5, pages 3019-3024, introduce an alternative scheduling called “staggered scheduling”. According to this approach, check nodes are gathered in several groups. The nodes belonging to the same group are processed simultaneously, whereas the different node groups are processed sequentially. As a consequence, the intermediate updates obtained from each group are available to the subsequent groups already within the same iteration. This scheduling not only leads to improved performance, but also requires less storage.
M. Mansour and N. R. Shanbhag, in the paper “High-throughput LDPC decoders,” published in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 11, no. 6, pages 976-996, 2003, present the turbo-decoding message-passing (TDMP) algorithm, which relies upon the same scheduling.
D. E. Hocevar introduces, in the paper “A reduced complexity decoder architecture via layered decoding of LDPC codes,” published in the Proceedings of IEEE Workshop on Signal Processing Systems, 2004 (SIPS 2004), pages 107-112, a “layered decoding”, which exploits similar ideas.
Layered decoding, TDMP and staggered scheduling perform best when the check equations that are processed in parallel are mutually independent. “Layered LDPC codes”, in the sense of the present disclosure, may be defined by a PCM with block structure whose non-zero sub-matrices are permutation matrices of uniform size p×p. In the following, we call each group of p consecutive rows a “block row”, and each group of p consecutive columns a “block column”.
For illustration, FIG. 1 shows the PCM of an exemplary layered LDPC code, where Π_{i,j} denotes the permutation sub-matrix at the intersection of the i-th block row and the j-th block column, and the empty entries correspond to zero sub-matrices of size p×p. Since permutation matrices have a constant row and column weight of one, each “block row” corresponds to mutually independent equations, i.e., equations involving disjoint sets of variables. Consequently, layered LDPC codes are well-suited for layered decoding with a maximum degree of parallelism of p check nodes.
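By way of illustration only, the block structure described above may be sketched as follows. The block layout, the value p = 3 and the use of cyclic shifts as the permutations are assumptions made for this example; they are not the layout of FIG. 1. The helper names are hypothetical.

```python
import numpy as np

def permutation_block(p, shift):
    """p x p cyclic-shift permutation matrix (one possible kind of permutation)."""
    return np.roll(np.eye(p, dtype=np.uint8), shift, axis=1)

def build_pcm(shifts, p):
    """Assemble a PCM from a table of shifts; None marks a zero p x p sub-matrix."""
    zero = np.zeros((p, p), dtype=np.uint8)
    return np.block([[zero if s is None else permutation_block(p, s) for s in row]
                     for row in shifts])

def block_row_is_independent(H, r, p):
    """True if no variable takes part in more than one check of block row r."""
    return bool(np.all(H[r * p:(r + 1) * p].sum(axis=0) <= 1))

# Hypothetical layout: 2 block rows, 4 block columns, p = 3.
SHIFTS = [[0, 1, None, 2],
          [2, None, 0, 1]]
H = build_pcm(SHIFTS, p=3)
```

Because every non-zero sub-matrix has column weight one, `block_row_is_independent` holds for each block row of such a matrix, which is the property that allows the p checks of a block row to be processed in parallel.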
Decoding of binary LDPC codes is usually implemented in log-likelihood ratio (LLR) arithmetic. The decoder is provided with channel LLRs, which express the probability that each bit is 0 or 1. For a memory-less channel and identically distributed bits, the LLRs can be defined as
$$\Lambda_i \triangleq \log\frac{\mathrm{Prob}\{b_i = 0 \mid y\}}{\mathrm{Prob}\{b_i = 1 \mid y\}} = \log\frac{\mathrm{Prob}\{y \mid b_i = 0\}}{\mathrm{Prob}\{y \mid b_i = 1\}}, \tag{1}$$
where log denotes the natural logarithm, y is the received signal and b_i is the i-th bit.
Extracting the sign of an LLR is equivalent to taking a hard decision, whereby positive and negative values correspond to 0 and 1, respectively. For more details on the LLR arithmetic we refer to the paper by J. Hagenauer et al., “Iterative decoding of binary block and convolutional codes,” published in IEEE Transactions on Information Theory, vol. 42, no. 2, pages 429-445, 1996.
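As a minimal sketch of Eq. (1), one may assume BPSK transmission over an additive white Gaussian noise channel with bit 0 mapped to +1, bit 1 mapped to −1 and noise variance sigma2. This channel model is an assumption of the example and is not prescribed by the text. Under it, Eq. (1) reduces to 2y/sigma2.

```python
import numpy as np

def channel_llr(y, sigma2):
    """Channel LLRs of Eq. (1) for the assumed BPSK/AWGN model (0 -> +1, 1 -> -1)."""
    return 2.0 * np.asarray(y, dtype=float) / sigma2

def hard_decision(llr):
    """Sign extraction: positive LLR -> bit 0, negative LLR -> bit 1.

    An LLR of exactly zero is mapped to 0 here, which is an arbitrary tie-break.
    """
    return (np.asarray(llr) < 0).astype(np.uint8)
```

For instance, `hard_decision(channel_llr([0.5, -1.0], sigma2=0.5))` yields the bits 0 and 1.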
The pseudo-code of the layered sum-product decoding algorithm is schematically illustrated in FIG. 2, where
    I is the maximum number of decoding iterations,
    v is the vector of the “a posteriori” LLRs of all the variables,
    c2v_i is the vector of all check-to-variable LLR messages at the end of the i-th iteration, and
    v2c_i is the vector of all variable-to-check LLR messages at the i-th iteration.
Further, we denote by v[r], c2v_i[r] and v2c_i[r] the slices of the respective sum-product message vectors involved in the processing of the r-th block row. With reference to FIG. 1, v[0] indicates the variable LLRs involved in the processing of the first block row, i.e., the LLRs of the block columns 0, 1, 2 and 4. In the same way, c2v_i[0] and v2c_i[0] denote the messages passed at the crossing of the block row 0 with the block columns 0, 1, 2 and 4. Extracting these slices generally involves some shuffling operation on the messages. Shuffling is therefore implicit in the adopted indexing convention.
Each row of the PCM represents a single parity-check (SPC) code. The SPCdec function used at line 6 of FIG. 2 processes in parallel the p SPC codes corresponding to the r-th block row. For each SPC code this function computes the message
$$\mathrm{c2v}_i(k) = \mathop{\boxplus}\limits_{\substack{l \in V \\ l \neq k}} \mathrm{v2c}_i(l) \tag{2}$$
that will be sent from the check node to each k-th variable node in the set V of involved variables. Here v2c_i(l) is the current message originating from the l-th variable node of the SPC code, and ⊞ denotes the associative and commutative LLR exclusive-or (XOR) operator defined by
$$x \boxplus y \triangleq \operatorname{sign}(x)\cdot\operatorname{sign}(y)\cdot\Bigl[\min(|x|,|y|) + f_{nl}(|x|+|y|) - f_{nl}\bigl(\bigl||x|-|y|\bigr|\bigr)\Bigr] \tag{3}$$
with
$$f_{nl}(x) \triangleq \log\bigl(1+e^{-x}\bigr), \tag{4}$$
which may be approximated well in terms of linear functions.
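Eqs. (2) to (4) may be sketched directly in floating-point arithmetic as follows. The function names are hypothetical, the nonlinear term is evaluated exactly rather than by a linear approximation, and the input is assumed to be a Python list with at least two messages.

```python
import math

def f_nl(x):
    """Eq. (4): log(1 + exp(-x)), evaluated for x >= 0."""
    return math.log1p(math.exp(-x))

def llr_xor(x, y):
    """Eq. (3): LLR-XOR of two LLRs (a zero input is treated as positive)."""
    sign = math.copysign(1.0, x) * math.copysign(1.0, y)
    ax, ay = abs(x), abs(y)
    return sign * (min(ax, ay) + f_nl(ax + ay) - f_nl(abs(ax - ay)))

def spc_messages(v2c):
    """Eq. (2): for each k, chain the LLR-XOR over all incoming messages except the k-th."""
    out = []
    for k in range(len(v2c)):
        others = v2c[:k] + v2c[k + 1:]
        acc = others[0]
        for m in others[1:]:
            acc = llr_xor(acc, m)
        out.append(acc)
    return out
```

This direct form recomputes the chain for every k and is meant only to make the definitions concrete. It is not an efficient SPC decoder.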
The algorithm of FIG. 2 encompasses a processing loop over the a posteriori LLRs of the code variables. The values computed at line 7 when processing the r-th block row are required at the (r+1)-th block row to execute line 5. Consequently, the decoding process cannot proceed until the result of the previous block row is available.
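The data dependency described above can be made concrete with the following sketch of a layered sum-product loop. FIG. 2 is not reproduced here, so the three steps marked “assumed line 5/6/7” are the conventional layered-decoding steps and are an assumption of this example. The sketch processes check rows one at a time, which for a layered code gives the same result as processing the p independent checks of a block row in parallel. All names are hypothetical.

```python
import math

def box_plus(x, y):
    """Exact LLR-XOR via the tanh rule, clipped so that atanh stays finite."""
    t = math.tanh(x / 2.0) * math.tanh(y / 2.0)
    t = max(min(t, 1.0 - 1e-12), -1.0 + 1e-12)
    return 2.0 * math.atanh(t)

def spc_dec(v2c):
    """For each input, the LLR-XOR of all the other inputs (needs >= 2 inputs)."""
    out = []
    for k in range(len(v2c)):
        others = v2c[:k] + v2c[k + 1:]
        acc = others[0]
        for m in others[1:]:
            acc = box_plus(acc, m)
        out.append(acc)
    return out

def layered_decode(checks, channel_llr, iterations):
    """checks: one list of variable indices per check row, in processing order."""
    v = list(channel_llr)                        # a posteriori LLRs
    c2v = [[0.0] * len(row) for row in checks]   # messages of the previous iteration
    for _ in range(iterations):
        for r, row in enumerate(checks):
            v2c = [v[j] - c2v[r][n] for n, j in enumerate(row)]  # assumed line 5
            c2v[r] = spc_dec(v2c)                                # assumed line 6
            for n, j in enumerate(row):                          # assumed line 7
                v[j] = v2c[n] + c2v[r][n]
    return v
```

The read of `v[j]` in the first step of row r+1 depends on the write to `v[j]` in the last step of row r whenever the two rows share a variable. In hardware this read-after-write dependency is the source of the wait cycles.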
In any practical VLSI implementation the loop at lines 4-8 has an inevitable latency dictated by technological constraints. The presence of wait cycles in the scheduling is equivalent to a reduction of the system clock frequency, and therefore leads to a lowering of the throughput and/or of the number of decoding iterations, which are both undesirable effects.
In U.S. Pat. No. 7,174,495 B2 entitled “LDPC decoder, corresponding method, system and computer program,” E. Boutillon et al. introduce a general architecture for the implementation of LDPC decoders. They explain that if the decoder does not honour the wait cycles, the update of the a posteriori variable LLRs suffers from inconsistencies, which they call “cut-edge conflicts”. In our notation, US '495 considers the difference, or “delta”,
$$\Delta_i = \mathrm{c2v}_i - \mathrm{c2v}_{i-1}. \tag{5}$$
After being properly shuffled, this delta is added to the a posteriori variable LLRs. However, US '495 requires an additional connection from the storage of the a posteriori variable LLRs to the adder circuit, and hence a second read port.
In an LDPC decoder the width of the RAM ports grows linearly with both the data throughput and the maximum number of supported iterations. For high-speed high-performance decoders the area and the power consumption of the RAM are dominated by the ports rather than the storage. A dual port RAM with one write and two read ports is almost twice as large as a two-port RAM with one write port and one read port. Therefore, the increase of area and power consumption makes the techniques of US '495 practically unattractive for the considered class of applications.
A similar architecture has been proposed by M. Rovini et al. in the paper “Layered Decoding of Non-Layered LDPC Codes,” published in the proceedings of the 9th EUROMICRO Conference on Digital System Design (DSD '06), 2006, pages 537-544. In this case the conflicts arise not because the necessary wait cycles are disregarded, but because the considered LDPC codes are “non-layered”. Check equations that are processed in parallel are not independent, which results in concurrent updates of the same sets of variables. The so-called “delta-based” architecture proposed by the authors relies upon the computation of the increments of the variable LLRs. A dual port memory with one write port and two read ports (rather than a two-port memory with one write port and one read port) is required. Further, all decoder memories must be triggered at twice the system frequency. The same considerations made above regarding the drawbacks of multi-port RAMs apply also to this architecture. In addition, the use of memories triggered on both clock edges makes this solution even less suited for high-speed applications.
More variants of the same idea have been considered to avoid the conflicts that arise when decoding non-layered LDPC codes. S. Müller et al., in the paper “A novel LDPC decoder for DVB-S2 IP,” published in the Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2009 (DATE '09), pages 1308-1313, describe a solution suited for LDPC codes whose parity check matrix contains sub-matrices that are superpositions of two circulant permutation matrices. The architecture contains, in addition to the main storage for the variable LLRs, an “a-posteriori register” whose complexity scales unfavourably when the conflicts arise from multiple wait cycles rather than from selected sub-matrices with weight-two rows and columns.
The solutions described above can be collectively called “delta architectures”. We condense the basic idea in the pseudo-code algorithm of FIG. 3. Essentially, the update at line 7 of FIG. 2 is split into the two steps at lines 7 and 8 of FIG. 3.
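The equivalence of the two forms of the update can be checked in one line. The figures are not reproduced here, so it is assumed, as in conventional layered decoding, that line 5 of FIG. 2 forms v2c_i[r] = v[r] − c2v_{i−1}[r] and that line 7 of FIG. 2 writes back v2c_i[r] + c2v_i[r]. Under that assumption, and with the delta of Eq. (5):

```latex
\begin{aligned}
v^{\text{new}}[r] &= \mathrm{v2c}_i[r] + \mathrm{c2v}_i[r] \\
                  &= \bigl(v[r] - \mathrm{c2v}_{i-1}[r]\bigr) + \mathrm{c2v}_i[r] \\
                  &= v[r] + \bigl(\mathrm{c2v}_i[r] - \mathrm{c2v}_{i-1}[r]\bigr)
                   = v[r] + \Delta_i[r].
\end{aligned}
```

The last form is an increment of the stored a posteriori LLR. The delta must be added to the value held in the storage at the time of the write, which is why the adder needs the additional read access mentioned above.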
We observe that a systolic implementation of the delta architecture according to FIG. 3 requires simultaneous access to three different check-to-variable messages to compute the variable-to-check messages at line 5 and the delta value at line 7. Prior-art solutions resort to a dual port RAM with one write and two read ports to store the check-to-variable messages. The two read ports are required to access c2v_{i−1}[r] and c2v_i[r], and an additional delay line is necessary to make the c2v_{i−1}[r] messages available both for the computations at line 5 and at line 7. Once again, dual port RAMs with one write and two read ports represent a critical drawback. Additionally, the delay line consumes a significant amount of area and power in a VLSI implementation.
A further important aspect, which has repercussions on the whole realization of an LDPC decoder, is the implementation of the LLR-XOR operator of Eq. (3). Several efficient approximations have been discussed in the prior art. Besides the single core operator, chains of LLR-XORs have also been considered. Here we focus on the simultaneous approximation of the whole chains of LLR-XORs in Eq. (2) for all k∈V, i.e., on the implementation of the SPCdec function.
J. Hagenauer et al., in the paper “Iterative decoding of binary block and convolutional codes,” published in IEEE Transactions on Information Theory, vol. 42, no. 2, pages 429-445, 1996, suggest approximating the LLR-XOR of J operands as
$$\mathop{\boxplus}\limits_{j=1,2,\ldots,J} \mathrm{LLR}_j \cong \prod_{j=1,2,\ldots,J} \operatorname{sign}(\mathrm{LLR}_j)\cdot\min_{j=1,2,\ldots,J}\bigl|\mathrm{LLR}_j\bigr|. \tag{6}$$
The modified version of the sum-product algorithm obtained by using this approximation is called the min-sum (MS) algorithm.
M. P. C. Fossorier et al., in the paper “Reduced complexity iterative decoding of low-density parity check codes based on belief propagation,” published in IEEE Transactions on Communications, vol. 47, no. 5, pages 673-680, May 1999, observe that with the MS algorithm, all the check-to-variable messages in Eq. (2) for k∈V can be calculated by identifying the two minimum values of the magnitudes of the variable-to-check messages. In equations, given
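$$l_0 \triangleq \arg\min_{l \in V}\bigl|\mathrm{v2c}_i(l)\bigr|, \qquad m_0 \triangleq \bigl|\mathrm{v2c}_i(l_0)\bigr| \tag{7}$$
$$m_1 \triangleq \min_{\substack{l \in V \\ l \neq l_0}}\bigl|\mathrm{v2c}_i(l)\bigr|, \tag{8}$$
the k-th check-to-variable message in Eq. (2) can be approximated as
$$\mathrm{c2v}_i(k) \cong \begin{cases} \displaystyle\prod_{\substack{l \in V \\ l \neq k}} \operatorname{sign}\bigl(\mathrm{v2c}_i(l)\bigr)\cdot m_0 & (k \neq l_0) \\[2ex] \displaystyle\prod_{\substack{l \in V \\ l \neq k}} \operatorname{sign}\bigl(\mathrm{v2c}_i(l)\bigr)\cdot m_1 & (k = l_0). \end{cases} \tag{9}$$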
The MS algorithm results in a simple implementation, but is known to suffer from a significant performance penalty with respect to the exact sum-product algorithm.
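The two-minimum formulation of Eqs. (7) to (9) may be sketched as follows for a single check node. The function name is hypothetical, and a zero message is treated as positive.

```python
def min_sum_messages(v2c):
    """Min-sum check-to-variable messages of Eq. (9) for one check node."""
    mags = [abs(x) for x in v2c]
    l0 = min(range(len(v2c)), key=mags.__getitem__)      # Eq. (7): index of the minimum
    m0 = mags[l0]                                        # Eq. (7): smallest magnitude
    m1 = min(m for l, m in enumerate(mags) if l != l0)   # Eq. (8): second smallest

    # Product of all signs; the sign excluding k follows by removing the k-th sign.
    total_sign = 1.0
    for x in v2c:
        if x < 0:
            total_sign = -total_sign

    out = []
    for k, x in enumerate(v2c):
        sign_k = -total_sign if x < 0 else total_sign
        out.append(sign_k * (m1 if k == l0 else m0))
    return out
```

For the inputs `[1.5, -0.5, 2.0, -3.0]` the minimum sits at index 1, so that position receives the second minimum 1.5 and all the others receive 0.5, giving `[0.5, -1.5, 0.5, -0.5]`.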
F. Guilloud et al., in the paper “λ-min decoding algorithm of regular and irregular LDPC codes,” published in the Proceedings of the 3rd International Symposium on Turbo Codes & Related Topics, 2003, pages 451-454, introduce a more accurate approximation, which uses only the λ lowest magnitudes of the variable-to-check messages and the corresponding indices l_0, l_1, …, l_{λ−1} determined as
$$l_k \triangleq \arg\min_{\substack{l \in V \\ l \notin \{l_0, l_1, \ldots, l_{k-1}\}}}\bigl|\mathrm{v2c}_i(l)\bigr| \qquad (\text{for } k = 0, 1, \ldots, \lambda-1). \tag{10}$$
According to the λ-min algorithm, the check-to-variable messages in Eq. (2) are approximated as
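$$\mathrm{c2v}_i(k) \cong \Biggl[\prod_{\substack{l \in V \\ l \neq k}} \operatorname{sign}\bigl(\mathrm{v2c}_i(l)\bigr)\Biggr]\cdot\Biggl[\mathop{\boxplus}\limits_{\substack{l = l_0, l_1, \ldots, l_{\lambda-1} \\ l \neq k}} \bigl|\mathrm{v2c}_i(l)\bigr|\Biggr], \tag{11}$$
where the second factor can take only λ+1 distinct values.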
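A minimal sketch of Eqs. (10) and (11) for a single check node is given below. The function names are hypothetical, the LLR-XOR of Eq. (3) is evaluated exactly on the magnitudes, and 2 ≤ lam ≤ len(v2c) is assumed so that the chain is never empty.

```python
import math

def box_plus_mag(a, b):
    """Eq. (3) restricted to non-negative magnitudes a and b."""
    return (min(a, b)
            + math.log1p(math.exp(-(a + b)))
            - math.log1p(math.exp(-abs(a - b))))

def lambda_min_messages(v2c, lam):
    """Lambda-min check-to-variable messages of Eq. (11) for one check node."""
    mags = [abs(x) for x in v2c]
    signs = [-1.0 if x < 0 else 1.0 for x in v2c]
    # Eq. (10): indices of the lam smallest magnitudes.
    lowest = sorted(range(len(v2c)), key=mags.__getitem__)[:lam]
    total_sign = math.prod(signs)
    out = []
    for k in range(len(v2c)):
        # Chain the LLR-XOR over the selected magnitudes, skipping k itself.
        terms = [mags[l] for l in lowest if l != k]
        acc = terms[0]
        for t in terms[1:]:
            acc = box_plus_mag(acc, t)
        # Multiplying by signs[k] removes the k-th sign from the full product.
        out.append(total_sign * signs[k] * acc)
    return out
```

With lam equal to the number of inputs the sketch coincides with the exact computation of Eq. (2). With lam = 2 the magnitude takes at most three distinct values, in line with the λ+1 values noted above.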
Both the MS and the λ-min algorithm can be expressed in the form
$$\mathrm{c2v}_i(k) \cong \Biggl[\prod_{\substack{l \in V \\ l \neq k}} \operatorname{sign}\bigl(\mathrm{v2c}_i(l)\bigr)\Biggr]\cdot m(k), \tag{12}$$
where the magnitude m(k)≥0 takes values in a reduced set M with respect to the exact sum-product implementation. The selection of m(k)∈M depends on a certain set L of indices. For the MS algorithm
$$L \triangleq \{l_0\}, \tag{13}$$
whereas for the λ-min algorithm
$$L \triangleq \{l_0, l_1, \ldots, l_{\lambda-1}\}. \tag{14}$$
It has been suggested to improve either algorithm by introducing a corrective multiplicative factor α>0 or an additive offset β>0 or both, as shown in the following generalized expression
$$\mathrm{c2v}_i(k) \cong \Biggl[\prod_{\substack{l \in V \\ l \neq k}} \operatorname{sign}\bigl(\mathrm{v2c}_i(l)\bigr)\Biggr]\cdot\alpha\cdot\max\bigl(m(k) - \beta,\, 0\bigr), \tag{15}$$
where the maximum operation guarantees that the offset does not invert the sign of the message. We refer to any algorithm of the type of Eq. (15) as a generalized reduced magnitude-choice (GRMC) algorithm. We notice that, by extension, the exact sum-product algorithm can be regarded as a GRMC algorithm with a number of different message magnitudes equal to the cardinality of V and with α=1 and β=0.
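The correction of Eq. (15) acts on the magnitude only, so it can be sketched as a post-processing step applied to uncorrected messages of the form of Eq. (12). The function name and the numerical values of α and β used in the usage note are hypothetical and carry no claim about good parameter choices.

```python
def grmc_correct(messages, alpha=1.0, beta=0.0):
    """Apply the scaling and offset of Eq. (15) to messages of the form of Eq. (12)."""
    out = []
    for c in messages:
        # max(., 0) keeps the offset from flipping the sign of the message.
        mag = alpha * max(abs(c) - beta, 0.0)
        out.append(-mag if c < 0 else mag)
    return out
```

For example, `grmc_correct([0.5, -1.5, 0.5, -0.5], alpha=0.75)` scales every magnitude by 0.75, while `grmc_correct([0.5, -1.5, 0.5, -0.5], beta=1.0)` clamps the three small messages to zero and leaves −0.5 for the second one.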
In view of the problems of the prior art described above, what is required is an improved decoding system and method that overcomes the loop-latency problems and achieves high-performance LDPC decoding at very high data rates.