In recent years, research in, for example, communication fields such as mobile communication and deep space communication, and broadcasting fields such as terrestrial-wave or satellite digital broadcasts has progressed remarkably. Along with this situation, research on coding theories for making error correction coding and decoding efficient has been actively carried out.
As a theoretical limit of code performance, the Shannon limit implied by Shannon's (C. E. Shannon) channel coding theorem is known. Research on coding theories has been carried out for the purpose of developing codes exhibiting performance near this Shannon limit. In recent years, as coding methods exhibiting performance near the Shannon limit, techniques commonly called “turbo coding”, such as parallel concatenated convolutional codes (PCCC) and serially concatenated convolutional codes (SCCC), have been developed. Furthermore, while turbo coding has been developed, low density parity check codes (hereinafter referred to as “LDPC codes”), a coding method that has been known for a long time, have attracted attention.
LDPC codes were proposed first in R. G. Gallager, “Low Density Parity Check Codes”, Cambridge, Mass.: M. I. T. Press, 1963. Thereafter, LDPC codes reattracted attention in D. J. C. MacKay, “Good error correcting codes based on very sparse matrices”, submitted to IEEE Trans. Inf. Theory, IT-45, pp. 399-431, 1999, and M. G. Luby, M. Mitzenmacher, M. A. Shokrollahi and D. A. Spielman, “Analysis of low density codes and improved designs using irregular graphs”, in Proceedings of ACM Symposium on Theory of Computing, pp. 249-258, 1998.
It is beginning to be known from this recent research that, for LDPC codes, as the code length increases, performance close to the Shannon limit can be obtained, similarly to turbo coding. Furthermore, since LDPC codes have the property that the minimum distance is proportional to the code length, they have the advantages that the block error probability characteristics are good, and a so-called error floor phenomenon, which is observed in decoding characteristics of turbo coding, hardly occurs.
Such LDPC codes will now be described in detail below. The LDPC codes are linear codes and do not always need to be binary, but here, a description is given assuming that the LDPC codes are binary.
The greatest feature of the LDPC codes is that the parity check matrix that defines them is sparse. Here, a sparse matrix is a matrix in which the number of elements equal to 1 is very small. If the sparse check matrix is denoted as H, examples thereof include a check matrix in which, as shown in FIG. 1, the Hamming weight of each column (number of 1s; weight) is “3”, and the Hamming weight of each row is “6”.
As described above, the LDPC codes defined by the check matrix H in which the Hamming weight of each row and each column is fixed are called “regular LDPC codes”. On the other hand, the LDPC codes defined by a check matrix H in which the Hamming weight of each row and each column is not fixed are called “irregular LDPC codes”.
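The distinction between regular and irregular codes can be checked mechanically from the row and column weights. The following is a minimal sketch; the matrices `H_REGULAR` and `H_IRREGULAR` are small hypothetical examples chosen for illustration (they are not the matrix of FIG. 1 and are too small to be truly sparse), and the function names are assumptions.

```python
def weights(H):
    """Return (row weights, column weights) of a binary check matrix."""
    row_w = [sum(row) for row in H]
    col_w = [sum(col) for col in zip(*H)]
    return row_w, col_w


def is_regular(H):
    """True when every row has one common weight and every column has one common weight."""
    row_w, col_w = weights(H)
    return len(set(row_w)) == 1 and len(set(col_w)) == 1


# Hypothetical 4x6 matrix: every column has weight 2, every row has weight 3.
H_REGULAR = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
]

# Same matrix with one extra 1, so the weights are no longer fixed.
H_IRREGULAR = [row[:] for row in H_REGULAR]
H_IRREGULAR[0][3] = 1

print(weights(H_REGULAR), is_regular(H_REGULAR), is_regular(H_IRREGULAR))
```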
Coding by such LDPC codes is realized by generating a generator matrix G on the basis of the check matrix H and by generating a codeword by multiplying this generator matrix G by a binary information message. More specifically, a coding apparatus for performing coding by LDPC codes computes a generator matrix G for which the equation GH^T=0 holds with the transpose matrix H^T of the check matrix H. Here, when the generator matrix G is a k×n matrix, the coding apparatus multiplies the generator matrix G by a k-bit information message (vector u), and generates an n-bit codeword c (=uG). The codeword generated by this coding apparatus is transmitted with each code bit whose value is “0” being mapped to “+1” and each code bit whose value is “1” being mapped to “−1”, and is received at the reception side via a predetermined communication channel.
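The encoding steps described above can be sketched as follows. This is an illustration only: the 3×6 matrices `G_SMALL` and `H_SMALL` are a hypothetical systematic pair (G = [I | P], H = [P^T | I]) chosen so that GH^T = 0 over GF(2); they are not taken from any figure, and the helper names are assumptions.

```python
def mat_mul_gf2(A, B):
    """Matrix product with all arithmetic taken modulo 2."""
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in zip(*B)]
            for row in A]


def transpose(M):
    return [list(col) for col in zip(*M)]


def encode(u, G):
    """Codeword c = uG for a k-bit message u and a k x n generation matrix G."""
    return mat_mul_gf2([u], G)[0]


def bpsk(c):
    """Map code bit 0 to +1 and code bit 1 to -1 for transmission."""
    return [1 - 2 * bit for bit in c]


# Hypothetical example with k = 3, n = 6.
G_SMALL = [
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 1],
]
H_SMALL = [
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1],
]

codeword = encode([1, 0, 1], G_SMALL)
print(codeword, bpsk(codeword))
```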
On the other hand, decoding of the LDPC codes can be performed by a message passing algorithm by belief propagation on a so-called Tanner graph, which is formed of a variable node (also called a message node) and a check node; this message passing algorithm was proposed by Gallager and is known as “probabilistic decoding”. Hereafter, the variable nodes and the check nodes are also referred to simply as nodes where appropriate.
However, in probabilistic decoding, since messages exchanged between nodes are real-number values, in order to find an analytical solution, it is necessary to trace the probability distribution of the message that takes a continuous value. This necessitates analysis involving a large degree of difficulty. Accordingly, Gallager has proposed an algorithm A or an algorithm B as an algorithm for decoding LDPC codes.
In general, decoding of the LDPC codes is performed in accordance with the procedure shown in FIG. 2. Here, the received value is denoted as U0 (u0i), the message output from the check node is denoted as uj, and the message output from the variable node is denoted as vi. Here, the message is a real-number value such that the “0”-likeness of the value is represented by a so-called log likelihood ratio.
In the decoding of the LDPC codes, initially, as shown in FIG. 2, in step S11, the received value U0 (u0i) is received, the message uj is initialized to 0, and a variable k that takes an integer as a counter for an iterative process is initialized to 0. The process then proceeds to step S12. In step S12, based on the received value U0 (u0i), a message vi is determined by performing a computation shown in equation (1). Furthermore, based on this message vi, a message uj is determined by performing a computation shown in equation (2).
$$v_i = u_{0i} + \sum_{j=1}^{d_v-1} u_j \tag{1}$$
$$\tanh\left(\frac{u_j}{2}\right) = \prod_{i=1}^{d_c-1} \tanh\left(\frac{v_i}{2}\right) \tag{2}$$
Here, dv and dc in equations (1) and (2) are parameters that indicate the number of 1s in the vertical direction (column direction) and in the horizontal direction (row direction) of the check matrix H, respectively, and that can be selected as desired. For example, in the case of a (3, 6) code, dv=3 and dc=6.
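Because every 1 in the check matrix is counted once by its column and once by its row, the two weights fix the shape of a regular code. The sketch below works this out for a length-12 (3, 6) code; the function name is hypothetical, and the rate it returns is only the design rate, which equals the true coding rate under the assumption that the rows of H are linearly independent.

```python
from fractions import Fraction


def regular_code_shape(n, dv, dc):
    """Return (number of checks m, number of 1s in H, design rate) for a (dv, dc) code of length n."""
    ones = n * dv                      # each of the n columns holds dv ones
    if ones % dc != 0:
        raise ValueError("n * dv must be divisible by dc")
    m = ones // dc                     # each of the m rows holds dc ones
    design_rate = Fraction(n - m, n)   # assumes H has full row rank
    return m, ones, design_rate


print(regular_code_shape(12, 3, 6))
```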
In the computation of each of equations (1) and (2), since the message input from an edge from which a message is to be output is not used as a parameter for a sum or product computation, the range of the sum or product computation is from 1 to dv−1 or 1 to dc−1. In practice, the computation shown in equation (2) is performed by creating in advance a table of a function R(v1, v2), shown in equation (3), that is defined by one output with respect to two inputs v1 and v2 and by using this table continuously (recursively), as shown in equation (4).
$$x = 2\tanh^{-1}\{\tanh(v_1/2)\tanh(v_2/2)\} = R(v_1, v_2) \tag{3}$$
$$u_j = R(v_1, R(v_2, R(v_3, \cdots R(v_{d_c-2}, v_{d_c-1})))) \tag{4}$$
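The recursive use of R in equation (4) can be checked numerically against the product form of equation (2). The following sketch evaluates R directly in floating point rather than from a precomputed table, which is a simplification; the function names and test values are assumptions.

```python
import math


def R(v1, v2):
    """Two-input check node function of equation (3)."""
    return 2.0 * math.atanh(math.tanh(v1 / 2.0) * math.tanh(v2 / 2.0))


def check_node_recursive(vs):
    """Equation (4): fold R over the dc-1 input messages, innermost pair first."""
    acc = vs[-1]
    for v in reversed(vs[:-1]):
        acc = R(v, acc)
    return acc


def check_node_direct(vs):
    """Equation (2) solved for u_j: 2 * atanh of the product of tanh(v_i / 2)."""
    prod = 1.0
    for v in vs:
        prod *= math.tanh(v / 2.0)
    return 2.0 * math.atanh(prod)


# Five input messages, as for an output edge of a check node with dc = 6.
inputs = [1.2, -0.7, 2.5, 0.4, -1.9]
print(check_node_recursive(inputs), check_node_direct(inputs))
```

Both forms give the same value up to rounding, and the output magnitude never exceeds the smallest input magnitude.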
In step S12, furthermore, the variable k is incremented by 1, and the process then proceeds to step S13. In step S13, it is determined whether or not the variable k is greater than or equal to a predetermined number N of iterative decodings. When it is determined in step S13 that the variable k is not greater than or equal to N, the process returns to step S12, and the identical processing is performed again.
When it is determined in step S13 that the variable k is greater than or equal to N, the process proceeds to step S14, where the message v serving as the decoded result, which is finally output as a result of performing the computation shown in equation (5), is determined and output. This completes the decoding process of the LDPC codes.
$$v_i = u_{0i} + \sum_{j=1}^{d_v} u_j \tag{5}$$
Here, unlike the computation of equation (1), the computation of equation (5) is performed using the input messages from all the edges connected to the variable nodes.
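The procedure of steps S11 to S14 can be sketched as a short floating-point routine. This is a minimal illustration under stated assumptions: the function name, the small check matrix `H_SMALL`, and the input values `LLR_IN` are hypothetical; a positive message value is taken to mean that “0” is more likely; and the product in equation (2) is clamped slightly inside (−1, 1) purely to keep `atanh` finite.

```python
import math


def decode_sum_product(H, u0, n_iter):
    """Run n_iter iterations of equations (1) and (2), then apply equation (5)."""
    edges = [(j, i) for j, row in enumerate(H) for i, h in enumerate(row) if h]
    u = {e: 0.0 for e in edges}        # step S11: check node messages start at 0
    v = {}
    for _ in range(n_iter):            # steps S12 and S13
        for (j, i) in edges:           # equation (1): exclude the edge being output
            v[(j, i)] = u0[i] + sum(u[(jj, ii)] for (jj, ii) in edges
                                    if ii == i and jj != j)
        for (j, i) in edges:           # equation (2): exclude the edge being output
            prod = 1.0
            for (jj, ii) in edges:
                if jj == j and ii != i:
                    prod *= math.tanh(v[(jj, ii)] / 2.0)
            prod = max(min(prod, 1.0 - 1e-12), -(1.0 - 1e-12))
            u[(j, i)] = 2.0 * math.atanh(prod)
    # Step S14, equation (5): use the messages from all edges of each variable node.
    total = [u0[i] + sum(u[(jj, ii)] for (jj, ii) in edges if ii == i)
             for i in range(len(u0))]
    bits = [0 if t >= 0 else 1 for t in total]
    return total, bits


# Hypothetical 3x6 check matrix; [1, 0, 1, 0, 1, 1] satisfies all three checks.
H_SMALL = [
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1],
]
# Received values for that codeword, with the second value weakly wrong in sign.
LLR_IN = [-2.0, -0.5, -2.0, 2.0, -2.0, -2.0]

print(decode_sum_product(H_SMALL, LLR_IN, 1)[1])
```

In this example the weakly wrong second value is outvoted by the two checks it takes part in, so the hard decision recovers the codeword.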
In such LDPC code decoding, for example, in the case of a (3, 6) code, messages are exchanged between nodes as shown in FIG. 3. In the node (variable node) indicated by “=” in FIG. 3, the computation shown in equation (1) is performed. In the node indicated by “+” (check node), the computation shown in equation (2) is performed. In particular, in the algorithm A, the messages are binary; in the node indicated by “+”, an exclusive OR computation of the dc−1 input messages is performed; and in the node indicated by “=”, when all the dv−1 input messages have bit values different from the received value R, the sign of R is inverted and output.
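The two node operations of the algorithm A can be sketched with one-bit messages. This reflects one reading of the description above, namely that the variable node outputs the received bit unless every one of the dv−1 input messages disagrees with it; the function names are hypothetical.

```python
from functools import reduce
from operator import xor


def check_node_a(inputs):
    """Node "+": exclusive OR of the dc-1 one-bit input messages."""
    return reduce(xor, inputs, 0)


def variable_node_a(received, inputs):
    """Node "=": flip the received bit only when all dv-1 inputs differ from it."""
    if inputs and all(bit != received for bit in inputs):
        return received ^ 1
    return received


print(check_node_a([1, 0, 1, 1, 0]))   # five inputs, as for dc = 6
print(variable_node_a(0, [1, 1]))      # two inputs, as for dv = 3
```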
Furthermore, in recent years, research on an implementation method of the decoding of LDPC codes has been carried out. Before describing the implementation method, the decoding of LDPC codes is described in a schematic form.
FIG. 4 shows an example of a parity check matrix of (3,6) LDPC codes (a coding rate of ½, a code length of 12). The parity check matrix of LDPC codes can be represented by a Tanner graph, as shown in FIG. 5. In FIG. 5, nodes indicated by “+” are check nodes, and nodes indicated by “=” are variable nodes. The check nodes and the variable nodes correspond to the rows and the columns of the parity check matrix, respectively. The connecting line between a check node and a variable node is an edge and corresponds to a “1” of the check matrix. That is, when the element of the j-th row and the i-th column of the check matrix is 1, in FIG. 5, the i-th variable node (node of “=”) from the top and the j-th check node (node of “+”) from the top are connected to each other by an edge. The edge indicates that the code bit corresponding to the variable node has a constraint condition corresponding to the check node. FIG. 5 shows a Tanner graph of the check matrix of FIG. 4.
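The correspondence between the check matrix and the Tanner graph can be sketched as below. The matrix `H_TANNER` is a small hypothetical example, not the matrix of FIG. 4, and the function name is an assumption; rows and columns are numbered from 0.

```python
def tanner_graph(H):
    """Return the edge list and the neighbour lists of check and variable nodes."""
    # One edge (j, i) for every 1 in row j, column i of the check matrix.
    edges = [(j, i) for j, row in enumerate(H) for i, h in enumerate(row) if h == 1]
    check_nbrs = {j: [i for (jj, i) in edges if jj == j] for j in range(len(H))}
    var_nbrs = {i: [j for (j, ii) in edges if ii == i] for i in range(len(H[0]))}
    return edges, check_nbrs, var_nbrs


H_TANNER = [
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1],
]

edges, check_nbrs, var_nbrs = tanner_graph(H_TANNER)
print(len(edges), check_nbrs[0], var_nbrs[0])
```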
In the sum product algorithm, which is a method of decoding LDPC codes, the computation of the variable node and the computation of the check node are repeatedly performed.
In the variable node, as shown in FIG. 6, the computation of equation (1) is performed. That is, in FIG. 6, the message vi corresponding to the edge to be calculated is calculated by using the messages u1 and u2 from the remaining edges connected to the variable node, and the received information u0i. The messages corresponding to the other edges are also calculated similarly.
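The variable node computation for all edges of one node can be sketched as follows. Computing the full sum once and subtracting each edge's own input is an implementation choice made here for brevity, not something the text prescribes; the function name is hypothetical.

```python
def variable_node(u0i, incoming):
    """Equation (1) for every edge: received value plus all other incoming messages."""
    total = u0i + sum(incoming)
    # Removing an edge's own input leaves the sum over the remaining dv-1 edges.
    return [total - u for u in incoming]


# A degree-3 variable node: each output uses u0i and the two other inputs.
outputs = variable_node(0.5, [1.0, -2.0, 0.25])
print(outputs)
```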
Before describing the check node computation, equation (2) is rewritten as shown in equation (6) by using the equation a×b=exp{ln(|a|)+ln(|b|)}× sign (a)×sign (b), where sign (x) is 1 when x≧0 and is −1 when x<0.
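$$\begin{aligned}
u_j &= 2\tanh^{-1}\left(\prod_{i=1}^{d_c-1}\tanh\left(\frac{v_i}{2}\right)\right) \\
&= 2\tanh^{-1}\left[\exp\left\{\sum_{i=1}^{d_c-1}\ln\left(\left|\tanh\left(\frac{v_i}{2}\right)\right|\right)\right\}\times\prod_{i=1}^{d_c-1}\operatorname{sign}\left(\tanh\left(\frac{v_i}{2}\right)\right)\right] \\
&= 2\tanh^{-1}\left[\exp\left\{-\left(\sum_{i=1}^{d_c-1}-\ln\left(\tanh\left(\frac{|v_i|}{2}\right)\right)\right)\right\}\right]\times\prod_{i=1}^{d_c-1}\operatorname{sign}(v_i)
\end{aligned} \tag{6}$$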
Furthermore, when the definition φ(x)=−ln(tanh(x/2)) is made for x≧0, since φ^−1(x)=2 tanh^−1(e^−x), equation (6) can be written as equation (7).
$$u_j = \phi^{-1}\left(\sum_{i=1}^{d_c-1}\phi(|v_i|)\right)\times\prod_{i=1}^{d_c-1}\operatorname{sign}(v_i) \tag{7}$$
In the check node, as shown in FIG. 7, the computation of equation (7) is performed. That is, in FIG. 7, the message uj corresponding to the edge for which a calculation is to be performed is calculated by using the messages v1, v2, v3, v4, and v5 from the remaining edges connected to the check node. The messages corresponding to the other edges are also calculated similarly.
The function φ(x) can also be expressed as φ(x)=ln((e^x+1)/(e^x−1)), and when x>0, φ(x)=φ^−1(x). When the functions φ(x) and φ^−1(x) are implemented in hardware, they are in some cases implemented using an LUT (Look-Up Table), and both use the same LUT.
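Both properties stated above, that φ is its own inverse for x>0 and that equation (7) agrees with equation (2), can be checked numerically. The sketch below uses floating point instead of an LUT; the function names and test values are assumptions, and inputs equal to 0 are excluded because φ(0) is unbounded.

```python
import math


def phi(x):
    """phi(x) = ln((e^x + 1) / (e^x - 1)), defined for x > 0."""
    return math.log((math.exp(x) + 1.0) / (math.exp(x) - 1.0))


def check_node_phi(vs):
    """Equation (7): magnitude through phi, sign as the product of input signs."""
    magnitude = phi(sum(phi(abs(v)) for v in vs))   # phi also serves as its inverse
    sign = 1
    for v in vs:
        if v < 0:
            sign = -sign
    return sign * magnitude


def check_node_tanh(vs):
    """Equation (2) solved for u_j, used here as the reference."""
    prod = 1.0
    for v in vs:
        prod *= math.tanh(v / 2.0)
    return 2.0 * math.atanh(prod)


msgs = [1.2, -0.7, 2.5, 0.4, -1.9]
print(phi(phi(1.3)), check_node_phi(msgs), check_node_tanh(msgs))
```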
When the sum product algorithm is implemented as hardware, it is necessary to repeatedly perform the variable node computation expressed by equation (1) and the check node computation expressed by equation (7) with an appropriate circuit scale and at an appropriate operating frequency.
As an example of the implementation of the decoding apparatus, a description is given first of an implementation method in a case where decoding is performed by simply performing the computation of each node one-by-one in sequence (full serial decoding).
It is assumed here that, for example, codes (a coding rate of ⅔, and a code length of 90) represented by a 30 (rows)×90 (columns) check matrix of FIG. 8 are decoded. The number of 1s of the check matrix of FIG. 8 is 269; therefore, in the Tanner graph, the number of edges becomes 269. Here, in the check matrix of FIG. 8, 0 is represented by “.”.
FIG. 9 shows an example of the configuration of a decoding apparatus for decoding LDPC codes once.
In the decoding apparatus of FIG. 9, a message corresponding to one edge is calculated for each clock at which it operates.
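The figures quoted for the check matrix of FIG. 8 can be worked through as simple arithmetic. The sketch below makes two assumptions that the text does not state: the check matrix has full row rank, so the coding rate is (columns − rows)/columns, and the check node phase and the variable node phase each take one clock per edge without overlapping, with pipeline latency ignored. The function names are hypothetical.

```python
from fractions import Fraction


def code_rate(rows, cols):
    """Coding rate of a rows x cols check matrix, assuming full row rank."""
    return Fraction(cols - rows, cols)


def clocks_per_decoding(num_edges, phases=2):
    """Clocks for one pass when each phase handles one edge per clock."""
    return num_edges * phases


print(code_rate(30, 90))            # 30 x 90 check matrix
print(clocks_per_decoding(269))     # 269 edges, check phase plus variable phase
```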
More specifically, the decoding apparatus of FIG. 9 includes two memories 100 and 102 for edges, one check node calculator 101, one variable node calculator 103, one memory 104 for reception, and one control section 105.
In the decoding apparatus of FIG. 9, message data is read one-by-one from the memory 100 or 102 for edges, and by using the message data, the message data corresponding to the desired edge is calculated. Then, the message data determined by that calculation is stored one-by-one in the memory 100 or 102 for edges at a subsequent stage. When iterative decoding is to be performed, the iterative decoding is realized by serially concatenating a plurality of the decoding apparatuses of FIG. 9 for decoding LDPC codes once or by repeatedly using the decoding apparatus of FIG. 9. Here, it is assumed that, for example, a plurality of the decoding apparatuses of FIG. 9 are connected.
The memory 100 for edges stores messages D100 supplied from the variable node calculator 103 of the decoding apparatus (not shown) at a previous stage in the order in which the check node calculator 101 at a subsequent stage reads them. Then, at the phase of the check node calculation, the memory 100 for edges supplies, to the check node calculator 101, the messages D100 as a message output D101 in the order in which they are stored.
Based on the control signal D106 supplied from the control section 105, the check node calculator 101 performs a computation in accordance with equation (7) by using the message D101 supplied from the memory 100 for edges, and supplies a message D102 determined by that computation to the memory 102 for edges at a subsequent stage.
The memory 102 for edges stores the messages D102 supplied from the check node calculator 101 at a previous stage in the order in which the variable node calculator 103 at a subsequent stage reads them. Then, at the phase of the variable node calculation, the memory 102 for edges supplies the message D102 as a message D103 to the variable node calculator 103 in the order in which they are stored.
Furthermore, a control signal D107 is supplied to the variable node calculator 103 from the control section 105, and received data D104 is supplied thereto from the memory 104 for reception. Based on the control signal D107, the variable node calculator 103 performs a computation in accordance with equation (1) by using the message D103 supplied from the memory 102 for edges and the received data D104 supplied from the memory 104 for reception, and supplies a message D105 obtained as a result of the computation to the memory 100 for edges of the decoding apparatus (not shown) at a subsequent stage.
In the memory 104 for reception, the received data (the received LDPC codes) are stored. The control section 105 supplies a control signal D106 for controlling a check node computation and a control signal D107 for controlling a variable node computation to the check node calculator 101 and the variable node calculator 103, respectively. The control section 105 supplies the control signal D106 to the check node calculator 101 when the messages of all the edges are stored in the memory 100 for edges, and the control section 105 supplies the control signal D107 to the variable node calculator 103 when the messages of all the edges are stored in the memory 102 for edges.
FIG. 10 shows an example of the configuration of the check node calculator 101 of FIG. 9 for performing check node computations one-by-one.
In FIG. 10, the check node calculator 101 is shown by assuming that each message, together with the sign bit, is quantized into a total of six bits. Furthermore, in FIG. 10, a check node computation of LDPC codes represented by the check matrix of FIG. 8 is performed. Furthermore, a clock ck is supplied to the check node calculator 101 of FIG. 10, this clock ck being supplied to necessary blocks. Each block performs processing in synchronization with the clock ck.
Based on, for example, a 1-bit control signal D106 supplied from the control section 105, the check node calculator 101 of FIG. 10 performs computations in accordance with equation (7) by using the messages D101 that are read one-by-one from the memory 100 for edges.
More specifically, in the check node calculator 101, 6-bit messages D101 (messages vi) from the variable node, corresponding to each column of the check matrix, are read one-by-one, the absolute value D122 (|vi|), which is the lower-order bits thereof, is supplied to the LUT 121, and a sign bit D121, which is the highest bit thereof, is supplied to an EXOR circuit 129 and an FIFO (First In First Out) memory 133, respectively. Furthermore, the control signal D106 is supplied to the check node calculator 101 from the control section 105, and the control signal D106 is supplied to a selector 124 and a selector 131.
The LUT 121 reads a 5-bit computation result D123 (φ(|vi|)) such that the computation of φ(|vi|) in equation (7) is performed on the absolute value D122 (|vi|), and supplies it to an adder 122 and an FIFO memory 127.
The adder 122 integrates the computation results D123 by adding together the computation results D123 (φ(|vi|)) and a 9-bit value D124 stored in a register 123, and stores the 9-bit integration value obtained thereby in the register 123 again. When the computation results for the absolute values D122 (|vi|) of the messages D101 from all the edges over one row of the check matrix are integrated, the register 123 is reset.
When the messages D101 over one row of the check matrix have been read one-by-one and the integrated value of the computation results D123 for one row has been stored in the register 123, the control signal D106 supplied from the control section 105 changes from “0” to “1”. For example, when the row weight is “9”, the control signal D106 is “0” at the first to eighth clocks, and is “1” at the ninth clock.
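The timing of the control signal D106 can be sketched as a per-clock sequence derived from the row weights. This is a behavioural illustration only; the function name is hypothetical and rows are assumed to be processed back-to-back, one message per clock.

```python
def control_signal(row_weights):
    """Per-clock D106 values: 0 while a row is being read, 1 on its last message."""
    signal = []
    for weight in row_weights:
        signal.extend([0] * (weight - 1) + [1])
    return signal


print(control_signal([9]))   # row weight 9: "0" for eight clocks, then "1"
```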
When the control signal D106 is “1”, the selector 124 selects the value stored in the register 123, that is, the 9-bit value D124 (Σφ(|vi|) from i=1 to i=dc) obtained by integrating φ(|vi|) determined from the messages D101 (messages vi) from all the edges over one row of the check matrix, and outputs the value as a value D125 to a register 125, where it is stored. The register 125 supplies the stored value D125 as a 9-bit value D126 to the selector 124 and the subtractor 126. When the control signal D106 is “0”, the selector 124 selects the value D126 supplied from the register 125, and outputs the value to the register 125, where it is stored again. That is, until φ(|vi|) determined from the messages D101 (messages vi) from all the edges over one row of the check matrix are integrated, the register 125 supplies the previously integrated φ(|vi|) to the selector 124 and the subtractor 126.
On the other hand, the FIFO memory 127 delays the computation results D123 (φ(|vi|)) output by the LUT 121 until a new value D126 (Σφ(|vi|) from i=1 to i=dc) is output from the register 125, and supplies them as a 5-bit value D127 to the subtractor 126. The subtractor 126 subtracts, from the value D126 supplied from the register 125, the value D127 supplied from the FIFO memory 127, and supplies the result as a 5-bit subtracted value D128 to the LUT 128. That is, the subtractor 126 subtracts φ(|vi|) determined from the message D101 (message vi) from the edge to be determined, from the integrated value of φ(|vi|) determined from the messages D101 (messages vi) from all the edges over one row of the check matrix, and supplies the subtracted value (Σφ(|vi|) from i=1 to i=dc−1) as a subtracted value D128 to the LUT 128.
The LUT 128 outputs the 5-bit computation results D129 (φ−1(Σφ(|vi|))) such that the computation of φ−1(Σφ(|vi|)) in equation (7) is performed on the subtracted value D128 (Σφ(|vi|) from i=1 to i=dc−1).
In parallel with the above processing, the EXOR circuit 129 performs a multiplication of sign bits by computing the exclusive OR of a 1-bit value D131 stored in a register 130 and the sign bit D121, and stores the 1-bit multiplication result D130 in the register 130 again. When the sign bits D121 of the messages D101 from all the edges over one row of the check matrix are multiplied, the register 130 is reset.
When the multiplied results D130 (Πsign (vi) from i=1 to dc) such that the sign bits D121 of the messages D101 from all the edges over one row of the check matrix are multiplied are stored, the control signal D106 supplied from the control section 105 changes from “0” to “1”.
When the control signal D106 is “1”, the selector 131 selects the value stored in the register 130, that is, the value D131 (Πsign (vi) from i=1 to i=dc) such that the sign bits D121 of the messages D101 from all the edges over one row of the check matrix are multiplied, and outputs the value as a 1-bit value D132 to a register 132, whereby it is stored. The register 132 supplies the stored value D132 as a 1-bit value D133 to the selector 131 and the EXOR circuit 134. When the control signal D106 is “0”, the selector 131 selects the value D133 supplied from the register 132, and outputs the value to the register 132, whereby it is stored again. That is, until the sign bits D121 of the messages D101 (messages vi) from all the edges over one row of the check matrix are multiplied, the register 132 supplies the value stored at the previous time to the selector 131 and the EXOR circuit 134.
On the other hand, the FIFO memory 133 delays the sign bits D121 until a new value D133 (Πsign (vi) from i=1 to i=dc) is supplied from the register 132 to the EXOR circuit 134, and supplies the result as a 1-bit value D134 to the EXOR circuit 134. The EXOR circuit 134 divides the value D133 by the value D134 by computing the exclusive OR of the value D133 supplied from the register 132 and the value D134 supplied from the FIFO memory 133, and outputs a 1-bit divided result as a divided value D135. That is, the EXOR circuit 134 divides the multiplication value of the sign bits D121 (sign (vi)) of the messages D101 from all the edges over one row of the check matrix by the sign bit D121 (sign (vi)) of the message D101 from the edge to be determined, and outputs the divided value (Πsign (vi) from i=1 to i=dc−1) as a divided value D135.
In the check node calculator 101, a total of six bits such that the 5-bit computation result D129 output from the LUT 128 is the lower-order 5 bits and the 1-bit divided value D135 output from the EXOR circuit 134 is the highest-order bit is output as a message D102 (message uj).
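The 6-bit message layout described above, with the sign bit in the highest-order position and a 5-bit magnitude in the lower-order bits, can be illustrated with a minimal sketch. The function names `pack_message` and `unpack_message` are hypothetical and are introduced here only for illustration; they do not correspond to any block in the figures.

```python
def pack_message(sign_bit: int, magnitude: int) -> int:
    """Combine a 1-bit sign and a 5-bit magnitude into one 6-bit word.

    The sign bit becomes the highest-order bit (bit 5) and the
    magnitude occupies the five lower-order bits (bits 4..0).
    """
    return ((sign_bit & 0x1) << 5) | (magnitude & 0x1F)


def unpack_message(word: int) -> tuple[int, int]:
    """Split a 6-bit word back into (sign_bit, magnitude)."""
    return (word >> 5) & 0x1, word & 0x1F


if __name__ == "__main__":
    word = pack_message(1, 0b10110)
    print(format(word, "06b"), unpack_message(word))
```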
As described above, in the check node calculator 101, the computation of equation (7) is performed, and a message uj is determined.
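The accumulate-then-subtract structure of the check node calculator can be sketched in software as follows. This is an illustrative floating-point model, not the quantized LUT circuit of FIG. 10. It assumes that φ(x) = −ln(tanh(x/2)), a common choice that is its own inverse for x > 0, so the same function stands in for both LUTs; it also assumes nonzero input messages and a row weight of at least 2. The names `phi`, `check_node_serial`, and `check_node_direct` are hypothetical.

```python
import math


def phi(x: float) -> float:
    """Assumed phi(x) = -ln(tanh(x/2)); self-inverse for x > 0."""
    return -math.log(math.tanh(x / 2.0))


def check_node_serial(v: list[float]) -> list[float]:
    """Compute one outgoing message per edge of a single row.

    First pass: integrate phi(|v_i|) and XOR the sign bits (the
    roles of the adder/register and the EXOR/register pair), while
    keeping each term in a list that plays the role of a FIFO.
    Second pass: remove each edge's own contribution.
    """
    total = 0.0
    sign_acc = 0
    mags, signs = [], []
    for vi in v:
        m = phi(abs(vi))
        s = 1 if vi < 0 else 0
        total += m
        sign_acc ^= s          # XOR acts as multiplication of signs
        mags.append(m)
        signs.append(s)
    out = []
    for m, s in zip(mags, signs):
        magnitude = phi(total - m)   # phi is used as its own inverse
        sign = sign_acc ^ s          # XOR acts as division of signs
        out.append(-magnitude if sign else magnitude)
    return out


def check_node_direct(v: list[float]) -> list[float]:
    """Reference: u_j = 2 * atanh(product of tanh(v_i / 2), i != j)."""
    out = []
    for j in range(len(v)):
        prod = 1.0
        for i, vi in enumerate(v):
            if i != j:
                prod *= math.tanh(vi / 2.0)
        out.append(2.0 * math.atanh(prod))
    return out


if __name__ == "__main__":
    msgs = [1.2, -0.7, 2.5, -3.1]
    print(check_node_serial(msgs))
    print(check_node_direct(msgs))
```

The two functions agree up to floating-point rounding, which is the reason the hardware can compute the sum over all edges once and then subtract one term per edge instead of recomputing the sum for each edge.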
Since the maximum of the row weight of the check matrix of FIG. 8 is 9, that is, since the maximum number of the messages supplied to the check node is 9, the check node calculator 101 has an FIFO memory 127 and the FIFO memory 133 for delaying nine messages (φ(|vi|)). When a message of the row whose weight is less than 9 is to be calculated, the amount of delay in the FIFO memory 127 and the FIFO memory 133 is reduced to the value of the row weight.
FIG. 11 shows an example of the configuration of the variable node calculator 103 of FIG. 9, for performing variable node calculations one-by-one.
In FIG. 11, the variable node calculator 103 is shown by assuming that each message, together with the sign bit, is quantized into a total of six bits. In FIG. 11, the variable node computation of LDPC codes represented by the check matrix of FIG. 8 is performed. Furthermore, a clock ck is supplied to the variable node calculator 103 of FIG. 11, and the clock ck is supplied to necessary blocks. Each block performs processing in synchronization with the clock ck.
Based on, for example, a 1-bit control signal D107 supplied from the control section 105, the variable node calculator 103 of FIG. 11 performs computations in accordance with equation (1) by using the messages D103 that are read one-by-one from the memory 102 for edges and the received data D104 that is read from the memory 104 for reception.
More specifically, in the variable node calculator 103, 6-bit messages D103 (messages uj) from the check node corresponding to each row of the check matrix are read one-by-one, and the messages D103 are supplied to the adder 151 and the FIFO memory 155. Furthermore, in the variable node calculator 103, 6-bit received data D104 are read one-by-one from the memory 104 for reception, and are supplied to the adder-subtractor 156. Furthermore, a control signal D107 is supplied to the variable node calculator 103 from the control section 105, and the control signal D107 is supplied to a selector 153.
The adder 151 integrates the messages D103 by adding together the messages D103 (messages uj) and a 9-bit value D151 stored in the register 152, and stores the 9-bit integrated value in the register 152 again. When the messages D103 from all the edges over one column of the check matrix are integrated, the register 152 is reset.
When the messages D103 from all the edges over one column of the check matrix are read one-by-one, and the value such that the messages D103 for one column are integrated is stored in the register 152, the control signal D107 supplied from the control section 105 changes from “0” to “1”. For example, when the column weight is “5”, the control signal D107 is “0” at the first clock up to the fourth clock and is “1” at the fifth clock.
When the control signal D107 is “1”, the selector 153 selects the value stored in the register 152, that is, a 9-bit value D151 (Σuj from j=1 to dv) such that the messages D103 (messages uj) from all the edges over one column of the check matrix are integrated, and outputs the value to the register 154, whereby it is stored. The register 154 supplies the stored value D151 as a 9-bit value D152 to the selector 153 and the adder-subtractor 156. When the control signal D107 is “0”, the selector 153 selects the value D152 supplied from the register 154, and outputs the value to the register 154, whereby it is stored again. That is, until the messages D103 (messages uj) from all the edges over one column of the check matrix are integrated, the register 154 supplies the previously integrated value to the selector 153 and the adder-subtractor 156.
On the other hand, the FIFO memory 155 delays the message D103 from the check node until a new value D152 (Σuj from j=1 to dv) is output from the register 154, and supplies it as a 6-bit value D153 to the adder-subtractor 156. The adder-subtractor 156 subtracts the value D153 supplied from the FIFO memory 155, from the value D152 supplied from the register 154. That is, the adder-subtractor 156 subtracts the message uj from the edge to be determined, from the integrated value of the messages D103 (messages uj) from all the edges over one column of the check matrix, and determines the subtracted value (Σuj from j=1 to dv−1). Furthermore, the adder-subtractor 156 adds the received data D104 supplied from the memory 104 for reception to the subtracted value (Σuj from j=1 to dv−1), and outputs the 6-bit value obtained thereby as a message D105 (message vi).
As described above, in the variable node calculator 103, the computation of equation (1) is performed, and the message vi is determined.
Since the maximum of the column weight of the check matrix of FIG. 8 is 5, that is, since the maximum number of the messages supplied to the variable node is 5, the variable node calculator 103 has an FIFO memory 155 for delaying five messages (uj). When a message of a column whose weight is less than 5 is to be calculated, the amount of delay in the FIFO memory 155 is reduced to the value of the column weight.
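The variable node computation and the timing of the control signal described above can be modeled with a short sketch. It uses plain integers rather than 6-bit and 9-bit quantized values, and the names `control_signal` and `variable_node` are hypothetical.

```python
def control_signal(column_weight: int) -> list[int]:
    """Model of a D107-like flag over one column.

    The flag is 0 while messages are still being integrated and
    becomes 1 at the clock on which the last message arrives.
    """
    return [1 if clock == column_weight else 0
            for clock in range(1, column_weight + 1)]


def variable_node(u: list[int], received: int) -> list[int]:
    """Outgoing message for each edge of one column.

    The sum over all incoming messages is formed once; each output
    is that sum minus the edge's own message, plus the received value.
    """
    total = sum(u)                       # adder + register
    return [received + (total - uj)      # adder-subtractor
            for uj in u]                 # uj replayed in FIFO order


if __name__ == "__main__":
    print(control_signal(5))
    print(variable_node([1, -2, 3], received=4))
```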
In the decoding apparatus of FIG. 9, a control signal is supplied from the control section 105 in accordance with the weight of the check matrix. According to the decoding apparatus of FIG. 9, if only the capacities of the memories for edges 100 and 102 and the FIFO memories 127, 133, and 155 of the check node calculator 101 and the variable node calculator 103 are sufficient, LDPC codes of various check matrices can be decoded by changing only the control signal.
Although not shown, in the decoding apparatus of FIG. 9, at the final stage of the decoding, instead of the variable node calculation of equation (1), the computation of equation (5) is performed, and the computation result is output as the final decoded result.
When LDPC codes are decoded by repeatedly using the decoding apparatus of FIG. 9, the check node computation and the variable node computation are alternately performed. That is, in the decoding apparatus of FIG. 9, a variable node computation is performed by the variable node calculator 103 by using the result of the check node computation by the check node calculator 101, and a check node computation is performed by the check node calculator 101 by using the result of the variable node computation by the variable node calculator 103.
Therefore, to perform one decoding using the check matrix of FIG. 8, which has 269 edges, 269×2=538 clocks are required. For example, in order to perform 50 iterative decodings, 538×50=26900 clock operations are necessary during the time in which one frame of 90 codes (received data), 90 being the code length, is received, and thus a high-speed operation approximately 300 (≅26900/90) times as high as the receiving frequency becomes necessary. If the receiving frequency is assumed to be several tens of MHz, operation at a speed of GHz or higher is required.
Furthermore, in a case where, for example, 50 decoding apparatuses of FIG. 9 are concatenated to decode LDPC codes, a plurality of variable node calculations and check node calculations can be performed simultaneously. For example, while a variable node computation of the first frame is being performed, a check node computation of the second frame is performed, and a variable node computation of the third frame is performed. In this case, while 90 codes are received, since 269 edges need to be calculated, the decoding apparatus needs to operate at a frequency approximately 3 (≅269/90) times as high as the receiving frequency, and thus realization is sufficiently possible. However, in this case, the circuit scale becomes, in simple terms, 50 times as large as the decoding apparatus of FIG. 9.
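The clock budgets quoted above for the serial apparatus and for the arrangement of 50 concatenated apparatuses can be reproduced with a few lines of arithmetic. The variable names are chosen here for illustration only.

```python
EDGES = 269          # number of edges in the check matrix of FIG. 8
CODE_LENGTH = 90     # received symbols per frame
ITERATIONS = 50      # number of iterative decodings

# Serial apparatus: one check node pass plus one variable node pass
# per decoding, each visiting every edge once.
clocks_per_decoding = 2 * EDGES
serial_clocks = clocks_per_decoding * ITERATIONS
serial_ratio = serial_clocks / CODE_LENGTH

# 50 concatenated apparatuses: each stage handles one pass over the
# edges while one frame is received.
pipelined_ratio = EDGES / CODE_LENGTH

if __name__ == "__main__":
    print(clocks_per_decoding, serial_clocks)
    print(round(serial_ratio, 1), round(pipelined_ratio, 2))
```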
Next, a description is given of the implementation method of the decoding apparatus in a case where decoding is performed by simultaneously performing computations of all the nodes (full parallel decoding).
This implementation method is described in, for example, C. Howland and A. Blanksby, “Parallel Decoding Architectures for Low Density Parity Check Codes”, Symposium on Circuits and Systems, 2001.
FIGS. 12A to 12C show the configuration of examples of the decoding apparatus for decoding the codes (a coding rate of ⅔, and a code length of 90) represented by the check matrix of FIG. 8. FIG. 12A shows the overall configuration of the decoding apparatus. FIG. 12B shows the detailed configuration of the upper portion in the figure surrounded by the dotted line B, of the decoding apparatus of FIG. 12A. FIG. 12C shows the detailed configuration of the lower portion in the figure surrounded by the dotted line C, of the decoding apparatus of FIG. 12A.
The decoding apparatus of FIGS. 12A to 12C includes one memory 205 for reception, two edge interchange devices 200 and 203, two memories 202 and 206 for edges, a check node calculator 201 made up of 30 check node calculators 201_1 to 201_30, and a variable node calculator 204 made up of 90 variable node calculators 204_1 to 204_90.
In the decoding apparatus of FIGS. 12A to 12C, all the message data corresponding to 269 edges is read simultaneously from the memory 202 or 206 for edges, and by using the message data, new message data corresponding to the 269 edges is computed. Furthermore, all the new message data determined as a result of the computation is simultaneously stored in the memory 206 or 202 for edges at a subsequent stage. By repeatedly using the decoding apparatus of FIGS. 12A to 12C, iterative decoding is realized. Each section will now be described below in detail.
The memory 206 for edges simultaneously stores all the messages D206_1 to D206_90 from the variable node calculators 204_1 to 204_90 at a previous stage, reads the messages D206_1 to D206_90 as messages D207_1 to D207_90 at the next clock (the timing of the next clock), and supplies them as messages D200 (D200_1 to D200_90) to the edge interchange device 200 at the subsequent stage. The edge interchange device 200 rearranges (interchanges) the order of the messages D200_1 to D200_90 supplied from the memory 206 for edges in accordance with the check matrix of FIG. 8, and supplies them as messages D201_1 to D201_30 to the check node calculators 201_1 to 201_30.
The check node calculators 201_1 to 201_30 perform a computation in accordance with equation (7) by using the messages D201_1 to D201_30 supplied from the edge interchange device 200, and supply the messages D202_1 to D202_30 obtained as a result of the computation to the memory 202 for edges.
The memory 202 for edges simultaneously stores all the messages D202_1 to D202_30 supplied from the check node calculators 201_1 to 201_30 at the previous stage, and at the next time, supplies all the messages D202_1 to D202_30, as messages D203_1 to D203_30, to the edge interchange device 203 at the subsequent stage.
The edge interchange device 203 rearranges the order of the messages D203_1 to D203_30 supplied from the memory 202 for edges in accordance with the check matrix of FIG. 8, and supplies them as messages D204_1 to D204_90 to the variable node calculators 204_1 to 204_90.
The variable node calculators 204_1 to 204_90 perform a computation in accordance with equation (1) by using the messages D204_1 to D204_90 supplied from the edge interchange device 203 and the received data D205_1 to D205_90 supplied from the memory 205 for reception, and supply messages D206_1 to D206_90 obtained as a result of the computation to the memory 206 for edges at the subsequent stage.
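The role of the edge interchange devices can be illustrated as a fixed permutation derived from the check matrix. The sketch below uses a small hypothetical 2×4 matrix `H`, not the check matrix of FIG. 8, and assumes that edges are enumerated column by column on the variable node side and row by row on the check node side; the function names are likewise hypothetical.

```python
# Hypothetical toy check matrix (2 check nodes, 4 variable nodes).
H = [
    [1, 1, 0, 1],
    [0, 1, 1, 1],
]
ROWS, COLS = len(H), len(H[0])

# Edge order as produced by the variable node side (column by column).
col_order = [(r, c) for c in range(COLS) for r in range(ROWS) if H[r][c]]
# Edge order as consumed by the check node side (row by row).
row_order = [(r, c) for r in range(ROWS) for c in range(COLS) if H[r][c]]

# Permutation wired into the interchange toward the check nodes,
# and its inverse for the interchange toward the variable nodes.
to_rows = [col_order.index(edge) for edge in row_order]
to_cols = [row_order.index(edge) for edge in col_order]


def interchange(messages: list, permutation: list[int]) -> list:
    """Rearrange messages; output k takes input permutation[k]."""
    return [messages[k] for k in permutation]


if __name__ == "__main__":
    msgs = ["a", "b", "c", "d", "e", "f"]   # one message per edge
    by_row = interchange(msgs, to_rows)
    print(by_row)
    print(interchange(by_row, to_cols))
```

Because the permutation depends only on the positions of the 1s in the check matrix, it is fixed wiring in a fully parallel decoder, which is one reason such a decoder is tied to a particular code.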
FIG. 13 shows an example of the configuration of a check node calculator 201m (m=1, 2, . . . , 30) of FIGS. 12A to 12C for simultaneously performing check node calculations.
In the check node calculator 201m of FIG. 13, similarly to the check node calculator 101 of FIG. 10, the check node computation of equation (7) is performed, and the check node calculations are simultaneously performed for all the edges.
More specifically, in the check node calculator 201m of FIG. 13, all the messages D221_1 to D221_9 (vi) from the variable node corresponding to each row of the check matrix of FIG. 8, which are supplied from the edge interchange device 200, are read simultaneously, and the absolute values D222_1 to D222_9 (|vi|), which are the respective lower-order 5 bits thereof, are supplied to the LUTs 221_1 to 221_9, respectively. 1-bit sign bits D223_1 to D223_9, which are the highest-order bits of the messages D221_1 to D221_9 (vi), are supplied to the EXOR circuits 226_1 to 226_9, respectively, and are also supplied to the EXOR circuit 225.
The LUTs 221_1 to 221_9 read 5-bit computation results D224_1 to D224_9 (φ(|vi|)) such that the computation of φ(|vi|) in equation (7) is performed on the absolute values D222_1 to D222_9 (|vi|), respectively, and supply them to the respective subtractors 223_1 to 223_9. The LUTs 221_1 to 221_9 also supply the computation results D224_1 to D224_9 (φ(|vi|)) to an adder 222.
The adder 222 computes the total sum of the values of the computation results D224_1 to D224_9 (φ(|vi|)) (the total sum of the computation results for one row), and supplies the 9-bit computation result D225 (Σφ(|vi|) from i=1 to 9) to the subtractors 223_1 to 223_9. The subtractors 223_1 to 223_9 subtract the computation results D224_1 to D224_9 (φ(|vi|)) from the computation result D225, respectively, and supply the 5-bit subtracted values D227_1 to D227_9 to the LUTs 224_1 to 224_9. That is, the subtractors 223_1 to 223_9 subtract φ(|vi|) determined from the message vi from the edge to be determined, from the integrated value of φ(|vi|) determined from the messages vi from all the edges, and supply the subtracted values D227_1 to D227_9 (Σφ(|vi|) from i=1 to 8) to the LUTs 224_1 to 224_9, respectively. The LUTs 224_1 to 224_9 read the 5-bit computation results D228_1 to D228_9 such that the computation of φ−1 (Σφ(|vi|)) in equation (7) is performed on the subtracted values D227_1 to D227_9, and output them.
On the other hand, the EXOR circuit 225 performs a multiplication of the sign bits D223_1 to D223_9 by computing the exclusive OR of all the sign bits D223_1 to D223_9, and supplies a 1-bit multiplication value D226 (multiplication value of the sign bits for one row (Πsign (vi) from i=1 to 9)) to the respective EXOR circuits 226_1 to 226_9. By computing the exclusive OR of the multiplication value D226 and the sign bits D223_1 to D223_9, respectively, the EXOR circuits 226_1 to 226_9 determine 1-bit divided values D229_1 to D229_9 (Πsign (vi) from i=1 to 8) such that the multiplication value D226 is divided by the sign bits D223_1 to D223_9, respectively, and output them.
In the check node calculator 201m, a total of six bits such that the 5-bit computation results D228_1 to D228_9 output from the LUTs 224_1 to 224_9 are each made to be the five lower-order bits and the divided values D229_1 to D229_9 output from the EXOR circuits 226_1 to 226_9 are each made to be the highest-order bit is output as messages D230_1 to D230_9 obtained as a result of the check node computation.
In the manner described above, in the check node calculator 201m, the computation of equation (7) is performed, and the message uj is determined.
In FIG. 13, the check node calculator 201m is shown by assuming that each message, together with the sign bit, is quantized to a total of six bits. The circuit of FIG. 13 corresponds to one check node. For the check matrix to be processed here in FIG. 8, since check nodes of 30 rows, which is the number of the rows thereof, exist, the decoding apparatus of FIGS. 12A to 12C has 30 check node calculators 201m shown in FIG. 13.
In the check node calculator 201m of FIG. 13, nine messages can be calculated simultaneously. For the row weight of the check matrix to be processed here in FIG. 8, the weight of the first row is 8, and the weight of the second to 30th rows is 9, that is, there is one case in which the number of messages supplied to the check node is 8 and there are 29 cases in which the number of messages is 9. Therefore, the check node calculator 201_1 has a circuit configuration capable of simultaneously calculating eight messages similarly to the circuit of FIG. 13, and the remaining check node calculators 201_2 to 201_30 are configured in the same way as for the circuit of FIG. 13.
FIG. 14 shows an example of the configuration of a variable node calculator 204p (p=1, 2, . . . , 90) of FIGS. 12A to 12C for simultaneously performing variable node computations.
In the variable node calculator 204p of FIG. 14, similarly to the variable node calculator 103 of FIG. 11, the variable node computations of equation (1) are performed, and the variable node computations are simultaneously performed for all the edges.
More specifically, in the variable node calculator 204p of FIG. 14, all the 6-bit messages D251_1 to D251_5 (messages uj) from the check node corresponding to each row of the check matrix, which are supplied from the edge interchange device 203, are read simultaneously, and these messages are supplied to the respective adders-subtractors 252_1 to 252_5 and are also supplied to the adder 251. Furthermore, received data D271 is supplied to the variable node calculator 204p from the memory 205 for reception, and the received data D271 is supplied to the adders-subtractors 252_1 to 252_5.
The adder 251 integrates all the messages D251_1 to D251_5 (messages uj), and supplies a 9-bit integrated value D252 (the total sum value of messages for one column (Σuj from j=1 to 5)) to the adders-subtractors 252_1 to 252_5. The adders-subtractors 252_1 to 252_5 subtract the messages D251_1 to D251_5 (messages uj) from the integrated value D252, respectively. That is, the adders-subtractors 252_1 to 252_5 subtract the messages D251_1 to D251_5 (messages uj) from the edge to be determined, from the integrated value D252 of the messages uj from all the edges, respectively, and determine the subtracted value (Σuj from j=1 to 4).
Furthermore, the adders-subtractors 252_1 to 252_5 add the received data D271 (u0i) to the subtracted value (Σuj from j=1 to 4), and output 6-bit added values D253_1 to D253_5 as the results of the variable node computations.
In the manner described above, in the variable node calculator 204p, the computation of equation (1) is performed, and the message vi is determined.
In FIG. 14, the variable node calculator 204p is shown by assuming that each message, together with the sign bit, is quantized to six bits. The circuit of FIG. 14 corresponds to one variable node. For the check matrix to be processed here in FIG. 8, since variable nodes of 90 columns, which is the number of the columns thereof, exist, the decoding apparatus of FIGS. 12A to 12C has 90 circuits shown in FIG. 14.
In the variable node calculator 204p of FIG. 14, it is possible to simultaneously calculate five messages. The check matrix to be processed here in FIG. 8 has 15, 45, 29, and 1 columns having weights of 5, 3, 2, and 1, respectively. Therefore, 15 variable node calculators out of the variable node calculators 204_1 to 204_90 have the same circuit configuration as that of the circuit of FIG. 14. The remaining 45, 29, and 1 variable node calculators have the circuit configuration capable of simultaneously calculating 3, 2, and 1 messages similarly to the circuit of FIG. 14.
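The column weight distribution given above can be checked against the figures of 90 columns and 269 edges stated for the check matrix of FIG. 8. As a further consistency check, the sketch also solves for how many of the 30 rows must have weight 8 and how many weight 9, assuming, as described for FIG. 13, that these are the only two row weights. The variable names are chosen for illustration.

```python
# Column weight -> number of columns with that weight.
column_weights = {5: 15, 3: 45, 2: 29, 1: 1}

num_columns = sum(column_weights.values())
num_edges = sum(weight * count for weight, count in column_weights.items())

# Rows: n8 + n9 = 30 and 8*n8 + 9*n9 = num_edges.
NUM_ROWS = 30
n9 = num_edges - 8 * NUM_ROWS   # subtract 8 per row; remainder counts weight-9 rows
n8 = NUM_ROWS - n9

if __name__ == "__main__":
    print(num_columns, num_edges)   # columns and edges
    print(n8, n9)                   # rows of weight 8 and of weight 9
```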
Although not shown, also, in the decoding apparatus of FIGS. 12A to 12C, similarly to the case of FIG. 9, at the final stage of the decoding, instead of the variable node calculation of equation (1), the computation of equation (5) is performed, and the computation result is output as the final decoded result.
According to the decoding apparatus of FIGS. 12A to 12C, it is possible to simultaneously calculate all the messages corresponding to 269 edges at one clock.
When decoding is performed by repeatedly using the decoding apparatus of FIGS. 12A to 12C, the check node computation and the variable node computation are alternately performed, and one decoding can be performed in two clocks. Therefore, for example, in order to perform 50 decodings, the decoding apparatus needs to operate for only 2×50=100 clocks while one frame of received data having a code length of 90 is received, and thus approximately the same operating frequency as the receiving frequency may be used. In general, since the code length of the LDPC codes is as great as several thousands to several tens of thousands, if the decoding apparatus of FIGS. 12A to 12C is used, the number of decodings can be greatly increased, and an improvement in the error correction performance can be expected.
However, in the decoding apparatus of FIGS. 12A to 12C, since computations of messages corresponding to all the edges of a Tanner graph are performed in parallel, the circuit scale increases in proportion to the code length. When the decoding apparatus of FIGS. 12A to 12C is configured as an apparatus for performing the decoding of LDPC codes having a particular check matrix, of a particular code length and a particular coding rate, it is difficult for the decoding apparatus to perform the decoding of LDPC codes having another check matrix, of another code length and another coding rate. That is, unlike the decoding apparatus of FIG. 9, it is difficult for the decoding apparatus of FIGS. 12A to 12C to deal with the decoding of various codes merely by changing the control signal, and the dependence on codes is high.
In addition to the decoding apparatus of FIG. 9 and FIGS. 12A to 12C, the implementation method for simultaneously calculating messages in units of four messages rather than one message or all messages is described in, for example, E. Yeo, P. Pakzad, B. Nikolic and V. Anantharam, “VLSI Architectures for Iterative Decoders in Magnetic Recording Channels”, IEEE Transactions on Magnetics, Vol. 37, No. 2, March 2001. In this case, there are problems in that, generally, it is not easy to avoid simultaneous read-out from or simultaneous writing to different addresses of the memory, and memory access control is difficult.
Furthermore, a method of implementation by approximating the sum product algorithm has also been proposed. However, this method causes performance to deteriorate. For implementing the sum product algorithm as hardware, there are, as described above, a method in which computations of messages corresponding to the edges (a check node computation and a variable node computation) are serially performed one-by-one, a method in which all the computations of messages are performed in parallel (full parallel), and a method in which the computations of messages are performed in units of several computations in parallel (parallel).
However, in the method in which computations of messages corresponding to the edges are performed one-by-one, a high operating frequency is required. Accordingly, as a method for increasing throughput, a method for arranging the apparatus in a pipeline structure is known. In this case, the circuit scale, in particular, (the capacity of) the memory, increases.
In the method in which all the computations of messages are performed in parallel, the circuit scale for logic increases, and the dependence on codes is high.
In the method in which the computations of messages are performed in units of several computations in parallel, control of memory access is difficult.