1. Field of the Invention
The present invention relates to signal processing, and, in particular, to error correction encoding and decoding techniques such as low-density parity-check (LDPC) encoding and decoding.
2. Description of the Related Art
FIG. 1 shows one implementation of a parity-check matrix 100 that may be used to implement a regular, quasi-cyclic (QC) LDPC code. Parity-check matrix 100, commonly referred to as an H-matrix, comprises 40 circulants Bj,k that are arranged in r=4 rows of circulants where j=1, . . . , r and c=10 columns of circulants where k=1, . . . , c. A circulant is a sub-matrix that is either equal to an identity matrix or is obtained by cyclically shifting an identity matrix, and a quasi-cyclic LDPC code is an LDPC code in which all of the sub-matrices are circulants. In H-matrix 100, each circulant Bj,k is a p×p sub-matrix that may be obtained by circularly shifting a single p×p identity matrix. For purposes of this discussion, assume that p=72 such that H-matrix 100 has p×r=72×4=288 total rows and p×c=72×10=720 total columns. Since each circulant Bj,k is a permutation of an identity matrix, the Hamming weight (i.e., the number of entries having a value of one) of each column in a circulant and the Hamming weight of each row in a circulant are both equal to 1. Thus, the total Hamming weight wr for each row of H-matrix 100 is equal to 1×c=1×10=10, and the total Hamming weight wc for each column of H-matrix 100 is equal to 1×r=1×4=4. Each of the 288 rows of H-matrix 100 corresponds to an mth check node, where m ranges from 1, . . . , 288, and each of the 720 columns corresponds to an nth variable node (also referred to as a bit node), where n ranges from 1, . . . , 720. Further, each check node is connected to wr=10 variable nodes as indicated by the 1s in a row, and each variable node is connected to wc=4 check nodes as indicated by the 1s in a column. The LDPC code defined by H-matrix 100 may be described as regular since all rows of H-matrix 100 have the same Hamming weight wr and all columns of H-matrix 100 have the same Hamming weight wc.
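By way of illustration only, the dimensions and weights stated above may be checked with a short sketch that assembles a quasi-cyclic H-matrix from circulants. The cyclic shifts of FIG. 1 are not given in the text, so the shift table `SHIFTS` below is purely hypothetical, as are the function names.

```python
def circulant(p, shift):
    """Return a p x p identity matrix whose ones are cyclically shifted by `shift` columns."""
    return [[1 if col == (row + shift) % p else 0 for col in range(p)]
            for row in range(p)]


def build_qc_h_matrix(p, shifts):
    """Assemble an H-matrix from an r x c table of circulant shift values."""
    h = []
    for shift_row in shifts:
        blocks = [circulant(p, s) for s in shift_row]
        for i in range(p):
            # Concatenate row i of every circulant in this circulant row.
            h.append([bit for block in blocks for bit in block[i]])
    return h


P, R, C = 72, 4, 10
# Hypothetical shift table (assumption); any shifts give the same weights.
SHIFTS = [[(7 * j + 3 * k) % P for k in range(C)] for j in range(R)]
H = build_qc_h_matrix(P, SHIFTS)

row_weights = {sum(row) for row in H}
col_weights = {sum(col) for col in zip(*H)}
print(len(H), len(H[0]), row_weights, col_weights)  # 288 720 {10} {4}
```

Because every circulant contributes exactly one 1 to each of its rows and columns, the row and column weights depend only on c and r, not on the particular shifts chosen.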
FIG. 2 shows a simplified block diagram of one implementation of a prior-art LDPC decoder 200 that may be used to decode a signal encoded using an H-matrix such as H-matrix 100 of FIG. 1. LDPC decoder 200 receives 720 soft values (e.g., log-likelihood ratios) from a soft detector such as a soft-output Viterbi detector and stores these soft values in soft-value memory 202. Each soft value corresponds to one bit of a received LDPC-encoded codeword. The encoded codeword is decoded iteratively using a belief propagation technique, where each iteration is performed in a number of clock cycles that is equal to the number c of circulant columns (e.g., 10 clock cycles/iteration for H-matrix 100).
During the first clock cycle of the initial iteration, soft-value memory 202 provides the first 72 of 720 soft values in parallel to 72 variable node units (VNUs) 204(0), . . . , (71), such that each soft value is provided to a different VNU 204. VNUs 204(0), . . . , (71) perform variable node updates for the first 72 columns of H-matrix 100 (i.e., for the first circulant column comprising circulants B1,1, B2,1, B3,1, and B4,1). Specifically, each VNU 204 generates one variable node message for each of the four circulants B1,1, B2,1, B3,1, and B4,1 (e.g., one message for each column entry having a value of 1 implies four messages per column), such that the total number of variable node messages generated by VNUs 204(0), . . . , (71) is equal to 4×72=288. During the initial iteration (i.e., i=0), each variable node message may be generated per Equation (1) as follows:

$$ Q_{nm}^{(0)} = L_n^{(0)}, \tag{1} $$

where Qnm(0) is the variable node message provided from the nth variable node to the mth check node for the 0th iteration and Ln(0) is the initial soft value received from soft-value memory 202 that corresponds to the nth variable node. The operation of VNUs 204(0), . . . , (71) is discussed in further detail below in relation to FIG. 3.
VNUs 204(0), . . . , (71) provide the 4×72 variable node messages (herein referred to as Q messages) that they generate to four 72-way barrel shifters 206(0), . . . , (3). In particular, the 72 Q messages generated in relation to circulant B1,1, the 72 Q messages generated in relation to circulant B2,1, the 72 Q messages generated in relation to circulant B3,1, and the 72 Q messages generated in relation to circulant B4,1 are provided to separate barrel shifters 206(0), . . . , (3), respectively. Barrel shifters 206(0), . . . , (3) cyclically shift the Q messages that they receive based on cyclic-shift factors that (i) correspond to the cyclic shifts of circulants B1,1, B2,1, B3,1, and B4,1 of H-matrix 100 of FIG. 1 and (ii) may be received from, for example, controller 214. The four barrel shifters 206 then provide 4×72 cyclically shifted Q messages to 4×72 check node units (CNUs) 208(0), . . . , (287), such that each CNU 208 receives a different one of the Q messages.
During the second clock cycle of the first iteration, VNUs 204(0), . . . , (71) receive the second 72 of 720 soft values from soft-value memory 202. VNUs 204(0), . . . , (71) perform variable node updates for the second 72 columns of H-matrix 100 (i.e., for the second circulant column comprising circulants B1,2, B2,2, B3,2, and B4,2) in a manner similar to that described above in relation to the first clock cycle (e.g., using Equation (1)) and provide 4×72 Q messages to barrel shifters 206(0), . . . , (3). Barrel shifters 206(0), . . . , (3) cyclically shift the 4×72 Q messages according to the cyclic shifts of circulants B1,2, B2,2, B3,2, and B4,2 of H-matrix 100 of FIG. 1 and provide 4×72 cyclically shifted Q messages to check node units (CNUs) 208(0), . . . , (287). Note that cyclic shifting of the 4×72 Q messages is performed such that each Q message is distributed to the same CNU as the Q message from the prior clock cycle that corresponds to the same row (i.e., the same check node) of H-matrix 100. This process is repeated for the remaining eight circulant columns during the remaining eight clock cycles of the iteration.
Referring now to CNUs 208(0), . . . , (287), during the first iteration (i.e., the first 10 clock cycles), each of the CNUs receives a number of Q messages equal to the Hamming weight wr of a row of H-matrix 100 (e.g., 10) and generates wr check node messages. Each check node message may be calculated using a min-sum algorithm, characterized by Equations (2), (3), and (4) shown below:
$$ R_{mn}^{(i)} = \delta_{mn}^{(i)} \, \kappa_{mn}^{(i)} \tag{2} $$

$$ \kappa_{mn}^{(i)} = \left| R_{mn}^{(i)} \right| = \min_{n' \in N(m)/n} \left| Q_{n'm}^{(i-1)} \right| \tag{3} $$

$$ \delta_{mn}^{(i)} = \left( \prod_{n' \in N(m)/n} \operatorname{sign}\left( Q_{n'm}^{(i-1)} \right) \right), \tag{4} $$

where Rmn(i) represents the check node message (herein referred to as the R message) from the mth check node to the nth variable node for the ith iteration. Suppose that n′ is a variable node in the set N(m)/n of all variable nodes connected to the mth check node except for the nth variable node (i.e., n′ ∈ N(m)/n). The mth check node generates message Rmn(i) based on all Q messages received during the previous (i−1)th iteration from the set N(m)/n. Thus, in the embodiment of FIG. 2, each R message is generated based on nine Q messages (i.e., wr−1=10−1).
Message Rmn(i) may be calculated in several steps. First, the mth check node generates a sign δmn(i) for message Rmn(i) by taking the product of the signs of the Q messages in set N(m)/n as shown in Equation (4). This can also be performed using binary addition, such as a modulo 2 operation, of the sign bits rather than multiplication. Next, the mth check node generates a magnitude |Rmn(i)| for message Rmn(i) by determining the minimum magnitude of the Q messages in set N(m)/n as shown in Equation (3). Then, the mth check node multiplies sign δmn(i) by magnitude |Rmn(i)| as shown in Equation (2). Note that other variations of the min-sum algorithm are possible, such as an offset min-sum algorithm and a normalized min-sum algorithm. Further, CNU algorithms other than the min-sum algorithm, such as the sum-product algorithm, may be used.
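The three steps above may be sketched, for illustration only, as a direct evaluation of Equations (2), (3), and (4) for a single check node. The function names are hypothetical, and the sign of a zero-valued Q message is assumed here to be positive, a convention the text does not specify.

```python
def sign(x):
    """Return -1 for negative values and +1 otherwise (zero treated as positive: assumption)."""
    return -1 if x < 0 else 1


def check_node_update(q_messages):
    """Compute one R message per connected variable node using plain min-sum."""
    r_messages = []
    for n in range(len(q_messages)):
        # The set N(m)/n: every Q message except the one from variable node n.
        others = q_messages[:n] + q_messages[n + 1:]
        delta = 1
        for q in others:
            delta *= sign(q)                   # Equation (4)
        kappa = min(abs(q) for q in others)    # Equation (3)
        r_messages.append(delta * kappa)       # Equation (2)
    return r_messages


print(check_node_update([3, -1, 4, -2]))  # [1, -2, 1, -1]
```

The example uses a check node of degree four for brevity; for H-matrix 100 each check node would process ten Q messages.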
The min-sum algorithm described in Equations (2), (3), and (4) may be simplified using a value-reuse technique. For example, consider that, during an iteration, each CNU 208 receives ten Q messages and generates ten R messages. Each R message is generated using the nine Q messages in set N(m)/n (one message is excluded as described above). For nine of these R messages, the minimum magnitude of the Q messages generated using Equation (3) will be the same. For one of these R messages, the minimum magnitude of the Q messages will be the second smallest magnitude of the Q messages because the minimum magnitude of the Q messages will be excluded from the calculation as described above. Thus, it is not necessary to perform Equation (3) ten times for each CNU. Rather, each CNU may receive its corresponding ten Q messages during an iteration, store the two smallest magnitudes of those Q messages, and store an index value identifying the Q message that has the minimum magnitude. The index value may be used to match the second smallest magnitude with the correct R message.
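A minimal sketch of the value-reuse technique follows, assuming the Q messages arrive one at a time and only the two smallest magnitudes and one index are retained. The function names are illustrative and do not appear in the figures.

```python
def min1_min2_index(q_messages):
    """Track the smallest magnitude, the second smallest magnitude, and the index of the smallest."""
    min1 = min2 = float("inf")
    index = -1
    for n, q in enumerate(q_messages):
        magnitude = abs(q)
        if magnitude < min1:
            min2 = min1          # previous minimum becomes the second minimum
            min1 = magnitude
            index = n
        elif magnitude < min2:
            min2 = magnitude
    return min1, min2, index


def r_magnitudes(q_messages):
    """Return |R| for every edge: min2 for the edge that supplied min1, min1 for all others."""
    min1, min2, index = min1_min2_index(q_messages)
    return [min2 if n == index else min1 for n in range(len(q_messages))]


q = [5, -3, 7, 2, -9, 4, -6, 8, -1, 10]
print(min1_min2_index(q))  # (1, 2, 8)
print(r_magnitudes(q))     # [1, 1, 1, 1, 1, 1, 1, 1, 2, 1]
```

Only one pass over the ten Q messages is needed, in place of ten separate minimum searches over nine values each.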
Referring back to FIG. 2, the min-sum algorithm performed by CNUs 208(0), . . . , (287) may be a two-step process performed over two iterations. For example, during the ith iteration (i.e., 10 clock cycles), each CNU 208 receives and processes ten Q messages. These messages may be processed by (1) determining the minimum and second minimum values, (2) summing the signs of the ten Q messages, and (3) providing the signs of the ten Q messages sequentially to FIFO 210. Each CNU 208 does not begin outputting the ten R messages it generates until the (i+1)th iteration (i.e., after it has received all ten Q messages). During the (i+1)th iteration, each CNU 208 may receive ten new Q messages at a rate of one per clock cycle and may output the ten R messages at a rate of one per clock cycle. Upon being output, each R message is multiplied by a different sign value δmn(i) that may be obtained by adding (i) the sum of the signs of the ten Q messages and (ii) a sign received from FIFO 210 that corresponds to message Qnm(i−1). In so doing, one sign of the Q messages is excluded from sign value δmn(i) as shown in Equation (4) (i.e., sign δmn(i) is generated based on the signs of nine Q messages rather than ten).
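The sign handling described above may be illustrated as follows, assuming that each sign is stored as a single bit (0 for non-negative, 1 for negative) and that "adding" signs means modulo-2 addition (exclusive OR). The list `bits` stands in for FIFO 210; all names are hypothetical.

```python
def sign_bit(q):
    """Return 1 if q is negative and 0 otherwise (zero treated as non-negative: assumption)."""
    return 1 if q < 0 else 0


def r_sign_bits(q_messages):
    """Derive each R-message sign bit from the total sign and the stored per-message sign."""
    bits = [sign_bit(q) for q in q_messages]  # kept in arrival order, as a FIFO would
    total = 0
    for b in bits:
        total ^= b                            # modulo-2 sum of all the signs
    # XOR-ing a bit a second time cancels it, excluding that message's own sign.
    return [total ^ b for b in bits]


print(r_sign_bits([3, -1, 4, -2]))  # [0, 1, 0, 1]
```

Because exclusive OR is its own inverse, one running total plus the stored signs reproduces the product of Equation (4) for every edge without recomputing it per message.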
During each clock cycle, each barrel shifter 212 receives 72 R messages in parallel and cyclically shifts the R messages according to the cyclic shifts of the circulants Bj,k of H-matrix 100 of FIG. 1, which may be provided by controller 214. Essentially, barrel shifters 212(0), . . . , (3) reverse the cyclic shifting of barrel shifters 206(0), . . . , (3). Barrel shifters 212(0), . . . , (3) then provide the 4×72 cyclically shifted R messages to VNUs 204(0), . . . , (71), such that each VNU 204 receives four of the R messages.
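As a simple illustration of the forward and reverse cyclic shifting, the sketch below models a 72-way barrel shifter as a list rotation. The rotation direction and the example shift of 5 are assumptions, since the text does not specify them.

```python
def barrel_shift(messages, shift):
    """Rotate a list left by `shift` positions, with the shift taken modulo the list length."""
    shift %= len(messages)
    return messages[shift:] + messages[:shift]


def inverse_barrel_shift(messages, shift):
    """Undo barrel_shift by rotating the same amount in the opposite direction."""
    return barrel_shift(messages, -shift)


q_messages = list(range(72))              # stand-ins for 72 Q messages
shifted = barrel_shift(q_messages, 5)     # path toward the CNUs
restored = inverse_barrel_shift(shifted, 5)  # path back toward the VNUs
print(shifted[:3], restored == q_messages)   # [5, 6, 7] True
```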
During the second iteration, each VNU 204 updates each of the four Q messages that it generates as shown in Equation (5):
$$ Q_{nm}^{(i)} = L_n^{(0)} + \sum_{m' \in M(n)/m} R_{m'n}^{(i-1)}, \tag{5} $$

where m′ is a check node in the set M(n)/m of all check nodes connected to the nth variable node except the mth check node (i.e., m′ ∈ M(n)/m). The nth variable node generates message Qnm(i) based on (i) all R messages received during the previous (i−1)th iteration from the set M(n)/m and (ii) the initial soft value Ln(0) received from soft-value memory 202 that corresponds to the nth variable node.
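Equation (5) may be sketched as shown below, ignoring fixed-point effects such as normalization and saturation. The function name is illustrative. The sketch forms the full sum once and then subtracts each edge's own R message, which is arithmetically equivalent to summing over the set M(n)/m.

```python
def variable_node_update(soft_value, r_messages):
    """Return one Q message per connected check node, each excluding that node's own R message."""
    total = soft_value + sum(r_messages)
    return [total - r for r in r_messages]


print(variable_node_update(2, [1, -2, 1, -1]))  # [0, 3, 0, 2]
```

For example, the first Q message is 2 + (−2 + 1 − 1) = 0, which omits the first R message as Equation (5) requires.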
In addition to outputting four updated Q messages, each VNU 204 outputs both (i) a soft value (i.e., an extrinsic LLR) and (ii) a hard-decision bit for each variable node. Each extrinsic LLR value may be represented as shown in Equation (6):
$$ \text{Extrinsic Value}_n = \sum_{m \in M(n)} R_{mn}^{(i)}, \tag{6} $$

where m is a check node in the set M(n) of all check nodes connected to the nth variable node (i.e., m ∈ M(n)). Each hard-decision bit x̂n may be generated based on Equations (7), (8), and (9) below:
$$ P_n = L_n^{(0)} + \sum_{m \in M(n)} R_{mn}^{(i)} \tag{7} $$

$$ \hat{x}_n = 0 \text{ if } P_n \ge 0 \tag{8} $$

$$ \hat{x}_n = 1 \text{ if } P_n < 0 \tag{9} $$

Pn is determined for each variable node by adding the extrinsic value from Equation (6) to the initial soft value Ln(0) received from soft-value memory 202 that corresponds to the nth variable node. If Pn is greater than or equal to zero, then the hard-decision bit x̂n is set equal to zero as shown in Equation (8). If Pn is less than zero, then the hard-decision bit x̂n is set equal to one as shown in Equation (9).
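For illustration, the extrinsic value, the value Pn, and the hard-decision bit for one variable node may be computed as in the following sketch, which again ignores fixed-point effects. The function name is hypothetical.

```python
def hard_decision(soft_value, r_messages):
    """Return (extrinsic value, P, hard-decision bit) for one variable node."""
    extrinsic = sum(r_messages)      # sum of all incoming R messages
    p = soft_value + extrinsic       # add the initial soft value
    bit = 0 if p >= 0 else 1         # non-negative P maps to 0, negative P maps to 1
    return extrinsic, p, bit


print(hard_decision(2, [1, -2, 1, -1]))   # (-1, 1, 0)
print(hard_decision(-3, [1, -2, 1, -1]))  # (-1, -4, 1)
```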
A parity check is then performed using the hard-decision values. If x̂H^T=0, where H^T is the transpose of H-matrix 100 of FIG. 1, then the decoding process is finished. If x̂H^T≠0, then a subsequent iteration is performed to generate a new set of extrinsic LLR values and hard decisions. If the decoding process does not end within a predefined number of iterations, then the decoding process is terminated and the received codeword has not been properly decoded.
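The parity check may be sketched as a syndrome computation over GF(2). The small 3×6 matrix `H_SMALL` below is a hypothetical stand-in chosen for readability and is not H-matrix 100.

```python
def syndrome(h, x_hat):
    """Compute the product of x_hat and the transpose of h over GF(2), one bit per check node."""
    return [sum(hb & xb for hb, xb in zip(row, x_hat)) % 2 for row in h]


def passes_parity_check(h, x_hat):
    """Return True when every check node is satisfied (all-zero syndrome)."""
    return not any(syndrome(h, x_hat))


# Hypothetical toy matrix (assumption), used only to exercise the functions.
H_SMALL = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]
valid = [1, 0, 1, 1, 1, 0]
corrupted = [0, 0, 1, 1, 1, 0]   # first bit flipped

print(passes_parity_check(H_SMALL, valid))  # True
print(syndrome(H_SMALL, corrupted))         # [1, 0, 1]
```

A decoder along the lines described above would stop iterating as soon as the check passes, or give up after a predefined number of iterations.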
FIG. 3 shows a simplified block diagram of one implementation of a prior-art VNU 300 that may be used to implement each VNU 204 of FIG. 2. During each iteration, except for the initial iteration, VNU 300 (i) receives four R messages R1, R2, R3, and R4 and a soft value and (ii) generates four Q messages Q1, Q2, Q3, Q4 using Equation (5); a hard-decision output value x̂n using Equations (7), (8), and (9); and an extrinsic LLR value using Equation (6). The soft value, the R messages, the Q messages, and the extrinsic LLR value may each be represented using a number b of bits that may vary from one implementation to the next, while the hard output value x̂n is typically represented using one bit.
R messages R1, R2, R3, and R4 are converted from sign-magnitude format to two's-complement format using sign-magnitude-to-two's-complement (S2T) converters 302(0), . . . , (3), respectively. The four converted R messages are added together to generate the extrinsic LLR value as shown in Equation (6) using two adder stages. The first adder stage comprises (i) adder 304(0), which adds messages R1 and R2 (i.e., R1+R2), and (ii) adder 304(1), which adds messages R3 and R4 (i.e., R3+R4). The second adder stage comprises adder 306, which adds (i) the sum of messages R1 and R2 to (ii) the sum of messages R3 and R4 to generate the extrinsic LLR (i.e., R1+R2+R3+R4). The extrinsic LLR may be normalized and truncated (i.e., 308) as discussed in further detail below. The normalized, truncated extrinsic LLR value may then be saturated (e.g., SAT 312) and output from VNU 300. Saturation may be performed such that the normalized, truncated extrinsic value is maintained within a specified range. For example, if a range of ±15 is specified, a normalized, truncated extrinsic LLR value greater than +15 may be mapped to +15 and a normalized, truncated extrinsic LLR value less than −15 may be mapped to −15.
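The format conversion and saturation steps may be illustrated with the sketch below. It assumes a b-bit sign-magnitude word whose most significant bit is the sign and whose remaining b−1 bits are the magnitude; the word width b=5 is chosen only to match the ±15 range of the example, and the function names are hypothetical.

```python
def sign_magnitude_to_int(word, b):
    """Decode a b-bit sign-magnitude word into a signed integer."""
    magnitude = word & ((1 << (b - 1)) - 1)
    negative = (word >> (b - 1)) & 1
    return -magnitude if negative else magnitude


def int_to_twos_complement(value, b):
    """Encode a signed integer as a b-bit two's-complement word."""
    return value & ((1 << b) - 1)


def saturate(value, limit=15):
    """Clamp a value to the range [-limit, +limit]."""
    return max(-limit, min(limit, value))


r1 = sign_magnitude_to_int(0b10011, 5)             # sign bit 1, magnitude 3
print(r1, format(int_to_twos_complement(r1, 5), "05b"))  # -3 11101
print(saturate(23), saturate(-40), saturate(7))    # 15 -15 7
```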
The normalized, truncated extrinsic LLR value is also used to generate a hard-decision output value x̂n. In particular, the normalized, truncated extrinsic LLR value is provided to a third adder stage that comprises adder 310. Adder 310 generates a value P as shown in Equation (7) by adding the normalized, truncated extrinsic LLR value to the soft value (i.e., P=R1+R2+R3+R4+soft value). The sign bit of P is then used to generate the hard-decision value x̂n. When using two's-complement format, the most significant bit (MSB) (i.e., the bit furthest to the left) of a b-bit binary number is the sign bit of the binary number. If the sign bit is 0, then the binary number is ≥0, and if the sign bit is 1, then the binary number is <0. Thus, if the sign bit of P is 0, then P≥0 and the hard-decision value is 0 as shown in Equation (8). If the sign bit of P is 1, then P<0 and the hard-decision value is 1 as shown in Equation (9). The hard-decision value x̂n may be determined without any additional hardware, for example, by outputting only the MSB and dropping all other bits from P.
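The use of the sign bit as the hard decision may be sketched as follows, assuming P is held as a b-bit two's-complement word. The width b=6 and the function name are illustrative assumptions.

```python
def hard_decision_from_msb(p_word, b):
    """Return the most significant bit of a b-bit two's-complement word."""
    return (p_word >> (b - 1)) & 1


B = 6
p_negative = -4 & ((1 << B) - 1)   # 111100 in two's complement
p_positive = 5 & ((1 << B) - 1)    # 000101 in two's complement
print(hard_decision_from_msb(p_negative, B), hard_decision_from_msb(p_positive, B))  # 1 0
```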
Referring back to S2T converters 302(0), . . . , (3), converted messages R1, R2, R3, and R4 are normalized and truncated (i.e., 314(0), . . . , (3)) and provided to a fourth adder stage comprising adders 316(0), . . . , (3), such that each normalized, truncated R message is provided to a different adder 316. Each adder 316 generates a Q message as shown in Equation (5) based on (i) the R message that it receives and (ii) the value P generated by adder 310. In particular, message Q1 is generated by subtracting message R1 from P (i.e., Q1=R1+R2+R3+R4+soft value−R1), message Q2 is generated by subtracting message R2 from P (i.e., Q2=R1+R2+R3+R4+soft value−R2), message Q3 is generated by subtracting message R3 from P (i.e., Q3=R1+R2+R3+R4+soft value−R3), and message Q4 is generated by subtracting message R4 from P (i.e., Q4=R1+R2+R3+R4+soft value−R4). Messages Q1, Q2, Q3, and Q4 may then be saturated (e.g., SAT 318(0), . . . , (3)) in a manner similar to that described above in relation to SAT 312, converted from two's-complement format to sign-magnitude format (e.g., T2S 320(0), . . . , (3)), and output to downstream processing such as barrel shifters 206(0), . . . , (3) of FIG. 2.
In adding the R messages together, the sum of the R messages may grow relatively large, such that more than b bits may be needed to represent the generated Q messages. Normalization and truncation (i.e., 308 and 314(0), . . . , (3)) are employed to ensure that the number b of bits used to represent the R messages and Q messages remains constant. Normalization may be applied, for example, by dividing the sum and each of the four R messages by a factor of two. Truncation may be applied, for example, by deleting the least significant bit (LSB). Further, both normalization and truncation may be applied using no additional hardware. Normalization may be applied using connections that shift the bit values of the sum or R message. Truncation may be applied by removing the connection that corresponds to the LSB. As an example, suppose that b=4 bits and that adder 306 adds 1000 (i.e., −8 in decimal form) and 1000. The resulting sum, 10000 (i.e., −16 in decimal form), may be normalized (e.g., divided by two) by shifting 10000, such that the normalized sum is 1000.0 (i.e., −16/2=−8 in decimal form). The normalized sum 1000.0 may then be truncated to arrive at the 4-bit number 1000.
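The worked example above may be reproduced with the following sketch, which models normalization followed by truncation as a single arithmetic right shift by one bit. The helper names are hypothetical, and the shift is only one possible way to model the wiring described.

```python
def to_signed(word, bits):
    """Interpret a `bits`-wide two's-complement word as a signed integer."""
    sign_mask = 1 << (bits - 1)
    return (word & (sign_mask - 1)) - (word & sign_mask)


def to_word(value, bits):
    """Encode a signed integer as a `bits`-wide two's-complement word."""
    return value & ((1 << bits) - 1)


def normalize_and_truncate(value):
    """Divide by two and drop the fractional bit (arithmetic right shift)."""
    return value >> 1


a = to_signed(0b1000, 4)                  # -8
total = a + a                             # -16, which needs five bits
print(format(to_word(total, 5), "05b"))   # 10000
result = normalize_and_truncate(total)    # -8
print(format(to_word(result, 4), "04b"))  # 1000
```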