The present invention is directed to methods and apparatus for detecting and/or correcting errors in binary data, e.g., through the use of parity check codes such as low density parity check (LDPC) codes.
In the modern information age, binary values, e.g., ones and zeros, are used to represent and communicate various types of information, e.g., video, audio, statistical information, etc. Unfortunately, during storage, transmission, and/or processing of binary data, errors may be unintentionally introduced, e.g., a one may be changed to a zero or vice versa.
Generally, in the case of data transmission, a receiver observes each received bit in the presence of noise or distortion and only an indication of the bit's value is obtained. Under these circumstances one interprets the observed values as a source of "soft" bits. A soft bit indicates a preferred estimate of the bit's value, i.e., a one or a zero, together with some indication of that estimate's reliability. While the number of errors may be relatively low, even a small number of errors or level of distortion can result in the data being unusable or, in the case of transmission errors, may necessitate re-transmission of the data.
In order to provide a mechanism to check for errors and, in some cases, to correct errors, binary data can be coded to introduce carefully designed redundancy. Coding of a unit of data produces what is commonly referred to as a codeword. Because of its redundancy, a codeword will often include more bits than the input unit of data from which the codeword was produced.
When signals arising from transmitted codewords are received or processed, the redundant information included in the codeword as observed in the signal can be used to identify and/or correct errors in, or remove distortion from, the received signal in order to recover the original data unit. Such error checking and/or correcting can be implemented as part of a decoding process. In the absence of errors, or in the case of correctable errors or distortion, decoding can be used to recover, from the source data being processed, the original data unit that was encoded. In the case of unrecoverable errors, the decoding process may produce some indication that the original data cannot be fully recovered. Such indications of decoding failure can be used to initiate retransmission of the data.
While data redundancy can increase the reliability of the data to be stored or transmitted, it comes at the cost of storage space and/or the use of valuable communications bandwidth. Accordingly, it is desirable to add redundancy in an efficient manner, maximizing the amount of error correction/detection capacity gained for a given amount of redundancy introduced into the data.
With the increased use of fiber optic lines for data communication and increases in the rate at which data can be read from and stored to data storage devices, e.g., disk drives, tapes, etc., there is an increasing need not only for efficient use of data storage and transmission capacity but also for the ability to encode and decode data at high rates of speed.
While encoding efficiency and high data rates are important, for an encoding and/or decoding system to be practical for use in a wide range of devices, e.g., consumer devices, it is important that the encoders and/or decoders be capable of being implemented at reasonable cost. Accordingly, the ability to efficiently implement encoding/decoding schemes used for error correction and/or detection purposes, e.g., in terms of hardware costs, can be important.
Various types of coding schemes have been used over the years for error correction purposes. One class of codes, generally referred to as "turbo codes," was recently invented (1993). Turbo codes offer significant benefits over older coding techniques such as convolutional codes and have found numerous applications.
In conjunction with the advent of turbo codes, there has been increasing interest in another class of related, apparently simpler, codes commonly referred to as low density parity check (LDPC) codes. LDPC codes were actually invented by Gallager some 40 years ago (1961) but have only recently come to the fore. Turbo codes and LDPC codes are coding schemes that are used in the context of so-called iterative coding systems, that is, they are decoded using iterative decoders. Recently, it has been shown that LDPC codes can provide very good error detecting and correcting performance, surpassing or matching that of turbo codes for large codewords, e.g., codeword sizes exceeding approximately 1000 bits, given proper selection of LDPC coding parameters. Moreover, LDPC codes can potentially be decoded at much higher speeds than turbo codes.
In many coding schemes, longer codewords are often more resilient for purposes of error detection and correction due to the coding interaction over a larger number of bits. Thus, the use of long codewords can be beneficial in terms of increasing the ability to detect and correct errors. This is particularly true for turbo codes and LDPC codes. Thus, in many applications the use of long codewords, e.g., codewords exceeding a thousand bits in length, is desirable.
The main difficulty encountered in the adoption of LDPC coding and turbo coding in the context of long codewords, where the use of such codes offers the most promise, is the complexity of implementing these coding systems. In a practical sense, complexity translates directly into cost of implementation. Both of these coding systems are significantly more complex than traditionally used coding systems such as convolutional codes and Reed-Solomon codes.
Complexity analysis of signal processing algorithms usually focuses on operations counts. When attempting to exploit hardware parallelism in iterative coding systems, especially in the case of LDPC codes, significant complexity arises not from computational requirements but rather from routing requirements. The root of the problem lies in the construction of the codes themselves.
LDPC codes and turbo codes rely on interleaving messages inside an iterative process. In order for the code to perform well, the interleaving must have good mixing properties. This necessitates the implementation of a complex interleaving process.
LDPC codes are well represented by bipartite graphs, often called Tanner graphs, in which one set of nodes, the variable nodes, corresponds to bits of the codeword and the other set of nodes, the constraint nodes, sometimes called check nodes, corresponds to the set of parity-check constraints which define the code. Edges in the graph connect variable nodes to constraint nodes. A variable node and a constraint node are said to be neighbors if they are connected by an edge in the graph. For simplicity, we generally assume that a pair of nodes is connected by at most one edge. One bit of the codeword is associated with each variable node. In some cases some of these bits might be punctured or known, as discussed further below.
A bit sequence associated one-to-one with the variable node sequence is a codeword of the code if and only if, for each constraint node, the bits neighboring the constraint (via their association with variable nodes) sum to zero modulo two, i.e., they comprise an even number of ones.
The decoders and decoding algorithms used to decode LDPC codewords operate by exchanging messages within the graph along the edges and updating these messages by performing computations at the nodes based on the incoming messages. Such algorithms will be generally referred to as message passing algorithms. Each variable node in the graph is initially provided with a soft bit, termed a received value, that indicates an estimate of the associated bit's value as determined by observations from, e.g., the communications channel. Ideally, the estimates for separate bits are statistically independent. This ideal can be, and often is, violated in practice. A collection of received values constitutes a received word. For purposes of this application we may identify the signal observed by, e.g., the receiver in a communications system with the received word.
The number of edges attached to a node, i.e., a variable node or constraint node, is referred to as the degree of the node. A regular graph or code is one for which all variable nodes have the same degree, j say, and all constraint nodes have the same degree, k say. In this case we say that the code is a (j,k) regular code. These were the codes considered originally by Gallager (1961). In contrast to a "regular" code, an irregular code has constraint nodes and/or variable nodes of differing degrees. For example, some variable nodes may be of degree 4, others of degree 3 and still others of degree 2.
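The degree definitions above can be illustrated with a short sketch. It assumes the graph is given as a list of (variable node, constraint node) index pairs; the function names and the toy graph are hypothetical and are not taken from the figures.

```python
from collections import Counter

def node_degrees(edges):
    """Count the edges attached to each variable node and each constraint node.

    `edges` is a list of (variable_index, constraint_index) pairs.
    Nodes with no edges do not appear in the result.
    """
    var_deg = Counter(v for v, _ in edges)
    chk_deg = Counter(c for _, c in edges)
    return var_deg, chk_deg

def regular_parameters(edges):
    """Return (j, k) if the graph is (j, k) regular, otherwise None."""
    var_deg, chk_deg = node_degrees(edges)
    j_values = set(var_deg.values())
    k_values = set(chk_deg.values())
    if len(j_values) == 1 and len(k_values) == 1:
        return j_values.pop(), k_values.pop()
    return None

# Hypothetical toy graph: 4 variable nodes, each connected to both of
# 2 constraint nodes, so every variable node has degree 2 and every
# constraint node has degree 4.
toy_edges = [(v, c) for v in range(4) for c in range(2)]
print(regular_parameters(toy_edges))       # (2, 4)
print(regular_parameters(toy_edges[:-1]))  # None: removing an edge makes it irregular
```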
While irregular codes can be more complicated to represent and/or implement, it has been shown that irregular LDPC codes can provide superior error correction/detection performance when compared to regular LDPC codes.
In order to more precisely describe the decoding process we introduce the notion of a socket in describing LDPC graphs. A socket can be viewed as an association of an edge in the graph to a node in the graph. Each node has one socket for each edge attached to it and the edges are "plugged into" the sockets. Thus, a node of degree d has d sockets attached to it. If the graph has L edges then there are L sockets on the variable node side of the graph, called the variable sockets, and L sockets on the constraint node side of the graph, called the constraint sockets. For identification and ordering purposes, the variable sockets may be enumerated 1, . . . , L so that all variable sockets attached to one variable node appear contiguously. In such a case, if the first three variable nodes have degrees d1, d2, and d3 respectively, then variable sockets 1, . . . , d1 are attached to the first variable node, variable sockets d1+1, . . . , d1+d2 are attached to the second variable node, and variable sockets d1+d2+1, . . . , d1+d2+d3 are attached to the third variable node. Constraint node sockets may be enumerated similarly 1, . . . , L with all constraint sockets attached to one constraint node appearing contiguously. An edge can be viewed as a pairing of sockets, one of each pair coming from each side of the graph. Thus, the edges of the graph represent an interleaver or permutation on the sockets from one side of the graph, e.g., the variable node side, to the other, e.g., the constraint node side. The permutations associated with these systems are often complex, reflecting the complexity of the interleaver as indicated above, requiring complex routing of the message passing for their implementation.
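The socket enumeration and the resulting permutation can be sketched as follows. This is only an illustration under stated assumptions: sockets are numbered from 0 rather than 1, the order of sockets within a single node (which the text leaves open) is taken to follow the index of the neighboring node, and the function name and toy graph are hypothetical.

```python
def socket_permutation(edges):
    """Map each variable socket to the constraint socket of the same edge.

    `edges` is a list of (variable_index, constraint_index) pairs with at
    most one edge per pair. Returns a list `perm` where perm[s] is the
    constraint socket paired with variable socket s (both 0-based).
    """
    # Variable sockets: sockets of one variable node appear contiguously.
    var_order = sorted(edges, key=lambda e: (e[0], e[1]))
    # Constraint sockets: sockets of one constraint node appear contiguously.
    chk_order = sorted(edges, key=lambda e: (e[1], e[0]))
    chk_socket = {edge: s for s, edge in enumerate(chk_order)}
    return [chk_socket[edge] for edge in var_order]

# Hypothetical toy graph: variable nodes of degree 2, 1 and 2.
toy_edges = [(0, 0), (0, 1), (1, 1), (2, 0), (2, 1)]
print(socket_permutation(toy_edges))  # [0, 2, 3, 1, 4]
```

Even for this five-edge graph the mapping is not the identity; for long codes the permutation is what drives the routing complexity described above.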
The notion of message passing algorithms implemented on graphs is more general than LDPC decoding. The general view is a graph with nodes exchanging messages along edges in the graph and performing computations based on incoming messages in order to produce outgoing messages.
An exemplary bipartite graph 100 determining a (3,6) regular LDPC code of length ten and rate one-half is shown in FIG. 1. Length ten indicates that there are ten variable nodes V1-V10, each identified with one bit of the codeword X1-X10 (and no puncturing in this case), generally identified by reference numeral 102. Rate one half indicates that there are half as many check nodes as variable nodes, i.e., there are five check nodes C1-C5 identified by reference numeral 106. Rate one half further indicates that the five constraints are linearly independent, as discussed below. Each of the lines 104 represents an edge, e.g., a communication path or connection, between the check nodes and variable nodes to which the line is connected. Each edge identifies two sockets, one variable socket and one constraint socket. Edges can be enumerated according to their variable sockets or their constraint sockets. The variable sockets enumeration corresponds to the edge ordering (top to bottom) as it appears on the variable node side at the point where they are connected to the variable nodes. The constraint sockets enumeration corresponds to the edge ordering (top to bottom) as it appears on the constraint node side at the point they are connected to the constraint nodes. During decoding, messages are passed in both directions along the edges. Thus, as part of the decoding process messages are passed along an edge from a constraint node to a variable node and vice versa.
While FIG. 1 illustrates the graph associated with a code of length 10, it can be appreciated that representing the graph for a codeword of length 1000 would be 100 times more complicated.
An alternative to using a graph to represent codes is to use a matrix representation such as that shown in FIG. 2. In the matrix representation of a code, the matrix H 202, commonly referred to as the parity check matrix, includes the relevant edge connection, variable node and constraint node information. In the matrix H, each column corresponds to one of the variable nodes while each row corresponds to one of the constraint nodes. Since there are 10 variable nodes and 5 constraint nodes in the exemplary code, the matrix H includes 10 columns and 5 rows. The entry of the matrix corresponding to a particular variable node and a particular constraint node is set to 1 if an edge is present in the graph, i.e., if the two nodes are neighbors, otherwise it is set to 0. For example, since variable node V1 is connected to constraint node C1 by an edge, a one is located in the uppermost lefthand corner of the matrix 202. However, variable node V4 is not connected to constraint node C1 so a 0 is positioned in the fourth position of the first row of matrix 202 indicating that the corresponding variable and constraint nodes are not connected. We say that the constraints are linearly independent if the rows of H are linearly independent vectors over GF[2] (a Galois field of order 2). Enumerating edges by sockets, variable or constraint, corresponds to enumerating the 1's in H. Variable socket enumeration corresponds to enumerating top to bottom within columns and proceeding left to right from column to column, as shown in matrix 208. Constraint socket enumeration corresponds to enumerating left to right across rows and proceeding top to bottom from row to row, as shown in matrix 210.
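As a rough illustration of the matrix representation, the sketch below builds H from an edge list and tests linear independence of the constraints by computing the rank of H over GF[2]. The function names and the toy edge list are hypothetical; the toy graph is not the graph of FIG. 1, and the rank routine is ordinary Gaussian elimination on rows packed into integers.

```python
def parity_check_matrix(edges, n_vars, n_checks):
    """Build H with one row per constraint node and one column per variable node."""
    H = [[0] * n_vars for _ in range(n_checks)]
    for v, c in edges:
        H[c][v] = 1  # entry is 1 exactly when the two nodes are neighbors
    return H

def gf2_rank(H):
    """Rank of a 0/1 matrix over GF[2], using XOR-based elimination."""
    rows = [int("".join(str(b) for b in row), 2) for row in H]
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low_bit = pivot & -pivot  # lowest set bit serves as the pivot column
        rows = [r ^ pivot if r & low_bit else r for r in rows]
    return rank

# Hypothetical toy graph with 7 variable nodes and 3 constraint nodes.
toy_edges = [(0, 0), (1, 0), (2, 0), (4, 0),
             (1, 1), (2, 1), (3, 1), (5, 1),
             (0, 2), (2, 2), (3, 2), (6, 2)]
H = parity_check_matrix(toy_edges, n_vars=7, n_checks=3)
print(H[0])                   # [1, 1, 1, 0, 1, 0, 0]
print(gf2_rank(H) == len(H))  # True: the three constraints are linearly independent
```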
In the case of a matrix representation, the codeword X which is to be transmitted can be represented as a vector 206 which includes the bits X1-Xn of the codeword to be processed. A bit sequence X1-Xn is a codeword if and only if the product of the matrix 202 and the vector 206 is equal to zero, that is: Hx=0, where the arithmetic is performed over GF[2].
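The codeword condition can be checked directly by computing the product of H and x modulo two. The sketch below uses a small hypothetical parity check matrix chosen for readability (it is neither low-density nor the matrix of FIG. 2), and the function name `syndrome` is an illustrative choice.

```python
def syndrome(H, x):
    """Return H x over GF[2]; the all-zero result means x is a codeword."""
    return [sum(h * b for h, b in zip(row, x)) % 2 for row in H]

# Hypothetical 3 x 7 parity check matrix (toy example only).
H = [
    [1, 1, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 1],
]

x = [1, 0, 1, 1, 0, 0, 1]
print(syndrome(H, x))        # [0, 0, 0]: every constraint sees an even number of ones

x_err = [0] + x[1:]          # flip the first bit
print(syndrome(H, x_err))    # [1, 0, 1]: the constraints neighboring that bit fail
```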
In the context of discussing codewords associated to LDPC graphs, it should be appreciated that in some cases the codeword may be punctured. Puncturing is the act of removing bits from a codeword to yield, in effect, a shorter codeword. In the case of LDPC graphs this means that some of the variable nodes in the graph correspond to bits that are not actually transmitted. These variable nodes and the bits associated with them are often referred to as state variables. When puncturing is used, the decoder can be used to reconstruct the portion of the codeword which is not physically communicated over a communications channel. Where a punctured codeword is transmitted the receiving device may initially populate the missing received word values (bits) with ones or zeros assigned, e.g., in an arbitrary fashion, together with an indication (soft bit) that these values are completely unreliable, i.e., that these values are erased. For purposes of explaining the invention, we shall assume that, when used, these receiver-populated values are part of the received word which is to be processed.
Consider for example the system 350 shown in FIG. 3. The system 350 includes an encoder 352, a decoder 357 and a communication channel 356. The encoder 352 includes an encoding circuit 353 that processes the input data A to produce a codeword X. The codeword X includes, for the purposes of error detection and/or correction, some redundancy. The codeword X may be transmitted over the communications channel. Alternatively, the codeword X can be divided via a data selection device 354 into first and second portions X′, X″ respectively by some data selection technique. One of the codeword portions, e.g., the first portion X′, may then be transmitted over the communications channel to a receiver including decoder 357 while the second portion X″ is punctured. As a result of distortions produced by the communications channel 356, portions of the transmitted codeword may be lost or corrupted. From the decoder's perspective, punctured bits may be interpreted as lost.
At the receiver, soft bits are inserted into the received word to take the place of lost or punctured bits. The inserted soft bits indicate erasure of X″ and/or bits lost in transmission.
The decoder 357 will attempt to reconstruct the full codeword X from the received word Y and any inserted soft bits, and then perform a data decoding operation to produce A from the reconstructed codeword X.
The decoder 357 includes a channel decoder 358 for reconstructing the complete codeword X from the received word Y. In addition it includes a data decoder 359 for removing the redundant information included in the codeword to produce the original input data A from the reconstructed codeword X.
It will be appreciated that received words generated in conjunction with LDPC coding can be processed by performing LDPC decoding operations thereon, e.g., error correction and detection operations, to generate a reconstructed version of the original codeword. The reconstructed codeword can then be subject to data decoding to recover the original data that was coded. The data decoding process may be, e.g., simply selecting a specific subset of the bits from the reconstructed codeword.
LDPC decoding operations generally comprise message passing algorithms. There are many potentially useful message passing algorithms and the use of such algorithms is not limited to LDPC decoding. The current invention can be applied in the context of virtually any such message passing algorithm and therefore can be used in various message passing systems of which LDPC decoders are but one example.
For completeness we will give a brief mathematical description of one realization of one of the best known message passing algorithms, known as belief propagation.
Belief propagation for (binary) LDPC codes can be expressed as follows. Messages transmitted along the edges of the graph are interpreted as log-likelihoods

$$\log \frac{p_0}{p_1}$$

for the bit associated to the variable node. Here, $(p_0, p_1)$ represents a conditional probability distribution on the associated bit. The soft bits provided to the decoder by the receiver are also given in the form of a log-likelihood. Thus, the received values, i.e., the elements of the received word, are log-likelihoods of the associated bits conditioned on the observation of the bits provided by the communication channel. In general, a message m represents the log-likelihood m and a received value y represents the log-likelihood y. For punctured bits the received value y is set to 0, indicating $p_0 = p_1 = 1/2$.
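A minimal sketch of how a received word of log-likelihoods might be assembled, including the zero values used for punctured bits. The function names, the representation of channel observations as probabilities p0, and the list of transmitted positions are all assumptions made for illustration.

```python
import math

def log_likelihood(p0):
    """Return log(p0 / p1) for a bit that equals 0 with probability p0 (0 < p0 < 1)."""
    return math.log(p0 / (1.0 - p0))

def received_word(observed_p0, transmitted_positions, codeword_length):
    """Build the received word as a list of log-likelihoods.

    Positions that were not transmitted (punctured) keep the value 0.0,
    which corresponds to p0 = p1 = 1/2, i.e., an erasure.
    """
    word = [0.0] * codeword_length
    for pos, p0 in zip(transmitted_positions, observed_p0):
        word[pos] = log_likelihood(p0)
    return word

# Hypothetical example: a length-3 codeword whose middle bit is punctured.
y = received_word([0.9, 0.2], transmitted_positions=[0, 2], codeword_length=3)
print(y[0] > 0, y[1] == 0.0, y[2] < 0)  # True True True
```

A positive value favors a zero, a negative value favors a one, and the magnitude reflects the reliability of the estimate.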
Let us consider the message-passing rules of belief propagation. Messages are denoted by $m^{C2V}$ for messages from check nodes to variable nodes and by $m^{V2C}$ for messages from variable nodes to check nodes. Consider a variable node with d edges. For each edge i=1, . . . , d let $m^{C2V}(i)$ denote the incoming message on edge i. At the very beginning of the decoding process we set $m^{C2V}=0$ for every edge. Then, outgoing messages are given by

$$m^{V2C}(j) = y + \sum_{i=1}^{d} m^{C2V}(i) - m^{C2V}(j).$$
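The variable node rule can be sketched in a few lines: the outgoing message on edge j is the received value plus all incoming messages except the one that arrived on edge j. The function name is hypothetical and the messages are plain floating point numbers rather than the finite-precision values a hardware decoder would use.

```python
def variable_node_update(y, incoming):
    """Compute the outgoing V2C messages of one variable node.

    `y` is the received log-likelihood and `incoming` holds the C2V
    messages on the node's d edges. Edge j receives y plus the sum of
    all incoming messages minus the message that came in on edge j.
    """
    total = y + sum(incoming)
    return [total - m for m in incoming]

# First iteration: all incoming messages are 0, so y is sent on every edge.
print(variable_node_update(1.0, [0.0, 0.0, 0.0]))    # [1.0, 1.0, 1.0]

# Later iteration with non-zero incoming messages.
print(variable_node_update(0.5, [1.0, -2.0, 0.25]))  # [-1.25, 1.75, -0.5]
```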
At the check nodes it is more convenient to represent the messages using their 'sign' and magnitudes. Thus, for a message m let $m_p \in \mathrm{GF}[2]$ denote the 'parity' of the message, i.e., $m_p = 0$ if $m \ge 0$ and $m_p = 1$ if $m < 0$. Additionally let $m_r \in [0, \infty]$ denote the magnitude of m. Thus, we have $m = (-1)^{m_p} m_r$. At the check node the updates for $m_p$ and $m_r$ are separate. We have, for a check node of degree d,

$$m_p^{C2V}(j) = \left( \sum_{i=1}^{d} m_p^{V2C}(i) \right) - m_p^{V2C}(j),$$

where all addition is over GF[2], and

$$m_r^{C2V}(j) = F^{-1}\left( \left( \sum_{i=1}^{d} F\left( m_r^{V2C}(i) \right) \right) - F\left( m_r^{V2C}(j) \right) \right),$$

where we define $F(x) := \log \coth(x/2)$. (In both of the above equations the superscript V2C denotes the incoming messages at the check node.) We note that F is its own inverse, i.e., $F^{-1}(x) = F(x)$.
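The check node rule can be sketched as below. This is an illustration only: the function names are hypothetical, and the clamping constants `_EPS` and `_MAX` are assumptions added to keep the floating point evaluation of F finite at the extremes (F(0) is infinite and F of a large argument rounds to 0).

```python
import math

_EPS = 1e-12   # assumed lower clamp, avoids log(0)
_MAX = 50.0    # assumed upper clamp on magnitudes

def F(x):
    """F(x) = log coth(x/2) = -log tanh(x/2), with the argument clamped."""
    x = min(max(x, _EPS), _MAX)
    return -math.log(math.tanh(x / 2.0))

def check_node_update(incoming):
    """Compute the outgoing C2V messages of one check node from its V2C inputs."""
    parities = [1 if m < 0 else 0 for m in incoming]   # m_p
    f_values = [F(abs(m)) for m in incoming]           # F(m_r)
    parity_sum = sum(parities) % 2
    f_sum = sum(f_values)
    outgoing = []
    for p, f in zip(parities, f_values):
        out_parity = (parity_sum - p) % 2   # subtraction over GF[2]
        out_magnitude = F(f_sum - f)        # F is its own inverse
        outgoing.append((-1) ** out_parity * out_magnitude)
    return outgoing

# For a degree-2 check node each outgoing message equals the other
# incoming message, up to rounding, because F(F(x)) = x.
out = check_node_update([1.5, -0.7])
print(abs(out[0] + 0.7) < 1e-9, abs(out[1] - 1.5) < 1e-9)  # True True
```

The outgoing magnitude on an edge never exceeds the smallest magnitude among the other incoming messages, which matches the intuition that a parity check is only as reliable as its least reliable participant.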
Most message passing algorithms can be viewed as approximations to belief propagation. It will be appreciated that in any practical digital implementation messages will be comprised of a finite number of bits and the message update rules suitably adapted.
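As one possible illustration of finite-precision messages, the sketch below maps a log-likelihood onto a small set of uniformly spaced levels with saturation. The bit width, the step size and the function name are assumptions chosen for the example and are not taken from the text.

```python
def quantize(m, bits=5, step=0.5):
    """Quantize a message to a signed `bits`-bit level with uniform step size.

    Levels are symmetric around zero and saturate at +/-(2**(bits-1) - 1).
    Returns the quantized message value (level times step).
    """
    max_level = 2 ** (bits - 1) - 1
    level = round(m / step)
    level = max(-max_level, min(max_level, level))
    return level * step

print(quantize(0.6))     # 0.5: nearest representable value
print(quantize(100.0))   # 7.5: saturates at the largest level
print(quantize(-100.0))  # -7.5
```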
It should be apparent that the complexity associated with representing LDPC codes for large codewords is daunting, at least for hardware implementations trying to exploit parallelism. In addition, it can be difficult to implement message passing in a manner that can support processing at high speeds.
In order to make the use of LDPC codes more practical, there is a need for methods of representing LDPC codes corresponding to large codewords in an efficient and compact manner thereby reducing the amount of information required to represent the code, i.e., to describe the associated graph. In addition, there is a need for techniques which will allow the message passing associated with multiple nodes and multiple edges, e.g., four or more nodes or edges, to be performed in parallel in an easily controlled manner, thereby allowing even large codewords to be efficiently decoded in a reasonable amount of time. There is further need for a decoder architecture that is flexible enough to decode several different LDPC codes. This is because many applications require codes of different lengths and rates. Even more desirable is an architecture that allows the specification of the particular LDPC code to be programmable.