An error correcting decoder is typically implemented, e.g., in a network system, to reduce communication errors. One type of error correcting decoder is the iterative error correcting decoder. Iterative error correcting decoders typically use a large-scale parallel network of nodes that perform soft probability calculations. These nodes exchange probability information about a received data block with one another. After a certain number of iterations within an iterative decoder structure, the individual noisy information in a data block (or codeword) is transformed into an estimate of the codeword as a whole; that is, the probabilities associated with the received bit values are passed back and forth between the nodes until the most probable value of each data bit is resolved. Examples of iterative decoders include low density parity check (LDPC) decoders, Hamming decoders, Turbo decoders, and the like.
The structure of an iterative error correcting decoder can be represented graphically by a factor graph. A factor graph is a graphical representation of the linear space of codewords (e.g., LDPC codewords). A factor graph consists of nodes and edges, where the edges are simply the wire connections between the nodes, and each node represents a function of its inputs. For example, in a low density parity check (LDPC) factor graph, there are two types of nodes representing two distinct functions, namely “equality constraint” nodes and “parity check” nodes. According to the IEEE 802.3an (10GBASE-T) standard, the proposed LDPC decoder consists of (2048) equality constraint nodes and (384) parity check nodes. Each equality constraint node has (6) bidirectional connections to corresponding parity check nodes, and each parity check node has a total of (32) bidirectional connections to corresponding equality constraint nodes. This results in a factor graph with a network matrix of (12,288) bidirectional connections, where each connection consists of two sets of wires, one per direction, each having an N-bit width. For example, in a parallel LDPC iterative decoder with a message resolution of 8 bits, the decoder would contain a total of 12,288 × 2 × 8 = 196,608 wires.
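The connection and wire counts above follow from simple arithmetic, which the short sketch below reproduces. The constant and function names are illustrative only and do not come from the standard or from any particular implementation.

```python
# Node and edge counts quoted above for the 10GBASE-T LDPC factor graph.
EQUALITY_NODES = 2048          # equality constraint nodes
PARITY_NODES = 384             # parity check nodes
EDGES_PER_EQUALITY_NODE = 6    # connections per equality constraint node
EDGES_PER_PARITY_NODE = 32     # connections per parity check node


def connection_count():
    """Number of bidirectional connections in the factor graph."""
    from_equality_side = EQUALITY_NODES * EDGES_PER_EQUALITY_NODE
    from_parity_side = PARITY_NODES * EDGES_PER_PARITY_NODE
    # Every connection joins one node of each type, so both counts must agree.
    if from_equality_side != from_parity_side:
        raise ValueError("node degrees are inconsistent")
    return from_equality_side


def wire_count(message_bits):
    """Total wires: one N-bit bus in each direction per connection."""
    return connection_count() * 2 * message_bits


print(connection_count())  # 12288
print(wire_count(8))       # 196608
```

Counting from either node type gives the same number of connections, which is a quick consistency check on the quoted node degrees.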
An LDPC code is specified by a parity check matrix (commonly referred to as an H matrix) having very few “ones” per row. An example of an H matrix 100 is shown in FIG. 1. The length of each codeword is equal to the number of columns in the H matrix 100. In one example, each codeword is created such that the parity of each set of bits corresponding to the “ones” in a row is even. The number of rows corresponds to the number of parity checks that the codeword must satisfy. Therefore, if all errors in a received codeword are corrected by the decoder, all parity checks are satisfied for the output codeword.
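As an illustration of the even-parity condition and of how the two node types might exchange messages, the following is a minimal sketch. It assumes the min-sum approximation, which is only one common way to realize the node functions and is not stated here to be the method of any particular decoder. The matrix `H_TOY` is a hypothetical 3 × 7 example chosen for brevity (it is not the H matrix 100 of FIG. 1 and is not low density), and all function names are illustrative.

```python
def parity_checks_satisfied(H, word):
    """Return True if every row of H sees an even number of ones in word."""
    return all(
        sum(h * bit for h, bit in zip(row, word)) % 2 == 0
        for row in H
    )


def min_sum_decode(H, channel_llr, max_iters=10):
    """Min-sum message passing (assumed algorithm, for illustration only).

    A positive log-likelihood ratio (LLR) means bit value 0 is more likely.
    Returns (hard_decisions, iterations_used).
    """
    num_checks, num_bits = len(H), len(H[0])
    # One factor-graph edge per "one" in H: (parity check node, equality node).
    edges = [(c, b) for c in range(num_checks) for b in range(num_bits) if H[c][b]]
    # Equality-to-parity messages start out as the channel LLRs.
    bit_to_check = {(c, b): channel_llr[b] for (c, b) in edges}
    hard = [0 if llr >= 0 else 1 for llr in channel_llr]

    for iteration in range(1, max_iters + 1):
        # Parity check node: sign product and minimum magnitude of the other inputs.
        check_to_bit = {}
        for (c, b) in edges:
            others = [bit_to_check[(c, k)] for k in range(num_bits) if H[c][k] and k != b]
            sign = -1 if sum(v < 0 for v in others) % 2 else 1
            check_to_bit[(c, b)] = sign * min(abs(v) for v in others)

        # Equality constraint node: channel LLR plus all incoming check messages.
        total = [
            channel_llr[b] + sum(check_to_bit[(c, b)] for c in range(num_checks) if H[c][b])
            for b in range(num_bits)
        ]
        hard = [0 if t >= 0 else 1 for t in total]
        if parity_checks_satisfied(H, hard):
            return hard, iteration  # all parity checks are even: stop early

        # The outgoing message on each edge excludes what arrived on that edge.
        for (c, b) in edges:
            bit_to_check[(c, b)] = total[b] - check_to_bit[(c, b)]

    return hard, max_iters


# Hypothetical toy parity check matrix: 3 parity checks, codeword length 7.
H_TOY = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

# All-zero codeword sent; bit 0 is received weakly in error (negative LLR).
received_llr = [-1, 4, 4, 4, 4, 4, 4]
decoded, iterations = min_sum_decode(H_TOY, received_llr)
print(decoded, iterations)  # [0, 0, 0, 0, 0, 0, 0] 1
```

In this example the single weak error is outvoted by the parity check that covers it, so all three checks are satisfied after one iteration and the loop exits early.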
An important feature of an implementation of an iterative decoder is the number of iterations that the iterative decoder can perform on an input codeword in a given amount of time, as this relates to the bit error rate (BER) of the iterative decoder. A higher number of iterations results in better BER performance. Therefore, to maximize the performance of a single iterative decoder, it is generally preferred to have the iterative decoder perform a higher number of iterations. Parallel processing of the data is the clear way to increase the number of iterations in such decoders. For example, parallel LDPC architectures have a notable speed advantage over their serial counterparts, at the price of a higher number of processing cells and greater complexity. At the same time, as discussed above, the performance of an iterative decoder is limited by the resolution of the probability messages. An iterative decoder with low-bit resolution requires more iterations to attempt to deliver the same performance as an iterative decoder with high-bit resolution, and will usually hit an error floor that prevents it from achieving the same BER; that is, beyond a certain point, an increase in signal-to-noise ratio (SNR) yields only an insignificant BER improvement. FIG. 2 shows simulations demonstrating the error floor that results from finite message resolution. However, passing messages at a high rate and a high resolution between the nodes is very expensive in terms of area and power.
Thus, passing messages between the nodes in an LDPC iterative decoder having a parallel architecture requires a substantial amount of power, as the number of wires in such an implementation is extremely high (~200K for the 10GBASE-T code) and the average length of the wires is very long (estimated to be 6-8 mm for the 10GBASE-T code in, e.g., 90 nm technology). In other architectures of an LDPC iterative decoder, such as serial or parallel-serial message processing architectures, the overall power consumption is higher even though the length and size of the wiring are smaller. This is mainly due to the replacement of the wire matrix of connections with registers that need to run at a speed that is a multiple of that required by the parallel architecture. Although the amount of logic is reduced, designing the logic for higher speed translates to higher power due to the use of high-drive logic cells, more pipelining, more buffering, and higher-leakage devices.
One important feature of an iterative decoder is that, as the decoder settles to the final corrected word, the average rate at which the messages change drops significantly. This feature significantly reduces the power consumption in the connection matrix of a parallel LDPC iterative decoder, because each message has its own dedicated N-bit wire in each direction and digital power is consumed only for those few bits that toggle as the message value settles. Conventional serial and parallel-serial constructions do not enjoy this benefit, as the same set of N-bit wires is multiplexed to carry bits from different nodes.
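The difference in switching activity can be illustrated with a simplified toggle count. The sketch below is a rough model under stated assumptions: dynamic power is taken to be proportional to the number of bit transitions on a wire, the message values are hypothetical 8-bit numbers invented for this example, and the function names are illustrative.

```python
def bit_toggles(a, b):
    """Number of wire bits that change when value a is replaced by value b."""
    return bin(a ^ b).count("1")


def dedicated_toggles(history):
    """Toggles when each message has its own N-bit wire.

    history[t][k] is the value of message k at iteration t, so a wire only
    switches when that same message changes between iterations.
    """
    return sum(
        bit_toggles(prev, cur)
        for prev_iter, cur_iter in zip(history, history[1:])
        for prev, cur in zip(prev_iter, cur_iter)
    )


def multiplexed_toggles(history):
    """Toggles when one shared N-bit bus carries all messages in sequence."""
    stream = [value for iteration in history for value in iteration]
    return sum(bit_toggles(prev, cur) for prev, cur in zip(stream, stream[1:]))


# Hypothetical values of three messages over three iterations; the decoder has
# almost settled, so only one bit of one message changes after iteration 0.
history = [
    [0x12, 0xE0, 0x7F],
    [0x13, 0xE0, 0x7F],
    [0x13, 0xE0, 0x7F],
]

print(dedicated_toggles(history))    # 1
print(multiplexed_toggles(history))  # 43
```

On the dedicated wires only the single changed bit switches, whereas the shared bus keeps switching between unrelated message values even though none of the messages is changing.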