Coding is widely used to decrease bit and packet error probabilities in the transmission of digital information. In many applications, convolutional codes are used for this purpose. A convolutional code is defined by a state machine with state transitions determined by the input bits to be encoded. The two most common decoding algorithms for convolutional codes or codes based on convolutional codes are the Viterbi algorithm and the maximum a-posteriori probability (MAP) algorithm.
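By way of illustration only, the state-machine view of a convolutional code can be sketched as follows. The rate-1/2, constraint-length-3 code with generator polynomials 7 and 5 (octal) is an assumed textbook example, and the function name `conv_encode` and its parameters are hypothetical; none of them is taken from the cited references.

```python
def conv_encode(bits, generators=(0b111, 0b101), constraint_length=3):
    """Encode a bit sequence with a feedforward convolutional code.

    Assumed example: rate 1/2, constraint length 3, generators (7, 5) octal.
    """
    state = 0  # the previous (constraint_length - 1) input bits
    encoded = []
    for bit in bits:
        # The input bit and the current state together select the transition.
        register = (bit << (constraint_length - 1)) | state
        for generator in generators:
            # Each output bit is the parity of the register taps
            # selected by one generator polynomial.
            encoded.append(bin(register & generator).count("1") % 2)
        # Next state: shift the register, dropping the oldest bit.
        state = register >> 1
    return encoded


if __name__ == "__main__":
    print(conv_encode([1, 0, 1, 1]))
```

Each input bit produces two output bits here, and the next state depends only on the current state and the input bit, which is what allows trellis-based decoders such as the Viterbi and MAP algorithms to be applied.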
For applications requiring greater error correcting capability, turbo codes or LDPC codes are often used. The decoding of concatenated codes, such as turbo codes, requires the result of decoding one code trellis as input for decoding the next code trellis, and so on for subsequent code trellises. The iterative nature of the decoding algorithms for turbo codes presents significant implementation challenges.
Low Density Parity Check (LDPC) codes are promising next-generation error-correcting codes, which can outperform turbo codes in terms of coding gain. A number of techniques have been proposed for parallel or serial decoding of LDPC codes. A. J. Blanksby and C. J. Howland, “A 690-mW 1-Gb/s 1024-b, Rate-1/2 Low-Density Parity-Check Decoder,” IEEE J. Solid-State Circuits, Vol. 37, 404-412 (March 2002) and U.S. Pat. No. 6,539,367, to Blanksby et al., for example, describe block-parallel decoding of LDPC codes.
A serial implementation of an LDPC decoder may share some hardware and thus require fewer computational elements and a less complicated routing mechanism than a parallel implementation. The serial method, however, typically requires additional memory, since all the messages along the edges of a bipartite graph must be saved, and its throughput is limited, since a serial implementation takes a larger number of clock cycles to decode a block of code. Generally, a bipartite graph is a graphical representation of the parity check matrix used by the LDPC code. The memory requirement is typically proportional to twice the number of edges in the code, and the number of cycles is typically proportional to the sum of the number of bit nodes and the number of check nodes in the code.
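The relationship between the parity check matrix, the bipartite graph, and the serial cost figures stated above can be illustrated with a minimal sketch. The matrix `H_EXAMPLE` is a small, assumed example (it is not low density and is not taken from any cited reference), the function names are hypothetical, and the proportionality constants are assumed to be one purely for illustration.

```python
# Assumed toy parity check matrix: rows are check nodes, columns are bit nodes.
H_EXAMPLE = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]


def bipartite_graph(H):
    """Return the adjacency lists of the bipartite graph described by H."""
    num_bits = len(H[0])
    # An edge joins check node i and bit node j wherever H[i][j] is nonzero.
    check_to_bits = [[j for j, h in enumerate(row) if h] for row in H]
    bit_to_checks = [
        [i for i, row in enumerate(H) if row[j]] for j in range(num_bits)
    ]
    return check_to_bits, bit_to_checks


def serial_cost_estimate(H):
    """Estimate serial decoder cost, assuming unit proportionality constants."""
    check_to_bits, bit_to_checks = bipartite_graph(H)
    num_edges = sum(len(bits) for bits in check_to_bits)
    return {
        "edges": num_edges,
        # One message in each direction is saved per edge.
        "messages_stored": 2 * num_edges,
        # One cycle per bit node plus one cycle per check node.
        "cycles_per_iteration": len(bit_to_checks) + len(check_to_bits),
    }


if __name__ == "__main__":
    print(serial_cost_estimate(H_EXAMPLE))
```

For this toy matrix the graph has 12 edges, so a conventional serial decoder would save 24 messages and spend on the order of 10 cycles per iteration under the stated assumptions.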
Zining Wu and Gregory Burd, for example, have proposed a serial architecture for a communications system where an LDPC decoder is concatenated with a channel detector. The disclosed architecture reduces the memory requirement relative to a conventional serial implementation. See, Z. Wu and G. Burd, “Equation Based LDPC Decoder for Intersymbol Interference Channels,” IEEE Proc. of Acoustics, Speech, and Signal Processing 2005 (ICASSP '05) Vol. 5, 757-60 (Mar. 18-23, 2005). D. E. Hocevar proposes a serial architecture for stand-alone LDPC decoders that also reduces the memory requirement relative to a conventional serial implementation. See, D. E. Hocevar, “LDPC Code Construction With Flexible Hardware Implementation,” IEEE Int'l Conf. on Comm. (ICC), Anchorage, AK, 2708-2712 (May 2003).
A need still exists for a serial LDPC decoding architecture, both for stand-alone and concatenated LDPC decoders, that exhibits further improvements in the memory requirement or number of cycles per iteration (or both) and in bit error rate performance, relative to both parallel and serial implementations.