This invention relates to a method and apparatus for efficient iterative decoding of a turbo-encoded data channel.
In many applications, data (e.g., on a communication channel or in the read channel of a data storage device) is encoded using an outer code. Examples of such codes include turbo codes, Low-Density Parity Check (LDPC) codes, and convolutional codes. Encoded data from an outer code is often interleaved before being transmitted over a data channel. In that data channel, the signal might become corrupted with noise or defects. On the receiver side, the received signal can be decoded using an iterative decoding principle often referred to as turbo decoding. A feature of turbo decoding is that decoding includes multiple stages, each of which includes a detection/equalization block and an outer decoder block. For example, the signal from a detector front end, which may be a finite impulse response (FIR) filter, may be processed by a soft detector such as a Soft Output Viterbi Algorithm (SOVA) detector.
The soft detector provides two outputs: (i) hard decisions for the detected signal and (ii) extrinsic log-likelihood ratios (LLRs), which indicate new reliability information generated by the detector for each of the hard decisions. These LLRs are then de-interleaved and passed to the outer decoder for further processing. The outer soft decoder then provides its own hard decisions as well as new extrinsic LLRs. These LLRs from the outer decoder are then interleaved and passed to the soft detector as a priori LLRs. In the next round of iterative decoding, the soft detector generates new extrinsic LLRs, taking both the a priori LLRs and the FIR signal as inputs. For the first iteration, the a priori LLR inputs to the soft detector are all set to zero. This iterative decoding between the soft detector and the outer decoder is carried out until a maximum number of iterations is reached, or a valid code word is found.
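The iteration just described can be illustrated with a minimal sketch. It is not the implementation of any particular detector or decoder. The function names, the permutation convention, and the sign convention (a positive LLR favors bit 0) are assumptions made for illustration only. The detector here is a hypothetical memoryless stand-in rather than a SOVA, and the outer decoder is a hypothetical single-parity-check decoder using a min-sum rule, chosen only because it fits in a few lines.

```python
def interleave(values, perm):
    """Reorder so that output[i] = values[perm[i]]."""
    return [values[p] for p in perm]


def deinterleave(values, perm):
    """Undo interleave(): output[perm[i]] = values[i]."""
    out = [0.0] * len(values)
    for i, p in enumerate(perm):
        out[p] = values[i]
    return out


def turbo_decode(fir_samples, detector, decoder, perm, max_iters=10):
    """Exchange extrinsic LLRs between detector and outer decoder.

    Returns (hard_decisions, iterations_used, converged).
    """
    # First iteration: all a priori LLRs into the detector are zero.
    a_priori = [0.0] * len(fir_samples)
    hard = None
    for iteration in range(1, max_iters + 1):
        # Detector sees the FIR samples plus the current a priori LLRs.
        _det_hard, det_ext = detector(fir_samples, a_priori)
        # Detector extrinsic LLRs are de-interleaved for the outer decoder.
        hard, dec_ext, valid = decoder(deinterleave(det_ext, perm))
        if valid:  # stop early once a valid code word is found
            return hard, iteration, True
        # Decoder extrinsic LLRs are interleaved and fed back as a priori.
        a_priori = interleave(dec_ext, perm)
    return hard, max_iters, False


def memoryless_detector(samples, a_priori):
    """Hypothetical stand-in for a soft detector (not a SOVA).

    Assumes BPSK (bit 0 -> +1, bit 1 -> -1) with unit noise variance, so the
    extrinsic LLR is 2 * sample and does not depend on the a priori input.
    """
    ext = [2.0 * y for y in samples]
    hard = [0 if e + a >= 0 else 1 for e, a in zip(ext, a_priori)]
    return hard, ext


def spc_decoder(llrs):
    """Hypothetical outer decoder: one even-parity check, min-sum rule."""
    ext = []
    for i in range(len(llrs)):
        others = llrs[:i] + llrs[i + 1:]
        sign = 1.0
        for v in others:
            if v < 0:
                sign = -sign
        ext.append(sign * min(abs(v) for v in others))
    hard = [0 if a + e >= 0 else 1 for a, e in zip(llrs, ext)]
    return hard, ext, sum(hard) % 2 == 0
```

With these stand-ins, a call such as `turbo_decode(samples, memoryless_detector, spc_decoder, perm)` returns as soon as the parity check is satisfied. Because the stand-in detector ignores its a priori input, later iterations add nothing here; a detector that models channel memory, such as a SOVA, is what makes the feedback useful.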
In a known arrangement, each sector of a disk drive may be decoded using three SOVAs, each of which generates two LLRs during each clock cycle. This results in six LLRs per clock cycle. These are interleaved using a global interleaver. However, a global interleaver has high complexity, with high memory and computation requirements. The entire sector must be interleaved before any data can be returned, which requires buffering the entire sector and therefore increases latency. Moreover, a separate global de-interleaver, with similar memory requirements, is also needed.
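The buffering cost of a global interleaver can be seen in a short sketch. The class and helper names, the pseudo-random permutation, and the toy sector length used in the usage note are assumptions for illustration; only the rate of six LLRs per clock cycle comes from the arrangement described above.

```python
import random

LLRS_PER_CYCLE = 6  # three SOVAs, two LLRs each, per clock cycle


def make_permutation(n, seed=0):
    """Hypothetical global permutation over a whole sector of n LLRs."""
    perm = list(range(n))
    random.Random(seed).shuffle(perm)
    return perm


def invert(perm):
    """Inverse permutation, as used by the matching de-interleaver."""
    inv = [0] * len(perm)
    for i, p in enumerate(perm):
        inv[p] = i
    return inv


class GlobalInterleaver:
    """Buffers a full sector before any interleaved output is available."""

    def __init__(self, perm):
        self.perm = perm
        self.buffer = []  # grows to the full sector length

    def push(self, llrs):
        """Accept one clock cycle of LLRs.

        Returns None until the whole sector has arrived, then returns the
        interleaved sector, where output[i] = input[perm[i]].
        """
        self.buffer.extend(llrs)
        if len(self.buffer) < len(self.perm):
            return None  # any output position may depend on a late input
        return [self.buffer[p] for p in self.perm]
```

For a toy sector of 24 LLRs, the first three calls to `push` return `None` and only the fourth returns data. Constructing a second `GlobalInterleaver` with `invert(perm)` gives the de-interleaver, which needs a sector-sized buffer of its own.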