This invention relates generally to digital transmission systems and, in particular, to Maximum a Posteriori Probability (MAP) and similar decoders that use a sliding window technique.
Of most interest to this invention are digital transmission systems where a received signal from a channel is a sequence of waveforms whose correlation extends well beyond T, the signaling period. There can be many reasons for this correlation, such as coding, intersymbol interference, or correlated fading. It is known that an optimum receiver in such situations cannot make its decisions on a symbol-by-symbol basis, so that deciding on a particular information symbol Uk involves processing a portion of the received signal Td seconds long, with Td greater than T. The decision rule can be either optimum with respect to a sequence of symbols, or with respect to the individual symbol, Uk.
Reference can be had to a journal article entitled “Soft-Output Decoding Algorithms in Iterative Decoding of Turbo Codes”, S. Benedetto et al., TDA Progress Report 42-124, pp. 63-87, Feb. 15, 1996 (incorporated by reference herein), wherein the authors report that the most widely applied algorithm for the first kind of decision rule is known as the Viterbi algorithm. In an optimum formulation of the Viterbi algorithm it is required to wait for decisions until an entire sequence has been received. In practical implementations, this drawback is overcome by anticipating decisions (single or in batches) on a regular basis with a fixed delay, D. A choice of D five to six times the memory of the received data is widely recognized as presenting a good compromise between performance, complexity, and decision delay.
Optimum symbol decision algorithms base their decisions on the maximum a posteriori probability (MAP). This class of algorithms has been known for several decades, although they are generally much less popular than the Viterbi algorithm, and are not commonly applied in practical systems. One reason for this is that the MAP algorithms typically yield performance in terms of symbol error probability that is only slightly superior to the Viterbi algorithm, yet they present a much higher complexity. Recently, however, interest in these algorithms has increased in connection with the problem of decoding concatenated coding schemes.
Concatenated coding schemes (a class which may include product codes, multilevel codes, generalized concatenated codes, and serial and parallel concatenated codes) were first proposed by Forney (G. D. Forney, Jr., “Concatenated Codes”, Cambridge, Mass., MIT, 1966) as a means of achieving large coding gains by combining two or more relatively simple “constituent” codes. The resulting concatenated coding scheme is a powerful code endowed with a structure that facilitates decoding, such as by using so-called stage decoding or iterated stage decoding.
In order to function properly these decoding algorithms cannot limit themselves to simply passing the symbols decoded by an inner decoder to an outer decoder. Instead they need to exchange some kind of soft information. As was proved by Forney, an optimum output of the inner decoder should be in the form of the sequence of the probability distributions over the inner code alphabet conditioned on the received signal, the a posteriori probability (APP) distribution. There have been several attempts to achieve, or at least approach, this goal. Some of these approaches are based on modifications of the Viterbi algorithm so as to obtain at the decoder output some reliability information, in addition to the “hard”-decoded symbols. This has led to the concept of an “augmented-output,” or the list-decoding Viterbi algorithm, and to the soft-output Viterbi algorithm (SOVA). However, these approaches can be suboptimal, as they are unable to supply the required APP. A different approach employs the original symbol MAP decoding algorithms with the aim of simplifying them to a form suitable for implementation. Of particular interest are soft-decoding algorithms as a main building block of an iterative stage decoding of parallel concatenated codes. These algorithms are particularly interesting since the advent of the so-called turbo codes (C. Berrou et al., “Near Shannon Limit Error-Correcting Coding and Decoding: Turbo Codes”, Proceedings of ICC '93, Geneva, pp. 1064-1070, May 1993). These codes are parallel concatenated convolutional codes (PCCC) whose encoder is formed by two (or more) constituent systematic encoders joined through an interleaver. In this approach the input information bits feed the first encoder and, after having been interleaved by the interleaver, enter the second encoder. The codeword of the parallel concatenated code includes the input bits to the first encoder followed by the parity check bits of both encoders.
Generalizations to more than one interleaver and two parallel concatenated convolutional codes are possible.
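By way of illustration only, the parallel concatenation described above can be sketched in Python as follows. The sketch assumes a four-state recursive systematic constituent encoder with feedback polynomial 1 + D + D^2 and feedforward polynomial 1 + D^2, and an interleaver supplied as an explicit permutation. These choices, and the names rsc_parity and pccc_encode, are hypothetical and are not taken from the cited references.

```python
def rsc_parity(bits):
    """Parity stream of an assumed 4-state recursive systematic encoder.

    Assumed polynomials (illustrative only): feedback 1 + D + D^2,
    feedforward 1 + D^2. The encoder starts in the all-zero state.
    """
    s1 = s2 = 0  # shift-register contents (delays D and D^2)
    parity = []
    for u in bits:
        a = u ^ s1 ^ s2        # feedback bit entering the register
        parity.append(a ^ s2)  # feedforward taps 1 and D^2
        s1, s2 = a, s1         # shift the register
    return parity


def pccc_encode(bits, perm):
    """Input bits, then parity of encoder 1, then parity of encoder 2.

    `perm` is the interleaver: position i of the interleaved sequence
    takes the input bit at index perm[i].
    """
    interleaved = [bits[i] for i in perm]
    return list(bits) + rsc_parity(bits) + rsc_parity(interleaved)


info = [1, 0, 0, 0]
codeword = pccc_encode(info, perm=[3, 1, 0, 2])
```

Without puncturing, this arrangement yields three output bits per information bit, that is, a rate 1/3 code.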
The suboptimal iterative decoder is modular and contains a number of equal component blocks formed by concatenating soft decoders of the constituent codes (CC), separated by the interleavers used at the encoder side. By increasing the number of decoding modules and, thus, the number of decoding iterations, bit-error probabilities as low as 10^-5 at Eb/No=0.0 dB for rate 1/4 PCCC have been shown by simulation. A version of turbo codes employing two eight-state convolutional codes as constituent codes, an interleaver of 32×32 bits, and an iterative decoder performing two and one-half iterations with a complexity of the order of five times the maximum-likelihood (ML) Viterbi decoding of each constituent code is presently available on a chip, yielding a measured bit-error probability of 0.9×10^-6 at Eb/No=3 dB. Upper bounds to the ML bit-error probability of PCCCs have been proposed. As a by-product, it has been shown by simulation that iterative decoding can approach quite closely the ML performance. The iterative decoding algorithm is a simplification whose regular steps and limited complexity seem quite suitable to very large-scale integration (VLSI) implementation. Simplified versions of the algorithm have been proposed and analyzed in the context of a block decoding strategy that requires trellis termination after each block of bits. A similar simplification has been used for a hardware implementation of the MAP algorithm.
In an article entitled “Multiple Output Sliding Window Decoding Algorithm for Turbo Codes”, Proc. CISS 1996, Baltimore Md., pp. 515-520 (incorporated by reference herein), J. Yuan et al. report on a previous sliding window MAP decoding algorithm (SW-BCJR, where BCJR indicates the original authors), in which the decoder operates on a fixed memory span and APP outputs are forced with a given delay. This approach is said, however, to dramatically increase the required computation relative to the non-windowed MAP algorithm.
More particularly, these authors show that the non-windowed MAP algorithm requires the entire sequence to be received before the decoding process can start. In order to avoid this delay, and to reduce memory requirements, the sliding window approach operates with a fixed memory span D, which is small compared to the frame or block size. While saving both memory and delay, it is also shown that the sliding window MAP algorithm must, at every time index k, perform a backward recursion over the entire sliding window length in order to obtain the probability of the state Si of the encoder conditioned on future received symbols (Bk(Si)), whereas the non-windowed MAP algorithm performs this recursion only once over the entire frame length. In this regard a term ák(Si) is the probability of the state Si of the encoder at time k conditioned on the past and current received symbols, αk(Si,uk) is the a posteriori transition probability from state Si at time k and encoder data input uk, and Γk(x) is the joint probability of the channel input symbol xk and channel output symbol yk.
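The difference in computational load can be made concrete with the following Python sketch, which is offered only as an illustration. It assumes a toy two-state trellis whose branch weights are arbitrary positive numbers, and it counts backward recursion steps for the non-windowed and the sliding window cases. The function names and the numeric values are hypothetical and do not come from the cited article.

```python
def backward_step(gamma, beta_next):
    """One backward step: beta(s) = sum over s' of gamma[s][s'] * beta_next[s']."""
    beta = [sum(g * b for g, b in zip(row, beta_next)) for row in gamma]
    total = sum(beta)
    return [b / total for b in beta]  # normalize so the values sum to 1


def full_frame_betas(gammas, beta_end):
    """Non-windowed case: a single backward pass over the whole frame."""
    betas, beta, steps = [], beta_end, 0
    for gamma in reversed(gammas):
        beta = backward_step(gamma, beta)
        betas.append(beta)
        steps += 1
    return betas[::-1], steps


def sliding_window_betas(gammas, d):
    """Sliding window case: at every k, restart d stages ahead of k
    from equally likely states and recurse back to k."""
    n_states = len(gammas[0])
    betas, steps = [], 0
    for k in range(len(gammas)):
        beta = [1.0 / n_states] * n_states
        for gamma in reversed(gammas[k:k + d]):
            beta = backward_step(gamma, beta)
            steps += 1
        betas.append(beta)
    return betas, steps


# Assumed toy data: 12 trellis stages, 2 states, arbitrary branch weights.
gammas = [[[0.6, 0.1], [0.2, 0.5]] for _ in range(12)]
_, full_steps = full_frame_betas(gammas, [0.5, 0.5])
_, window_steps = sliding_window_betas(gammas, d=4)
```

In this toy case the windowed decoder stores far fewer values at a time, but it performs roughly D backward steps for each decoded stage rather than one.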
In order to reduce this computational load, J. Yuan et al. propose a multiple sliding window MAP algorithm wherein the window is advanced multiple stages at each cycle, thereby sliding the window fewer times and reducing the number of computations.
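The saving can be estimated roughly as follows. This Python sketch assumes that one backward recursion of a fixed window length is performed at each window position. The window length actually used in the cited article is not reproduced here, and the names window_positions and backward_steps are hypothetical.

```python
import math


def window_positions(n_symbols, advance):
    """Number of window positions when the window advances `advance` stages per slide."""
    return math.ceil(n_symbols / advance)


def backward_steps(n_symbols, advance, window_len):
    """Rough count, assuming one backward recursion of `window_len`
    steps per window position (an assumption of this sketch)."""
    return window_positions(n_symbols, advance) * window_len


single = backward_steps(1024, advance=1, window_len=32)    # one output per slide
multiple = backward_steps(1024, advance=8, window_len=32)  # eight outputs per slide
```

Under these assumptions, advancing the window eight stages at a time reduces the count of backward steps by a factor of eight.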
While overcoming certain of the deficiencies of the non-windowed and windowed MAP algorithms, the multiple output sliding window approach of J. Yuan et al. still does not provide an optimum solution to the efficient decoding of turbo and other similar coded data.
It is a first object and advantage of this invention to provide an improved windowed MAP decoding technique that overcomes the foregoing and other problems.
It is another object and advantage of this invention to provide a simplified sliding windowed MAP decoder hardware architecture that overcomes the foregoing and other problems.
The foregoing and other problems are overcome and the objects of the invention are realized by methods and apparatus in accordance with embodiments of this invention.
A block sliding window data decoder includes a forward recursion calculator and a plurality of backward recursion calculators, a memory, and a symbol probability likelihood calculator. A first backward recursion calculator (B1) and a second backward recursion calculator (B2) are both initialized every D cycles, where D is the sliding window size. With the modified sliding window approach, two backward iteration calculations of depth D are performed every D cycles, during which D information bits are decoded. The first backward recursion calculator, which may also be referred to as a “front runner”, operates every D cycles to perform a backward recursion over the most recently received input signals, with the recursion initialized with all states equally likely. At the end of D recursions the final result of the calculation from the front runner backward recursion calculator is loaded into the second backward recursion calculator as the initial conditions for the second backward recursion calculator. This occurs once every D cycles. However, every value calculated by the second backward recursion calculator is provided to the symbol likelihood probability calculator. The forward recursion calculator is initialized only once during decoding of each block of information and generates the necessary data. The symbol likelihood probability calculator receives inputs from the forward recursion calculator and from the second backward recursion calculator, and from the memory during every cycle.
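The cooperation of the two backward recursion calculators can be illustrated, functionally rather than cycle by cycle, by the following Python sketch. It is a minimal sketch under stated assumptions and not the hardware architecture of the invention. It assumes a toy two-state trellis with arbitrary positive branch weights, it takes B1 to run over the block of D stages that follows the block B2 is processing, it runs B1 and B2 one after the other although the decoder may operate them concurrently, and it starts B2 from equally likely states for the final block, where no later block exists. All function and variable names are hypothetical.

```python
def backward_step(gamma, beta_next):
    """One backward step: beta(s) = sum over s' of gamma[s][s'] * beta_next[s']."""
    beta = [sum(g * b for g, b in zip(row, beta_next)) for row in gamma]
    total = sum(beta)
    return [b / total for b in beta]  # normalize so the values sum to 1


def block_sliding_window_betas(gammas, d):
    """Backward values for every stage, using two depth-d recursions per block."""
    n_states = len(gammas[0])
    blocks = [gammas[i:i + d] for i in range(0, len(gammas), d)]
    betas, steps = [], 0
    for j, block in enumerate(blocks):
        # B1 ("front runner"): starts from equally likely states and runs
        # over the following block; only its final result is kept.
        beta = [1.0 / n_states] * n_states
        if j + 1 < len(blocks):
            for gamma in reversed(blocks[j + 1]):
                beta = backward_step(gamma, beta)
                steps += 1
        # B2: starts from B1's final result; every value it computes is kept.
        block_betas = []
        for gamma in reversed(block):
            beta = backward_step(gamma, beta)
            steps += 1
            block_betas.append(beta)
        betas.extend(reversed(block_betas))
    return betas, steps


# Assumed toy data: 12 trellis stages, 2 states, window size D = 4.
gammas = [[[0.6, 0.1], [0.2, 0.5]] for _ in range(12)]
betas, steps = block_sliding_window_betas(gammas, d=4)
```

In this sketch each block of D decoded stages costs at most two backward recursions of depth D, so the number of backward steps grows as about twice the frame length rather than as D times the frame length.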
The memory of the decoder stores the input signals and is organized as M cells, where M is equal to four in a presently preferred, but not limiting, embodiment of this invention. For each cycle one of the four cells is written while M-1, or three, of the cells are read and their contents provided to the forward recursion calculator, the first and second backward recursion calculators, and to the symbol probability likelihood calculator. During each D cycle period one of the four memory cells is written while the remaining three cells are read. This process continues every D cycles.
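One way to picture the rotation of the memory cells is the following Python sketch. It assumes that the written cell simply advances by one position every D cycle period, and it lists the cells being read from newest to oldest. Which calculator reads which cell is not assigned here, and the name cell_roles is hypothetical.

```python
M = 4  # number of memory cells in the presently preferred embodiment


def cell_roles(period, m=M):
    """Return (cell written, cells read) for a given D cycle period.

    Assumes the written cell advances round-robin; the cells read are
    listed from the most recently written to the oldest.
    """
    write = period % m
    reads = [(period - age) % m for age in range(1, m)]
    return write, reads


schedule = [cell_roles(p) for p in range(6)]
```

Under this assumed rotation every cell is written once in each group of four periods and read during the other three.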
The forward and backward recursion calculators of the decoder each include a normalizer for normalizing the signals processed by these units. The normalizer is simplified as compared to the prior art and is implemented, preferably, with logical AND functions.