1. Technical Field
The embodiments herein generally relate to Viterbi decoders, and, more particularly, to techniques for improving the performance and speed of Viterbi and Trellis Coded Modulation (TCM) decoding for multi-standard support.
2. Description of the Related Art
Typical DTV demodulation schemes require either a Viterbi decoder or a Trellis Coded Modulation (TCM) decoder to implement channel decoding. Both Viterbi decoding and TCM decoding can be performed by implementing the Viterbi decoding algorithm, which may use either a bit serial method or a block based method. The bit serial method has a limitation: decoding each bit requires a trellis trace-back depth of 5*K (where K is the constraint length of the convolutional code) without puncturing.
The number of cycles needed to decode each bit therefore depends directly on the constraint length K. In addition, the computation and the number of cycles required to support the required data rate are extremely high. Compared with the bit serial method, the block based method (in particular, the Sliding Block Decoding algorithm) allows a high degree of concurrent processing of independent blocks (e.g., of input channel symbols) if the operations are pipelined. It also decodes more bits at a time than the bit serial method (e.g., a block of 2*L bits is decoded for every 4*L channel symbols, where L is a maximum of 5*K without depuncturing), which increases the throughput.
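The block sizes quoted above follow directly from the constraint length. The short sketch below works through that arithmetic; the function name `sliding_block_sizes` and the choice K=7 are illustrative assumptions (K=7 is chosen because it gives L=35, the value used for the DVB-T figures in this description), not part of any described embodiment.

```python
def sliding_block_sizes(K, depth_factor=5):
    """Return block sizes for a sliding block decoder (illustrative helper).

    K            -- constraint length of the convolutional code
    depth_factor -- trace-back depth multiplier (5, per the 5*K rule above)
    """
    L = depth_factor * K  # maximum trace-back depth, 5*K
    return {
        "L": L,
        "decoded_bits": 2 * L,     # bits produced per block
        "channel_symbols": 4 * L,  # input symbols consumed per block
    }


# Assumed example: K = 7, which yields L = 35.
sizes = sliding_block_sizes(7)
print(sizes)
```

For K=7 this gives L=35, so each block of 140 channel symbols yields 70 decoded bits.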
Viterbi decoding requires Branch Metric (BM) computation, Path Metric (or State Metric, SM) computation, and trace-back with decoded bit generation. The BM and SM values can be computed using either a radix-2 or a radix-4 architecture. A radix-4 architecture gives a four-fold increase in throughput because the number of trellis iterations covered, the number of states covered per iteration, and the decoding rate are each twice those of radix-2. The radix-4 structure is used to meet the required data rate, since it offers the best gains in decoding rate, even though it requires more hardware than the radix-2 structure.
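The three stages named above (branch metric computation, state metric update, and trace-back) can be illustrated with a minimal radix-2, hard-decision sketch. Everything specific in it is an assumption made for brevity: the small K=3, rate-1/2 code with generator polynomials 7 and 5 (octal), the Hamming-distance branch metric, and the function names `encode` and `viterbi_decode`. It is not the radix-4 structure discussed here, only a compact view of the same data flow.

```python
def parity(x):
    """Return the parity (XOR of all bits) of integer x."""
    return bin(x).count("1") & 1


def encode(bits, polys=(0b111, 0b101), K=3):
    """Rate-1/2 convolutional encoder (assumed example code, K=3)."""
    state = 0  # holds the K-1 most recent input bits
    out = []
    for b in bits:
        reg = (b << (K - 1)) | state
        out.append(tuple(parity(reg & p) for p in polys))
        state = reg >> 1
    return out


def viterbi_decode(symbols, polys=(0b111, 0b101), K=3):
    """Radix-2 hard-decision Viterbi decoder: BM, SM (ACS), trace-back."""
    n_states = 1 << (K - 1)
    INF = float("inf")
    sm = [0] + [INF] * (n_states - 1)  # encoder is assumed to start in state 0
    history = []                        # survivor predecessor per state, per step
    for sym in symbols:
        new_sm = [INF] * n_states
        prev = [0] * n_states
        for s in range(n_states):
            if sm[s] == INF:
                continue  # state not yet reachable
            for b in (0, 1):
                reg = (b << (K - 1)) | s
                expected = tuple(parity(reg & p) for p in polys)
                # Branch metric: Hamming distance to the received symbol.
                bm = sum(e != r for e, r in zip(expected, sym))
                ns = reg >> 1
                cand = sm[s] + bm        # add
                if cand < new_sm[ns]:    # compare and select
                    new_sm[ns] = cand
                    prev[ns] = s
        history.append(prev)
        sm = new_sm
    # Trace-back from the best final state to recover the input bits.
    state = min(range(n_states), key=lambda s: sm[s])
    bits = []
    for prev in reversed(history):
        bits.append(state >> (K - 2))  # newest input bit is the MSB of the state
        state = prev[state]
    return bits[::-1]


message = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]  # final K-1 zeros flush the encoder
received = encode(message)
received[2] = (received[2][0] ^ 1, received[2][1])  # inject one channel bit error
decoded = viterbi_decode(received)
print(decoded == message)
```

In this sketch each trellis step visits every state once and evaluates two incoming candidates per state. A radix-4 structure would instead merge two such steps and compare four candidates per state.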
For a radix-4 architecture with L=35, DVB-T Viterbi decoding requires 210 BM computations, 1120 SM computations, and 280 trace-back computations. However, because only 12*L cycles are available to decode 2*L bits, approximately 24 operations must be completed every 6 cycles to sustain the maximum data rate. This compute requirement can be met only with multiple execution units or with a CPU that runs at a higher speed. However, because the data flow is sequential, even higher clock speeds are insufficient to achieve higher decoding rates.
The best CPU utilization can be achieved only when the Branch Metric computation, Path Metric computation, and trace-back operations are segregated onto three different execution units and their relative operations are pipelined. Conventional DTV channel decoding schemes are neither scalable nor flexible enough to implement both Viterbi decoding and Trellis Coded Modulation (TCM) decoding, and they do not support a multitude of code rates, different encoding polynomials, and puncturing patterns while at the same time supporting high data rates.
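The operation budget above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `cycle_budget` and its parameter names are assumptions, and the operation counts are the ones quoted in the text for L=35.

```python
def cycle_budget(bm_ops, sm_ops, tb_ops, L, cycles_per_L=12, window=6):
    """Return (total operations, total cycles, operations per `window` cycles)."""
    total_ops = bm_ops + sm_ops + tb_ops
    total_cycles = cycles_per_L * L  # 12*L cycles available per 2*L decoded bits
    ops_per_window = total_ops * window / total_cycles
    return total_ops, total_cycles, ops_per_window


total_ops, total_cycles, ops_per_window = cycle_budget(210, 1120, 280, L=35)
print(total_ops, total_cycles, ops_per_window)  # 1610 420 23.0
```

The result is 1610 operations in 420 cycles, or 23 operations per 6 cycles, which is in line with the figure of approximately 24 quoted above (that is, roughly four operations per cycle).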