Convolutional codes, used to encode data for transmission or storage, are employed in high-performance digital communication systems, such as cellular telephone systems, and in magnetic mass-storage systems with high areal data density, such as hard-disk drives. Recovery of the encoded data after transmission or from a magnetic disk system falls to a type of decoder that implements a form of the Viterbi algorithm (VA), referred to generally as a Viterbi decoder (VD). The Viterbi decoder is a complex device; without high-density very large-scale integrated circuit (VLSI) technology to implement it, modern digital cellular telephones, as well as battery-operated computers and mp3 players with hard-disk drives, would not be practical. For a detailed description of the Viterbi algorithm, see “The Viterbi Algorithm,” by G. Forney, Jr., Proceedings of the IEEE, vol. 61, no. 3, pp. 268-278, March 1973, hereby incorporated by reference in its entirety.
VDs are also widely used to detect data in the presence of intersymbol interference (ISI), such as in mass-storage systems and bandwidth-limited high-speed communication channels. See “Maximum-Likelihood Sequence Estimation of Digital Sequences in the Presence of Intersymbol Interference,” by G. Forney, Jr., IEEE Transactions on Information Theory, Vol. IT-18, No. 3, pp. 363-378, May 1972, hereby incorporated by reference in its entirety.
There are two basic forms of the VA: 1) trace-back (TB) and 2) register-exchange (RE). Both algorithms produce “decoded” data based on a probabilistic estimation of received data symbols, using a priori knowledge of the convolutional code used to encode the data. The TB version, which retraces the data estimates back in time to find the most likely sequence (path) of encoding for a given received data symbol, allows for small, power-efficient VD implementations at the cost of low speed. The RE version (referred to herein as the RE architecture), which processes a predetermined number of data estimates in parallel such that the estimates merge to a most likely value, is the fastest, lowest-latency VD implementation. The RE architecture uses commonly clocked flip-flop registers instead of area-efficient random-access memories. Concomitant with the low latency is high power dissipation, because all the registers are clocked simultaneously on each clock cycle. It is understood that, for purposes here, the foregoing descriptions of the various forms of the VA and the implementations thereof are greatly simplified. For a more detailed description of the TB and RE forms of the VA, see “A 500-Mb/s Soft-Output Viterbi Decoder,” by Yeo et al., IEEE Journal of Solid-State Circuits, Vol. 38, No. 7, pp. 1234-1241, July 2003, and “High-Speed VLSI Architectures for Soft-Output Viterbi Decoding,” by O. Joeressen et al., International Conference on Application Specific Array Processors, pp. 373-384, 1992, both of which are hereby incorporated by reference in their entirety.
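By way of illustration only, the register-exchange form described above may be sketched in software as follows. The sketch assumes a rate-1/2, constraint-length-3 convolutional code with generators 7 and 5 (octal) and a hard-decision Hamming branch metric; the code, the metric, and all function and variable names are assumptions chosen for this example and are not taken from the cited references. Each state holds its own survivor register, and on every step every register is overwritten with a copy of a predecessor's register plus one new bit, which mirrors the simultaneous clocking of all registers noted above.

```python
# Illustrative sketch only: a (7,5) rate-1/2 code is assumed for the example.
K = 3                      # constraint length (assumed)
G = (0b111, 0b101)         # generator polynomials 7 and 5, octal (assumed)
N_STATES = 1 << (K - 1)    # 4 encoder states


def parity(x):
    """Return the XOR of all bits of x."""
    return bin(x).count("1") & 1


def branch(state, bit):
    """Return (next_state, output_symbols) for one input bit."""
    reg = (bit << (K - 1)) | state          # newest bit sits in the MSB
    out = tuple(parity(reg & g) for g in G)
    return reg >> 1, out                    # drop the oldest bit


def encode(bits):
    """Convolutionally encode a list of bits, starting from state 0."""
    state, coded = 0, []
    for b in bits:
        state, out = branch(state, b)
        coded.extend(out)
    return coded


def decode_register_exchange(received, n_bits):
    """Hard-decision Viterbi decoding using per-state survivor registers."""
    inf = float("inf")
    metrics = [0] + [inf] * (N_STATES - 1)      # encoder starts in state 0
    registers = [[] for _ in range(N_STATES)]   # one survivor per state
    for t in range(n_bits):
        symbol = received[2 * t: 2 * t + 2]
        new_metrics = [inf] * N_STATES
        new_registers = [None] * N_STATES
        for state in range(N_STATES):
            if metrics[state] == inf:
                continue                        # state not yet reachable
            for bit in (0, 1):
                nxt, out = branch(state, bit)
                # Hamming distance between expected and received symbols
                m = metrics[state] + sum(a != b for a, b in zip(out, symbol))
                if m < new_metrics[nxt]:
                    new_metrics[nxt] = m
                    # Register exchange: copy the whole survivor, append bit
                    new_registers[nxt] = registers[state] + [bit]
        metrics, registers = new_metrics, new_registers
    best = min(range(N_STATES), key=lambda s: metrics[s])
    return registers[best]                      # no trace-back pass needed


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 1, 0, 0]          # last two zeros act as a tail
    noisy = encode(message)
    noisy[3] ^= 1                               # inject one channel bit error
    print(decode_register_exchange(noisy, len(message)) == message)
```

In this sketch the decoded sequence is available directly from the winning register as soon as the last symbol is processed. A trace-back implementation would instead store only one decision bit per state per step and would recover the sequence by walking those decisions backward at the end, trading latency for less data movement per step.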
Many low-power applications cannot tolerate the long latency inherent in the TB algorithm. It is therefore desirable to provide a VD that implements the RE algorithm but with lower power dissipation.