In many communication systems, information signals must be transmitted at a very high rate. Among other factors, one of the main limits on the maximum achievable transmission rate is the hardware feasibility of the decoder. In fact, very high-rate decoders are often very complex and expensive. Thus, there is a need for efficient, low-cost decoder implementations.
In modern digital communications, the massive computation required for high-speed signal processing can be provided at low cost by VLSI special-purpose computers, whose architecture is determined by the algorithm to be implemented. A good architecture should also provide a linear-scale solution, i.e. a solution in which the hardware complexity grows linearly with the throughput requirement.
For very high-speed signal processing, the choice of algorithm is of crucial importance. Suitable algorithms have a high degree of parallelism and pipelinability. Moreover, block-oriented algorithms naturally provide a linear-scale solution.
For VLSI implementation of parallel and pipelined algorithms, systolic array processors have been proposed. The two key parameters which define the processing speed of a systolic array are its block pipelining period β and its clock period t_c. The former is defined as the number of time units between the beginnings of two subsequent computation tasks. The latter is the basic time unit of the systolic array and depends on the maximum propagation delay through a chain of gates in the processor. Clearly, t_c also depends on the single gate delay: when different VLSI technologies are considered (for example 2μ CMOS, 1μ CMOS, etc.), t_c is scaled by a factor that depends on the technology.
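The relation between these two parameters and the processing speed can be illustrated with a minimal sketch. It assumes that a new block enters the array every β clock periods, so that consecutive blocks are separated by β times t_c seconds, and that each block carries a fixed number of information bits. The function names, the `bits_per_block` parameter and the technology scale factor are hypothetical and are introduced only for illustration.

```python
def block_throughput_bps(bits_per_block: int, beta: int, t_c_seconds: float) -> float:
    """Information bits per second when a new block starts every `beta` clock periods.

    Assumption: each block carries `bits_per_block` information bits and the
    array accepts one new block every beta * t_c seconds.
    """
    if bits_per_block <= 0 or beta <= 0 or t_c_seconds <= 0:
        raise ValueError("all arguments must be positive")
    return bits_per_block / (beta * t_c_seconds)


def scaled_clock_period(t_c_seconds: float, technology_scale: float) -> float:
    """Clock period after moving to a technology whose gate delay is scaled
    by `technology_scale` (hypothetical factor, e.g. 0.5 for gates twice as fast)."""
    if t_c_seconds <= 0 or technology_scale <= 0:
        raise ValueError("all arguments must be positive")
    return t_c_seconds * technology_scale


if __name__ == "__main__":
    # Illustrative numbers only, not taken from any reported design.
    rate = block_throughput_bps(bits_per_block=100, beta=10, t_c_seconds=100e-9)
    print(f"throughput: {rate / 1e6:.1f} Mbps")
    faster = block_throughput_bps(100, 10, scaled_clock_period(100e-9, 0.5))
    print(f"throughput after scaling t_c by 0.5: {faster / 1e6:.1f} Mbps")
```

Under these assumptions, halving either β or t_c doubles the throughput, which is why both parameters are treated as the key speed figures of the array.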
The systolic implementation of a block-oriented algorithm is known as Staged Decoding. This is a suboptimal general procedure for decoding a class of signal space codes and lattices obtained through the generalized concatenation construction (E. Biglieri, A. Spalvieri: Generalised Concatenation: A Tutorial, in E. Biglieri and M. Luise eds: Coded Modulation and Bandwidth-Efficient Transmission, Elsevier, 1992).
This procedure has already been applied to the implementation of a staged decoder for BCM (Block Coded Modulation) signals (G. Caire, J. Ventura, J. Murphy, S. Y. Kung, "VLSI Systolic Array Implementation of a Staged Decoder for BCM Signals", Proceedings of the IEEE Workshop on VLSI signal processing, Napa Calif., USA, Oct. 28-30, 1992).
This known implementation method, applied to block coded modulation (BCM) schemes, relies on block-level pipelining. The winning codewords are selected by computing the decoding metric for each possible legal codeword in the code family at each stage of the decoder. This implies that the maximum achievable rate is limited by the size of the largest code in the code structure, and therefore decoders for complex code structures are fundamentally very slow. Specifically, a maximum rate of 100 Mbps has been reported, with an associated hardware complexity of 38 Kgates.
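The cost of evaluating a metric for every legal codeword can be illustrated with a minimal sketch of a single decoding stage. It is not the cited implementation: the squared Euclidean metric, the antipodal mapping (bit 0 to +1, bit 1 to -1), the two example component codes and all function names are assumptions chosen only for illustration.

```python
from itertools import product


def exhaustive_decode(received, codewords):
    """Return (winner, metric): the codeword closest to `received` in squared
    Euclidean distance, found by evaluating the metric of every codeword."""
    best, best_metric = None, float("inf")
    for cw in codewords:
        metric = sum((r - c) ** 2 for r, c in zip(received, cw))
        if metric < best_metric:
            best, best_metric = cw, metric
    return best, best_metric


def repetition_code(n):
    """Length-n repetition code in antipodal form: 2 codewords."""
    return [(1,) * n, (-1,) * n]


def single_parity_check_code(n):
    """Length-n single-parity-check code in antipodal form: all words with an
    even number of -1 symbols, i.e. 2**(n - 1) codewords."""
    return [w for w in product((1, -1), repeat=n) if w.count(-1) % 2 == 0]


if __name__ == "__main__":
    received = (0.9, -1.1, 0.2, 1.0)
    for name, code in (("repetition", repetition_code(4)),
                       ("single parity check", single_parity_check_code(4))):
        winner, metric = exhaustive_decode(received, code)
        print(f"{name}: {len(code)} metrics evaluated, winner {winner}")
```

In this sketch the number of metric evaluations per block equals the number of codewords, so the stage with the largest component code dominates the time per block. For a single-parity-check code that number doubles with every added symbol.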