A typical communications system 11 which utilizes convolutional coding for error control is illustrated in FIG. 1. At the transmitting end of the communications system 11, a data stream x (or digital information sequence; boldface text denotes a vector herein), as designated at reference arrow 12, is received and encoded by a convolutional encoding system 13 to form convolutional code words y, designated at reference arrow 14. The convolutional code words y are then transmitted across a communications channel 16, which is generally noisy. The noise corrupts the convolutional code words y, and corrupted vectors r are produced at reference arrow 17 at the receiving end of the communications system 11. At the receiving end, a convolutional decoding system 18 receives the vectors r and generates a convolutionally-decoded data stream x', as indicated at reference arrow 21, which is an estimate of the original data stream x.
Convolutional codes are well known in the art and were first presented in 1955. For a detailed discussion of convolutional coding, see S. B. Wicker, Error Control Systems for Digital Communication and Storage, Chapter 11, Englewood Cliffs: Prentice Hall, 1994, which is a textbook written by one of the inventors herein. In general, convolutional coding for error control involves introducing redundancy into a data stream through the use of memory elements, for example, a linear shift register or like apparatus, in order to determine and minimize error in a data transmission.
A typical convolutional encoding system 13 having a rate of 1/2 is shown in FIG. 2. The rate of this encoding system 13 is established by the fact that the encoding system 13 outputs 2 bits 22a, 22b for every input bit 24. In general, an encoding system 13 with k inputs and n outputs is said to have rate k/n. In FIG. 2, the binary input data stream x=(x.sub.0,x.sub.1,x.sub.2, . . . ) is fed into a shift register 26 having a series of memory elements 28a, 28b. With each successive input 24 to the shift register 26, the values of the memory elements 28a, 28b are tapped off, as indicated by reference arrows in FIG. 2, and added via adders 32, 34 according to a fixed pattern. This operation creates a pair of outputs 22a, 22b, which are the coded data streams y.sup.(0) =(y.sub.0.sup.(0),y.sub.1.sup.(0),y.sub.2.sup.(0), . . . ) and y.sup.(1) =(y.sub.0.sup.(1),y.sub.1.sup.(1),y.sub.2.sup.(1), . . . ). These output data streams 22a, 22b are typically multiplexed together to create a single coded data stream y=(y.sub.0.sup.(0),y.sub.0.sup.(1),y.sub.1.sup.(0),y.sub.1.sup.(1),y.sub.2.sup.(0),y.sub.2.sup.(1), . . . ), which is commonly referred to as a convolutional code word.
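The shift-register encoding described above can be illustrated with a minimal sketch. The text does not state which memory elements feed each adder in FIG. 2, so the tap patterns G0 and G1 below are an assumption (a common textbook choice for a rate-1/2 code with two memory elements), and the function name encode is hypothetical.

```python
from typing import List

# ASSUMPTION: the tap patterns of FIG. 2 are not given in the text.
# Each tuple lists the taps on (current input, memory element 1, memory element 2).
G0 = (1, 1, 1)  # y^(0) = x + m1 + m2 (mod 2)
G1 = (1, 0, 1)  # y^(1) = x + m2      (mod 2)


def encode(x: List[int]) -> List[int]:
    """Return the multiplexed code word y for the input bit sequence x."""
    m1, m2 = 0, 0  # shift register contents, assumed to start at zero
    y: List[int] = []
    for bit in x:
        window = (bit, m1, m2)
        y0 = sum(g * w for g, w in zip(G0, window)) % 2  # first modulo-2 adder
        y1 = sum(g * w for g, w in zip(G1, window)) % 2  # second modulo-2 adder
        y.extend((y0, y1))  # multiplex y^(0) and y^(1) into one stream
        m1, m2 = bit, m1    # shift the register by one position
    return y
```

Under these assumed taps, every input bit yields two output bits, so an input of four bits produces an eight-bit code word, consistent with a rate of 1/2.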
A trellis diagram 35 representative of the convolutional encoding system 13 is illustrated in FIG. 3. The concept of the trellis diagram is well known in the art and is utilized in order to analyze system state changes. In essence, a trellis diagram is a state diagram which explicitly shows the passage of time. The memory elements 28a, 28b in the encoding system 13 of FIG. 2 can exhibit, at any given time, one of the states S.sub.0, S.sub.1, S.sub.2, S.sub.3, or 00, 10, 01, 11, respectively. The trellis diagram 35 of FIG. 3 shows all possible system states at the nodes of the diagram and shows all possible state transitions by the branches of the trellis diagram 35. The branches of the trellis diagram 35 are labelled with the output bits corresponding to the associated state transitions.
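One section of such a trellis can be tabulated programmatically. The following is a minimal sketch, not a reproduction of FIG. 3: the adder taps are an assumption (y.sup.(0) = x + m1 + m2 and y.sup.(1) = x + m2, modulo 2), the state names follow the 00, 10, 01, 11 ordering given above read as (first memory element, second memory element), and the function name trellis_section is hypothetical.

```python
from typing import Dict, Tuple

# State naming from the text: S0=00, S1=10, S2=01, S3=11, read as (m1, m2).
STATES = {(0, 0): "S0", (1, 0): "S1", (0, 1): "S2", (1, 1): "S3"}


def trellis_section() -> Dict[Tuple[str, int], Tuple[str, str]]:
    """Map (current state, input bit) to (next state, branch label)."""
    table: Dict[Tuple[str, int], Tuple[str, str]] = {}
    for (m1, m2), name in STATES.items():
        for x in (0, 1):
            # ASSUMPTION: tap patterns are not specified in the text.
            y0 = (x + m1 + m2) % 2
            y1 = (x + m2) % 2
            next_state = STATES[(x, m1)]  # input shifts into the first element
            table[(name, x)] = (next_state, f"{y0}{y1}")
    return table
```

Each of the four states has two outgoing branches (one per input bit), so the table holds eight entries, and each state is entered by exactly two branches. The points where two branches enter the same node are where a trellis-based decoder must choose between competing paths.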
In general, encoding systems are simple to construct and implement. However, the difficulty in practicing convolutional coding in the communications system 11, as set forth in FIG. 1, involves the design of the convolutional decoding system 18. In recent years, efforts have been made to design convolutional decoding systems 18 which implement an artificial neural network (ANN). ANNs have been successfully applied in the fields of signal processing and pattern recognition. Although the general decoding problem can be viewed as a form of pattern recognition, it possesses some distinctive features that substantially complicate the design process. First, the information to be decoded in a single code word is far more extensive than that involved with the recognition of a pattern. For example, a typical pattern recognition problem might require the identification of one of eight patterns in a 12.times.10 binary image, as is described in R. P. Lippmann, "An Introduction to Computing with Neural Nets", IEEE ASSP Magazine, pages 4-22, April, 1987. This requires the consideration of 8 out of 2.sup.120 possibilities. In contrast, a rate-1/2 convolutional code has 2.sup.60 120-bit code words that must be considered. Second, conventional pattern recognition problems have more arbitrary pattern distributions than decoding problems. In most cases, the code words form a vector space over a finite field. The algebraic properties of practical block codes and convolutional codes may introduce additional structure.
The foregoing features demand consideration in the implementation of ANN convolutional decoding systems 18. One may draw the following conclusions. First, network training is not likely to be a successful design tool. For neural nets to correctly create the decision regions for a large number of code words, a training set of equal extent is required, which results in impractically large training time (assuming that convergence will occur at all), storage space, and number of neurons. It is reported in W. R. Caid and R. W. Means, "Neural network error correcting decoders for block and convolution codes," GLOBECOM '90 IEEE Global Telecommunications Conference and Exhibition, vol. 2, pages 1028-1031, Dec. 1990, that the neural network decoding system with training is limited to very small codes like the Hamming code and convolutional codes with constraint lengths K less than or equal to 3. The constraint length K of a convolutional code is defined as the maximum number of bits in a single output stream that can be affected by any input bit. Second, the algebraic structure of the code words is not efficiently used in a trained ANN convolutional decoding system 18. For these reasons, the design of ANN convolutional decoding systems 18 has been a process of "neuralizing" the existing digital decoding algorithms which have themselves been derived by fully exploiting the algebraic properties of the codes. The resulting ANN convolutional decoding systems 18 have thus been characterized by fixed-weight and training-free networks.
Having recounted the above issues, it should be noted that ANN decoding systems 18 exhibit important advantages over their digital counterparts. One advantage is that the decoding process can be maximally parallelized by an ANN, which greatly increases the decoding system throughput. Another advantage is that neural network convolutional decoding systems 18 can lead to simpler VLSI (Very-Large-Scale Integrated Circuit) realization, because neurons of a given type have identical characteristics, and most internodal connections have weights of either +1 or -1 and tend to run along very regular patterns.
Several ANN decoding systems have been developed for convolutional codes. See, for example, the aforementioned article by W. R. Caid and R. W. Means as well as M. D. Alston and P. M. Chau, "A neural network architecture for the decoding of long constraint length convolutional codes," 1990 International Joint Conference on Neural Networks - IJCNN 90, pages 121-126, June 1990. For convolutional codes, it is well known that the Viterbi algorithm provides maximum likelihood decoding and can be considered optimal. In this regard, see S. Wicker, Error Control Systems for Digital Communication and Storage, Englewood Cliffs: Prentice Hall, 1994, and A. J. Viterbi, "Error Bounds for Convolutional Codes and an Asymptotically Optimum Decoding Algorithm," IEEE Transactions on Information Theory, IT-13, pages 260-269, April 1967.
In a decoding system for implementing the Viterbi algorithm, the decoding system is modelled around a trellis diagram, such as that shown in FIG. 3 for the encoding system 13 of FIG. 2. Each of the nodes in the trellis diagram, which represent states in the diagram, is assigned a number. The number is referred to as the partial path metric of a path which passes through that node. The assignment of numbers to the trellis nodes is routine until the point in the trellis where more than one path enters a node. In this case, the node label chosen is the "best" (largest or smallest) partial path metric among the metrics for all of the entering paths. The path with the best metric is the survivor, while the other entering paths are nonsurvivors. If the best metric is shared by more than one path, then the best path is chosen at random. The Viterbi algorithm terminates when all of the nodes in the trellis diagram have been labeled and their entering survivors determined. Then, the paths which survived are traced back from the last node in the trellis diagram to the first node. Because each node has only one entering survivor, the trace-back operation always yields a unique path, and the unique path yields the best approximation of the input data stream.
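The labeling, survivor selection, and trace-back steps described above can be illustrated with a minimal hard-decision sketch in conventional software form. Several points are assumptions rather than details from the text: the adder taps of the encoder (y.sup.(0) = x + m1 + m2 and y.sup.(1) = x + m2, modulo 2), the use of Hamming distance as the partial path metric (so that "best" means smallest), the all-zero starting state, and the deterministic tie-break, which keeps the first path found instead of choosing at random. The names branch, encode, and viterbi_decode are hypothetical.

```python
from typing import Dict, List, Tuple

State = Tuple[int, int]  # contents of the two memory elements (m1, m2)
STATES: List[State] = [(0, 0), (1, 0), (0, 1), (1, 1)]


def branch(state: State, x: int) -> Tuple[State, Tuple[int, int]]:
    """Return (next state, output bit pair) for one trellis branch."""
    m1, m2 = state
    # ASSUMPTION: tap patterns are not specified in the text.
    return (x, m1), ((x + m1 + m2) % 2, (x + m2) % 2)


def encode(x: List[int]) -> List[int]:
    """Encode x with the assumed rate-1/2 code, starting from state 00."""
    state, y = (0, 0), []
    for bit in x:
        state, out = branch(state, bit)
        y.extend(out)
    return y


def viterbi_decode(r: List[int]) -> List[int]:
    """Estimate the input bits from received bits r (Hamming metric)."""
    inf = float("inf")
    # Partial path metric per node; only state 00 is reachable at the start.
    metric: Dict[State, float] = {s: (0 if s == (0, 0) else inf) for s in STATES}
    survivors: List[Dict[State, Tuple[State, int]]] = []
    for t in range(0, len(r) - 1, 2):
        received = (r[t], r[t + 1])
        new_metric: Dict[State, float] = {s: inf for s in STATES}
        choice: Dict[State, Tuple[State, int]] = {}
        for s in STATES:
            if metric[s] == inf:
                continue  # node not reachable yet
            for x in (0, 1):
                nxt, out = branch(s, x)
                m = metric[s] + (out[0] != received[0]) + (out[1] != received[1])
                # Keep the entering path with the smallest metric (the survivor).
                # Ties keep the first path found instead of a random choice.
                if m < new_metric[nxt]:
                    new_metric[nxt] = m
                    choice[nxt] = (s, x)
        metric = new_metric
        survivors.append(choice)
    # Trace back from the best final node to the first node.
    state = min(STATES, key=lambda s: metric[s])
    bits: List[int] = []
    for choice in reversed(survivors):
        state, x = choice[state]
        bits.append(x)
    return bits[::-1]
```

As a usage example under the same assumptions, encoding the sequence 1, 0, 1, 1, 0, 0 with encode, inverting one of the resulting bits to imitate channel noise, and passing the result to viterbi_decode exercises all three steps. Because each node stores exactly one entering survivor, the trace-back loop follows a single path, as described above.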
However, the ANN convolutional decoding systems 18 discussed in the literature to date are suboptimal. For example, the ANN convolutional decoding system 18 in the aforementioned Caid et al. article uses training to establish the coefficients for the Viterbi algorithm, and is thus limited to codes with very small constraint lengths (K.ltoreq.3). In any case, it is outperformed by comparable digital implementations of the Viterbi algorithm. Moreover, the decoding system in the aforementioned Alston et al. article can deal with codes of long constraint length K, but also provides suboptimal performance. The Alston et al. decoding system allows for several possible decision rules, and the best one has not yet been found. Other work related to the Viterbi decoding system includes the neural network developed in Y. Wu et al., "Dynamic adaption of quantization threshold for soft-decision Viterbi decoding with a reinforcement learning neural network," Journal of VLSI Signal Processing, pages 77-84, Volume 6, No. 1, June 1993. This network dynamically adjusts the soft quantization threshold, thus acting as a "coprocessor" for a conventional, digital implementation of a Viterbi decoding system.
There are many VLSI implementations of digital Viterbi decoding systems. As an example, see P. J. Black and T. Meng, "Hybrid survivor path architectures for Viterbi decoders," Proceedings of ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing, pages 433-436, Volume 1, April, 1993. The complexity of the digital designs governs the decoder throughput and chip size. However, these digital designs are inherently serial and complex, thus resulting in suboptimal performance and requiring an undesirable amount of chip space.