1. Field of the Invention
The present invention relates to error control coding systems and methods and, more particularly, to the generation and decoding of pseudo-random code word messages, such as would be useful for increasing the reliability of transferring information over noisy communication channels.
2. Description of the Related Art
Thermodynamics teaches that statistical fluctuations in the energy of media and systems cause a positive probability of errors being introduced in the process of transferring information from one point to another. Information must be assembled, codified by symbols, encoded for transmission, converted into a physical signal, transported across a distance, received, decoded, and passed on to a user. At any stage in an electronic version of a communication process, transient errors may occur due to temperature, cosmic rays, communication medium noise, receiver noise figure, and faulty apparatus or components. Noise in received signals, rather than apparatus failure, constitutes the primary factor limiting the performance of modern communication systems. Noise prevents the receiver (demodulator) from distinguishing one message (waveform) from another, thereby introducing uncertainties about the true nature of the information being received. It is therefore a problem to communicate information reliably in the presence of masking noise.
When the transmission of message information is accompanied by errors, the receiver of the information must perform operations that correct the errors so that an uncorrupted message can be presented to the user. For the correction process to be successful, not all possible received messages can be treated as valid; otherwise the receiver would be incapable of distinguishing a valid message M1 from a corrupted valid message M2 wherein the corruption of M2 produced an apparent message identical to M1. Thus, the ability of a receiver to correct for errors introduced during transmission implies that redundant information must be added to valid messages so that the receiver can detect and possibly correct at least some errors. The added redundancy effectively lowers the rate at which useful message information can be transferred, so it is advantageous for error correction methods to be as efficient as possible without sacrificing error correcting ability.
Two primary methods currently exist for minimizing the impact of errors in electronic communication. In the first method, the energy per unit of information transferred is increased to the point where the raw signal-to-noise ratio exceeds the minimum value required for a tolerable rate of error production. In the second method, error-control encoding techniques are used to add extra units of information to the message so that a receiver can detect and correct errors that occur up to some maximum rate. Cost savings through the use of low-energy error control methods can be significant relative to the first method, even though added complexity in the transmitter and receiver apparatus is required. The second method is the most widely used for the transfer and storage of digital information.
It is well known to encode the intelligence of information and transmit the encoded information to the receiver. Encoding normally adds redundant information to the message so that the receiver can detect, and in many cases correct, faulty received information. Simple error detectors and correctors are conventionally available for correcting minor errors. However, where the encoded message accumulates more than a few errors, such equipment is ineffective at correcting them.
In recent decades, much of the art in the field of error-control coding has addressed two essential problems: finding classes of code words that yield good error-control performance at various lengths, and designing fast and cost-effective circuitry to carry out the electronic control of errors.
In practice, a message to be transmitted by electronic means is encoded into a potentially long sequence of information symbols called bits by an error-control circuit, and then into a transmitted modulated waveform. A demodulation of this waveform at the receiver provides a sequence of bits to the error-control circuitry which uses the code word bits to make the best estimate of the message that was originally encoded.
One of the most widely used methods of accurately identifying valid messages is to associate each of the possible information units in a message with a unique code word designed to facilitate the detection and correction of message transmission errors. In binary error-control coding, the error-control circuitry accepts information bits at a rate Rs, adds the desired level of redundancy, and then generates code word bits at a higher rate Rc. In a block encoder, successive k-bit blocks of binary information are converted into successive n-bit blocks where $n \geq k$. The n-bit block is referred to interchangeably as the code word, code block, or block code word. When encoding using a convolution code, the encoder accepts information as a continuous stream of bits, and generates a continuous stream of output code bits at a higher rate. The number of information bits used by the encoder in generating each output code bit is called the constraint length of the code.
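The block-encoding step described above can be illustrated with a minimal sketch. It assumes a single even-parity bit per block (so n = k + 1), which is only one simple choice of redundancy; the function name `encode_blocks` and the sample bits are hypothetical and not taken from the source.

```python
def encode_blocks(bits, k):
    """Split `bits` into k-bit blocks and append one even-parity bit
    to each block, giving n = k + 1 code bits per block."""
    if k <= 0 or len(bits) % k != 0:
        raise ValueError("bit count must be a positive multiple of k")
    out = []
    for start in range(0, len(bits), k):
        block = bits[start:start + k]
        out.extend(block)
        out.append(sum(block) % 2)  # parity bit makes the count of ones even
    return out


info = [1, 0, 1, 1, 0, 0, 1, 0]      # two 4-bit information blocks
code = encode_blocks(info, k=4)      # two 5-bit code blocks
print(code)
print(len(info) / len(code))         # fraction of transmitted bits carrying information (k/n)
```

Because every information block grows from k to n bits, the encoder must emit code bits faster than it accepts information bits, which is the relationship between Rs and Rc noted in the text.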
Examples of often-employed block codes include parity check codes, product codes, binary repetition codes, and binary Hamming codes. Most of the successful block codes are cyclic codes, such as the Bose-Chaudhuri-Hocquenghem codes, because their well-defined algebraic structure makes practical the construction of low-cost encoders and decoders using straightforward electronic means. However, all of these codes and all convolution codes suffer from an inability to correct errors when the error rate is very high in relation to the message transmission rate. Good codes and circuitry exist for controlling small numbers of errors per code word received, but none of these conventional approaches has solved the problem of detecting and correcting errors when the probability of an error in a code word bit position is above a few percent. It is therefore a problem in the prior art to provide a method for correcting high rates of communication errors in a practical manner.
The Channel Coding Theorem, first proven by Shannon, states that every channel of communication has a channel capacity C, and that for any information transfer rate R < C there exist code words of block length n that can be transferred at rate R such that the probability of incorrectly interpreting a message, P(E), is bounded by
$$P(E) \leq 2^{-n E_b(R)},$$
where $E_b(R)$ is positive and is determined by the physical and noise properties of the channel. This theorem implies that for any symbol transmission rate less than C, it is possible to reduce the probability of miscorrecting the errors in a noisy message to any degree required, even if the error rate is very high. In practice the symbol transmission rate is held fixed while the length of the encoded message (code word) is increased. The lower error rate is thus offset by the need to add more and more redundant symbols to the basic message to provide the information needed to correct errors, but there is no reason in principle that prevents the correction of arbitrarily high error rates. This result is valid for both fixed-length block codes and fixed-constraint-length convolution codes.
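As a purely illustrative calculation of how the bound tightens with block length, assume a hypothetical exponent value of 0.1 at the chosen rate R (this number is not from the source and is chosen only to make the arithmetic concrete):

```latex
E_b(R) = 0.1:\qquad
n = 100 \;\Rightarrow\; P(E) \le 2^{-10} \approx 9.8 \times 10^{-4},
\qquad
n = 200 \;\Rightarrow\; P(E) \le 2^{-20} \approx 9.5 \times 10^{-7}.
```

Under this assumption, doubling the code word length squares the bound on the error probability, which is why long code words are attractive for noisy channels.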
It is important to observe that very noisy channels require the use of very long code words, even for very simple messages. For example, even though each of the 26 letters of the English alphabet may be represented by a unique sequence of 5 binary bits (zeros and ones), successful communication of a sequence of such characters over a noisy channel may require the encoding, transmission, reception and decoding of code words of tens or even hundreds of bits in length per character transferred.
Unfortunately, no general theory exists which specifies the construction of code words for very noisy communication channels. Moreover, as the length of code words increases, the burden and complexity of the encoder and decoder circuitry also increase at least proportionately. It is in general quite difficult to construct efficient encoders and decoders of long code words using conventional methods, even if the expected error rates are small.
However, Shannon's main theorem of information theory proves that it is possible to signal reliably through the use of random encodings of message symbols. Consider the encoding of information using randomly selected binary code words of n bits in length. There are $2^n$ such code words that can be selected, but if it is desired to guard against a large number of simultaneous errors in the communication process, then the admissible code words must be chosen to be very dissimilar so that they can be easily distinguished even when masked by noise. For binary code words, a measure of this similarity is called the "Hamming distance." The Hamming distance between any pair of code words of the same length is simply the total number of bit positions in which the code words are dissimilar. For example, the two code words (1011100) and (0011110) have a Hamming distance between them of 2.
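The Hamming distance defined above can be computed with a few lines of code. The following is a minimal sketch; the function name `hamming_distance` is chosen here for illustration, and code words are represented as strings of '0' and '1' characters.

```python
def hamming_distance(a, b):
    """Count the bit positions in which two equal-length code words differ."""
    if len(a) != len(b):
        raise ValueError("code words must have the same length")
    return sum(x != y for x, y in zip(a, b))


# The example pair from the text differs in the first and sixth positions.
print(hamming_distance("1011100", "0011110"))  # 2
```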
Code words consisting of long, random strings of zeros and ones may be associated with valid messages. Because such bit strings are random and thus nearly uncorrelated (orthogonal), these special code words give the receiver the best chance of recovering from high levels of added noise.
One method of selecting a set of random code words for error-control encoding is to select each word by random coin tossing. That is, each bit in each code word is obtained by flipping an unbiased coin until all code words have been generated. The set of code words is then shared between the sender and receiver so that both hold an identical table that associates a unique message symbol with a unique code word.
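The coin-tossing construction can be sketched as follows. This is an illustration only: the function name `random_codebook`, the use of a seeded pseudo-random generator in place of a physical coin, and the redrawing of any duplicate word (to keep the symbol-to-word table unique) are all assumptions made here rather than details from the source.

```python
import random


def random_codebook(symbols, n, seed=None):
    """Assign each message symbol a distinct n-bit code word whose bits
    are drawn as fair coin flips (0 or 1 with equal probability)."""
    symbols = list(symbols)
    if len(symbols) > 2 ** n:
        raise ValueError("not enough distinct n-bit words for the symbols")
    rng = random.Random(seed)  # a shared seed stands in for sharing the table
    used = set()
    book = {}
    for sym in symbols:
        word = tuple(rng.randint(0, 1) for _ in range(n))
        while word in used:    # assumption: redraw duplicates so words are unique
            word = tuple(rng.randint(0, 1) for _ in range(n))
        used.add(word)
        book[sym] = word
    return book


# Sender and receiver building the table from the same seed obtain identical tables.
sender_table = random_codebook("ABCD", n=16, seed=1)
receiver_table = random_codebook("ABCD", n=16, seed=1)
print(sender_table == receiver_table)  # True
```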
We can imagine the code words as points in an n-dimensional space of possible code words. We assume that the minimum Hamming distance between any pair of code words in the code word set is at least $D_{\min}$. By the law of large numbers there is, for a sufficiently large code word size n, an arbitrarily small probability that the received message will lie at or beyond a Hamming distance $(D_{\min} - 1)/2$ from one of the uncorrupted code words as long as
$$D_{\min} \geq 2ne + 1,$$
where e is the probability of a random error at any bit position in the received message. Thus, if the receiver-decoder assumes that the code word that is most similar to the received code word is in fact the actual code word sent, then up to ne errors in the received word can be corrected with high probability.
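The decoding rule just described, choosing the code word most similar to the received word, can be sketched as an exhaustive nearest-neighbor search over the shared table. The three-word table below is a hypothetical toy example (minimum distance 4, so a single bit error is correctable), and the names `decode_nearest` and `codebook` are introduced here for illustration.

```python
def hamming_distance(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def decode_nearest(received, codebook):
    """Return the message symbol whose code word is closest, in Hamming
    distance, to the received word."""
    return min(codebook, key=lambda sym: hamming_distance(codebook[sym], received))


# Hypothetical table; pairwise distances are 4, 5 and 5.
codebook = {
    "A": (0, 0, 0, 0, 0, 0, 0),
    "B": (1, 1, 1, 1, 0, 0, 0),
    "C": (0, 0, 1, 1, 1, 1, 1),
}

received = (1, 1, 0, 1, 0, 0, 0)           # code word for "B" with its third bit flipped
print(decode_nearest(received, codebook))  # B
```

The search compares the received word against every stored code word, so its cost grows with the table size and the word length, which is the burden discussed in the following paragraph.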
The method of using randomly selected code words to encode information to be sent over noisy channels seems attractive and easily implementable because such code words are simple to generate. However, the use of random code words places a severe burden on the communication equipment, since the random nature of the code words does not admit any mathematical representation of the code words simpler than tabulation. Unlike all code word types used in practice, random codes have no inherent pattern that can be used to simplify the encoding and decoding apparatus; no formula or pattern exists for computing the code words for each message. Moreover, the decoding apparatus must be relatively complex, since the code lengths required to correct errors in very noisy channels must be very large. Although only a small number of valid code words may exist in the code word set, the number of possible messages that could be received is equal to the total number of possible code words that exist for codes of a given length. Thus the transmitter must select by table lookup one of $2^S$ code words to encode one of $2^S$ message symbols, and the receiver must provide an apparatus for decoding one of $2^n$ possible received message patterns. Due to these factors the use of random encoding has been abandoned or left unused by the art in favor of highly structured code word sets that can be more easily generated and decoded.
Attempts to understand the functioning of the human brain have led to various "neural network" models in which large numbers of neurons are interconnected with the inputs to one neuron including the outputs of many other neurons. These models roughly presume each neuron exists in one of two states (quiescent and firing) with the neuron's state determined by the states of the input connected neurons (if enough connected neurons are firing, then the original neuron should be in the firing state); and the thrust of the models is to perform computations such as pattern recognition with the neural networks.
J. Hopfield, Neural Networks and Physical Systems with Emergent Collective Computational Abilities, 79 Proc. Natl. Acad. Sci. USA 2554 (1982) describes a neural network model with N neurons, each of which has the value -1 or 1 (corresponding to the quiescent and firing states), so the state of the network is an N-component vector $V = [V_1, V_2, \ldots, V_N]$ of -1's and 1's which depends upon time. The neuron interconnections are described by a matrix $T_{i,j}$ defining the influence of the $j$th neuron on the $i$th neuron. The state of the network evolves in time as follows: for each $i$, the $i$th neuron has a fixed threshold $\theta_i$ and readjusts its state $V_i$ randomly in time by setting $V_i$ equal to -1 or 1 depending on whether
$$\sum_{j} T_{i,j} V_j - \theta_i$$
is negative or positive. All neurons have the same average rate of readjustment, and the readjustments define a dynamical flow in state space.
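The readjustment rule above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the weighted input to neuron i is taken to be the sum over j of T[i][j]*V[j] minus the threshold, a tie (exactly zero) leaves the state unchanged, and random update times are modeled by picking a random neuron at each step. The function names and the two-neuron example are hypothetical.

```python
import random


def update_neuron(V, T, theta, i):
    """Readjust neuron i in place: +1 if its weighted input minus its
    threshold is positive, -1 if negative, unchanged on a tie (assumption)."""
    field = sum(T[i][j] * V[j] for j in range(len(V))) - theta[i]
    if field > 0:
        V[i] = 1
    elif field < 0:
        V[i] = -1


def run(V, T, theta, steps, seed=0):
    """Apply `steps` readjustments, each to a randomly chosen neuron."""
    rng = random.Random(seed)
    V = list(V)
    for _ in range(steps):
        update_neuron(V, T, theta, rng.randrange(len(V)))
    return V


# Two mutually excitatory neurons: the stable states are those where they agree.
T = [[0, 1],
     [1, 0]]
theta = [0, 0]
print(run([1, -1], T, theta, steps=10))
```

Which of the two agreeing states is reached depends on which neuron happens to be readjusted first.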
With the assumption that $T_{i,j}$ is symmetric, the potential function
$$-\frac{1}{2} \sum_{i} \sum_{j} T_{i,j} V_i V_j + \sum_{i} \theta_i V_i$$
can be used to show that the flow of the network is to local minima of the potential function. Further, with a given set of s uncorrelated N-component binary (-1, 1) vectors, $U^1, U^2, \ldots, U^s$, a $T_{i,j}$ can be defined by
$$T_{i,j} = \sum_{k=1}^{s} U_i^k U_j^k$$
and the corresponding network with the thresholds $\theta_j$ set equal to 0 has these $U^k$ as the fixed points of the flow and thus stable states of the network. Such a network can act as a content-addressable memory as follows: the memories to be stored in the network are used to construct the $U^k$ and hence $T_{i,j}$, so the stored memories are fixed points of the flow. Then a given partial memory is input by using it to define the initial state of the network, and the state will usually flow to the closest fixed point (stable state) $U^k$, which is then the memory recalled upon input of the partial memory. This correlation-based recall is what is used for decoding messages, with the stable states corresponding to valid messages.
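The content-addressable recall described above can be sketched as follows, assuming the interconnection matrix is built as a sum of outer products of the stored vectors and all thresholds are zero. For reproducibility the sketch sweeps the neurons in a fixed order rather than at random times, and leaves a neuron unchanged when its input is exactly zero; both are simplifying assumptions, and the names `store` and `recall` and the two stored patterns are hypothetical.

```python
def store(patterns):
    """Build T[i][j] as the sum over stored vectors U of U[i] * U[j]."""
    n = len(patterns[0])
    T = [[0] * n for _ in range(n)]
    for U in patterns:
        for i in range(n):
            for j in range(n):
                T[i][j] += U[i] * U[j]
    return T


def recall(V, T, sweeps=5):
    """Repeatedly readjust every neuron (zero thresholds) and return the
    state reached after the given number of sweeps."""
    V = list(V)
    n = len(V)
    for _ in range(sweeps):
        for i in range(n):  # fixed order here; the model in the text updates at random
            field = sum(T[i][j] * V[j] for j in range(n))
            if field > 0:
                V[i] = 1
            elif field < 0:
                V[i] = -1
    return V


# Two uncorrelated (orthogonal) stored memories of N = 8 components.
U1 = [1, 1, 1, 1, -1, -1, -1, -1]
U2 = [1, -1, 1, -1, 1, -1, 1, -1]
T = store([U1, U2])

probe = [-1, 1, 1, 1, -1, -1, -1, -1]  # U1 with its first component corrupted
print(recall(probe, T) == U1)          # True
```

Here the corrupted probe plays the role of a received code word with one bit error, and the stored vector it flows to plays the role of the decoded valid message.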
Further analysis and modified network models appear in, for example, J. Hopfield et al, Computing with Neural Circuits: A Model, 233 Science 625 (1986) and J. Hopfield, Neurons with Graded Response Have Collective Computational Properties like Those of Two-State Neurons, 81 Proc. Natl. Acad. Sci. USA 3088 (1984). FIG. 1 shows a simple neural network made from standard electronic components.
D. Ackley et al, A Learning Algorithm for Boltzmann Machines, 9 Cognitive Science 147 (1985) describe neural networks with additional adjustment mechanisms for the neurons which analogize thermal fluctuations; this permits escape from local minima of the potential function. However, this disrupts the flow to fixed points for memory recall of the Hopfield type neural networks.