This invention relates to information processing systems, and in particular, systems involving the transmission or storage of digital information via some imperfect medium which can introduce errors (i.e. gained or lost "bits") into the information and means for protecting the information system against the adverse effects of such errors.
Very often the transmission or storage of a digital (e.g. binary) sequence introduces the possibility of the message being altered by random interferences arising in the medium which cause one or more bits comprising the binary data of the message to be changed from "1" to "0" or vice versa: telephone lines can be subject to error impulses and interference from other messages; magnetic tapes and discs can have minor imperfections; generally any processing by circuits with tubes, transistors, diodes, or elements subject to electrical failure has a possibility of introducing such errors.
To check for errors, data can be echoed back to the sender by the receiver for verification before being acted upon. This technique is effective but adds greatly to the system overhead. Moreover, under particularly bad communications conditions it may be virtually impossible to transmit an error-free message at all, so that verification never succeeds.
One of the key results of early work in information theory was that it is theoretically possible, by means of coding which adds carefully designed redundancy to the message, to protect information against such errors. The original message is encoded into a longer message from which a decoder can recover the original message, even if the longer message has been partially mutilated.
For example, each one could be replaced by three ones and each zero by three zeros. In such a scheme if the message to be sent were 1010, it would be encoded and transmitted as 111000111000. Such a system could correct one error in each encoded digit. If a 000 sequence picked up a bit in transmission and appeared to the receiver as 100, 010, or 001, it would be closer to 000 than 111 so that the digit 0 could still be validly interpreted and the error corrected by the decoder at the receiving end. It can be seen that two errors in the same digit would be fatal. A 000 received as 110, 011, or 101 would appear most likely to have been, or closer to, 111 than 000 and be misinterpreted as a 1 rather than a 0.
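The three-fold repetition scheme just described can be sketched in a few lines of Python. This is only an illustration; the function names encode_repetition and decode_repetition are hypothetical and do not appear in the original text. The decoder takes a majority vote within each group of three received bits.

```python
def encode_repetition(bits: str, n: int = 3) -> str:
    """Replace every message bit by n copies of itself."""
    return "".join(b * n for b in bits)


def decode_repetition(received: str, n: int = 3) -> str:
    """Majority-vote each group of n received bits back to one message bit."""
    decoded = []
    for i in range(0, len(received), n):
        block = received[i:i + n]
        decoded.append("1" if block.count("1") > n // 2 else "0")
    return "".join(decoded)


print(encode_repetition("1010"))          # 111000111000
print(decode_repetition("111100111000"))  # one error in the second digit: 1010
print(decode_repetition("111110111000"))  # two errors in the second digit: 1110
```

A single error per group of three is corrected, while two errors in the same group turn the 0 into a 1, as noted above.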
Continuing in the same manner, each one or zero could be transmitted as a larger number of ones or zeros, say ten, twenty, or more. This is shown diagrammatically in FIG. 1. A four digit message word 10 is mapped into the codeword 12 which is then transmitted in a time domain represented by the arrow 14. Sources of error, however, quite often occur in "bursts" lasting for a finite period of time as represented by the cross-hatched area 16. Since all the data representing each digit is grouped together, such a scheme is highly vulnerable to burst errors as can be seen.
FIG. 2 diagrammatically shows a typical solution to the burst error problem. That is, the codeword 12' is made long and the data for each digit is scattered throughout the length of the codeword 12' so that a single burst error can affect only a portion of the data for any individual digit, leaving enough data for each digit to permit satisfactory error correction in the received and decoded message.
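The scattering idea can be illustrated with a simple block interleaver applied to a three-fold repetition code. This is a sketch under assumed details: the helper names, the burst position, and the burst length of three bits are all hypothetical and chosen only to show the effect. The same burst that destroys a digit in the grouped codeword is corrected once the copies of each digit are spread apart.

```python
def interleave(codeword: str, n: int) -> str:
    """Send the 1st copy of every digit, then the 2nd copy of every digit, etc."""
    return "".join(codeword[j::n] for j in range(n))


def deinterleave(stream: str, n: int) -> str:
    """Undo interleave(): regroup the n copies of each digit side by side."""
    k = len(stream) // n  # number of message digits
    return "".join(stream[i::k] for i in range(k))


def flip_burst(stream: str, start: int, length: int) -> str:
    """Invert `length` contiguous bits beginning at index `start`."""
    flipped = {"0": "1", "1": "0"}
    return "".join(
        flipped[c] if start <= i < start + length else c
        for i, c in enumerate(stream)
    )


def majority_decode(received: str, n: int) -> str:
    """Majority-vote each group of n bits back to one message bit."""
    groups = (received[i:i + n] for i in range(0, len(received), n))
    return "".join("1" if g.count("1") > n // 2 else "0" for g in groups)


codeword = "111000111000"  # message 1010, each digit repeated three times

# Grouped transmission: the burst wipes out most of one digit.
print(majority_decode(flip_burst(codeword, 4, 3), 3))  # 1110 (wrong)

# Interleaved transmission: the burst touches one copy of three different digits.
damaged = flip_burst(interleave(codeword, 3), 4, 3)
print(majority_decode(deinterleave(damaged, 3), 3))    # 1010 (correct)
```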
Although tremendous attention has been devoted to the problem of designing systems to carry out this theoretically possible protective function, all attempts to date have shortcomings which limit their effectiveness in practical applications, particularly for relatively long codes. The shortcomings arise in different ways, and thus it is possible through systems design to trade off one weakness against another, designing a system to avoid one limitation by allowing another.
To understand the advance made by the present invention, it is necessary to examine more closely the nature of the difficulties ordinarily encountered. For purposes of discussion, the problem will be treated as one of protecting messages generated by a message source, transmitted across a noisy channel, and received at a message sink. As will be recognized by those skilled in the art, the same conceptual framework is applicable to problems of information storage and to some forms of computation.
In protected message transmission, the original message is encoded at the sending end by an encoder, the encoded message is then transmitted, and finally the received message is decoded by a decoder to produce the original message at the receiving end.
An encoder can be thought of as a device which implements a function which maps a sequence of message digits into a longer sequence of channel digits, the encoded message. Because the channel sequence is longer, it is possible to find functions which can map any two distinct messages into very different channel sequences. With sufficiently different sequences, several errors can occur, changing one channel digit into another, without causing the received sequence to resemble a different encoded message more closely than it does the one actually sent. Thus if every received sequence is assumed to be a mutilated version of the correct encoded message it resembles most closely, the message will be protected against some error patterns.
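The notion of "very different" channel sequences is commonly measured by the Hamming distance, the number of positions in which two sequences differ. The sketch below uses a small hypothetical codebook (two message bits mapped to five channel bits) that is not taken from the original text. It computes the smallest distance between any two encoded messages and decodes a received sequence to the encoded message it resembles most closely.

```python
from itertools import combinations

# Hypothetical codebook: 2 message bits -> 5 channel bits.
CODEBOOK = {
    "00": "00000",
    "01": "01011",
    "10": "10101",
    "11": "11110",
}


def hamming_distance(a: str, b: str) -> int:
    """Number of positions in which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def minimum_distance(codebook: dict) -> int:
    """Smallest distance between any two distinct encoded messages."""
    return min(
        hamming_distance(a, b) for a, b in combinations(codebook.values(), 2)
    )


def nearest_message(received: str, codebook: dict) -> str:
    """Message whose encoded form is closest to the received sequence."""
    return min(codebook, key=lambda m: hamming_distance(codebook[m], received))


print(minimum_distance(CODEBOOK))          # 3
print(nearest_message("01111", CODEBOOK))  # 01 (01011 with one bit changed)
```

With a minimum distance of 3, any single channel error leaves the received sequence closer to the encoded message actually sent than to any other, so it is corrected.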
The encoder determines the quality of the code: a good encoder guarantees that the encoded messages are all very different.
A decoder implements a function mapping the long sequence of received symbols back into sequences of message digits. The best decoder will map each possible received sequence into the message sequence which is the most probable cause of that received sequence, taking the characteristics of the possible errors into account.
The fundamental obstacle to the design and use of the "best" encoders and decoders is complexity. First, it is impossible to find the "best" code for anything but very short length codes. There are simply too many possible codes, and even the fastest existing computers cannot exhaustively compare all the codes to determine the best one. Second, even if the best codes could be found, they might still be impractically complex just to use.
In the past, it has been typical to design the code first and then a means for implementing it. The result of this approach is a vast array of implementation techniques which impose far less than optimal space and time limitations on the using equipment. As far as actual implementation is concerned, table look-up, which can be used to implement any encoder and decoder functions, requires a table of a size which is exponentially related to the code length. Thus, it is totally impractical to use table look-up for any really useful code. Consequently, encoder and decoder mappings are normally carried out by a computational process. With a computational process, the difficulty is that good encoders and decoders can require so many computations that they are too costly and too slow to be useful. In virtually all existing coding schemes, the decoder complexity is the limiting factor. The largest length code which can be practically used is determined by the acceptable computational load for the decoder.
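The exponential growth of a look-up table can be made concrete with a short calculation. The sketch assumes a binary channel, so that a decoder table needs one entry for every possible received sequence of the given length; the function name and the sample lengths are illustrative only.

```python
def decoder_table_entries(code_length: int) -> int:
    """Entries needed to list every possible received binary sequence."""
    return 2 ** code_length


for n in (7, 32, 100):
    print(n, decoder_table_entries(n))
```

A length-7 code needs only 128 entries, but a length-100 code would need 2**100 entries, which is roughly 1.27 x 10**30.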
To minimize the complexity problem, compromises are possible. The encoding and decoding can be simplified by using a simpler, but less effective, code. Without changing the code, it is often possible to simplify the decoder by allowing the decoding to fail; that is, by accepting a finite probability that a received sequence will be incorrectly decoded or not decoded at all. Other systems operate effectively against only certain types of errors, such as the burst errors described above, where errors are most likely to occur in contiguous symbols. This general strategy is one of designing systems based on a model of the channel which is simpler than the actual channel.
An important case of the latter strategy typically employed is the discarding of channel reliability information. For example, on a binary channel where zeroes and ones are transmitted as two pre-assigned voltage levels, the receiver often has available more than just a zero or a one.
This is shown diagrammatically in FIG. 3. A "1" is represented by voltage level V.sub.1 and a "0" is represented by voltage level V.sub.0. If a pure 0 and 1 are transmitted, as represented by the two voltage levels 18 and 20, the received voltage levels 18' and 20' may not be exactly V.sub.0 and V.sub.1. If the received voltage level were midway between V.sub.0 and V.sub.1 as represented by dotted line 22, the probability of the unknown signal being a 1 or a 0 would be 0.5. That is, by picking either value there is a 50% chance that the choice is correct. The closer the received voltage is to the V.sub.1 and V.sub.0 voltage, the higher the probability that it is the digit represented by that voltage.
By taking advantage of the foregoing, the demodulation equipment may provide an additional continuous voltage, or the like, which indicates not only whether a zero or a one was apparently transmitted, but also the relative likelihood of each. This likelihood information is very valuable and, theoretically, can be used to reduce substantially the probability of system error. Unfortunately, most existing coding schemes are unable to use it. They operate simply on a "best guess" for each received signal. Once the guess is made there is no changing it. The popularity of a coding technique known as "convolutional coding" rests heavily on the ability of the associated decoding apparatus to make use of this data reliability information. Unfortunately, convolutional coding also produces localized coding as in FIG. 1 which is highly burst error sensitive.
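The value of the likelihood information can be illustrated with the three-fold repetition code. The sketch below rests on several assumptions that are not in the original text: nominal levels of 0.0 and 1.0 volts for V.sub.0 and V.sub.1, additive Gaussian noise (under which the most likely digit is the one whose nominal voltages lie closest to the received voltages in the squared-error sense), and hypothetical sample voltages. The "best guess" decoder quantizes each voltage first and then votes, while the reliability-aware decoder compares the received voltages directly with the two candidates.

```python
V0, V1 = 0.0, 1.0  # assumed nominal voltages for "0" and "1"


def hard_decode(voltages: list) -> int:
    """Quantize each voltage to a best-guess bit, then take a majority vote."""
    threshold = (V0 + V1) / 2
    ones = sum(v > threshold for v in voltages)
    return 1 if ones > len(voltages) / 2 else 0


def soft_decode(voltages: list) -> int:
    """Choose the digit whose nominal voltages are closest in squared error."""
    cost_zero = sum((v - V0) ** 2 for v in voltages)
    cost_one = sum((v - V1) ** 2 for v in voltages)
    return 1 if cost_one < cost_zero else 0


# Two weak, nearly undecided samples and one strong sample of a transmitted 1.
received = [0.45, 0.45, 0.95]
print(hard_decode(received))  # 0: two unreliable guesses outvote the reliable one
print(soft_decode(received))  # 1: the reliable sample carries more weight
```

Here the two samples near the midpoint contribute almost no evidence, so the decoder that keeps the voltages recovers the digit that the hard-decision decoder loses.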
With this background, the objects of the present invention can be summarized as follows: First, to provide a technique and environment for implementing codes having quality comparable to, and in some cases, superior to, that of the best known codes. Second, to provide a technique and environment for the construction of codes of almost any desired length and rate, and particularly powerful for construction of long length codes. Third, to provide a technique and environment for the design and implementation of codes effective against a wide variety of channel models and making near-optimum use of channel reliability information. Finally, to provide both an encoding and decoding process and apparatus structured to be of low computational complexity. It is the prime objective that the unique structure of the decoding process lead to orders of magnitude reduction in the cost of decoding long codes when compared to other coding schemes offering similar levels of error protection.