The present invention can be more easily understood in terms of a simple exemplary system. Consider a telephone conversation in which a person talks into a microphone whose output is digitized and then transmitted to a second person via various telephone lines and switch systems. The loudspeaker at the second person's location receives a sequence of digital values that are then played back to the second person. In general, the received sequence will differ from the transmitted sequence because of errors introduced by the transmission system, digital-to-analog converters, and analog-to-digital converters. For example, noise in the transmission system results in some of the digital values in the transmitted sequence being altered. One goal of a denoising system is to remove as many of these noise errors as possible.
The simple example discussed above is an example of a more general problem that is encountered in a wide range of applications. In general, an input digital signal that consists of a sequence of “symbols” is transmitted through a “communication link” and is received as an output digital signal at the output of the communication link. The output digital signal also consists of a sequence of “symbols”. Each of the symbols is chosen from a predetermined set of symbols, referred to as an alphabet. For simplicity, the output signal is assumed to be written in the same alphabet as the input signal.
In the simplest case, the signals are binary signals in which the alphabet consists of the symbols “0” and “1”. In this case, the input and output signals each consist of a sequence of 0s and 1s. However, other alphabets are commonly used. For example, a digitized signal in which each symbol is represented by an integer between 0 and M-1 is commonly used in broadband data transmission systems for connecting users to the Internet via a digital subscriber loop (DSL).
While the above examples refer to communication systems, it should be noted that this type of noise problem is present in a number of data processing systems. For example, the storage of data files on a magnetic disk drive can be viewed as the transmission of a digital signal through a communication link, namely the disk drive. The input signal is a sequence of symbols, e.g., bytes of data, which are chosen from a predetermined alphabet. In the case of byte data, each symbol has an integer value chosen from the set [0, 1, . . . , 255]. The file retrieved from the disk drive also consists of a sequence of symbols chosen from this set. The input signal symbols are processed by the electronics of the disk drive and stored in the form of localized magnetic fields that are read to generate the output signal. Noise in the digital-to-analog circuitry that converts the symbols to and from the magnetic fields introduces errors into the output signal. In addition, the magnetic fields can be altered during storage by random events that introduce additional errors.
In a co-pending patent application, U.S. Ser. No. 10/688,520, a denoising system is described that utilizes knowledge of the behavior of the channel and a measure of the amount of degradation that occurs if a symbol is converted by the channel to another symbol. This application is hereby incorporated by reference. In this system, the channel behavior is characterized by a matrix whose entries are the probabilities that a symbol having the value A is converted to a symbol having the value B. Here, A and B run over all the values in the alphabet used by the channel. This matrix will be referred to as the channel matrix in the following discussion.
This system also assumes that the channel does not have a memory. That is, the probability that a symbol will be erroneously converted to another symbol is independent of the symbols that preceded or followed that symbol. However, this system may still provide advantages if this assumption is not met.
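By way of illustration only, the channel matrix and the memoryless assumption can be sketched as follows. The two-symbol alphabet, the 10% error probability, and the names CHANNEL_MATRIX and transmit are assumptions chosen for this example; they are not taken from the co-pending application.

```python
import random

# Hypothetical channel matrix for a binary alphabet (assumed values).
# CHANNEL_MATRIX[a][b] is the probability that a transmitted symbol a
# is received as symbol b, so each row sums to 1.
CHANNEL_MATRIX = [
    [0.9, 0.1],  # a transmitted 0 is received as 0 or 1
    [0.1, 0.9],  # a transmitted 1 is received as 0 or 1
]


def transmit(symbols, channel_matrix, rng):
    """Pass symbols through a memoryless channel.

    Each output symbol is drawn using only the matrix row of the
    corresponding input symbol; neighboring symbols have no influence.
    """
    alphabet = list(range(len(channel_matrix)))
    return [rng.choices(alphabet, weights=channel_matrix[s])[0]
            for s in symbols]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a run is repeatable
    sent = [0, 1, 1, 0, 1, 0, 0, 1]
    print(sent)
    print(transmit(sent, CHANNEL_MATRIX, rng))
```

A channel with memory would not fit this sketch, because the row used for a given symbol would then have to depend on the symbols around it.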
This previously described system alters the received signal in a manner that depends on the frequency with which certain “context” sequences of symbols having predetermined lengths are present in the signal. The received signal is altered in a manner that is estimated to reduce the overall signal degradation in the received signal based on estimates that depend on the channel matrix and the degradation caused when a symbol is wrongfully converted to another symbol.
While this previously described system represents a substantial improvement over other systems, it suffers from two problems. First, this system requires that the entire signal be received and analyzed before the denoising algorithm is applied. Hence, the denoising system must have sufficient storage to hold the entire received signal. In addition, the symbols of the corrected sequence are delayed by a time that is at least the time required to receive the entire signal. For a very long signal, the needed storage and delays are often unacceptable.
The second problem relates to choosing the length of the context sequences. In the simplest case, this denoising system generates a table of the frequency with which all sequences of a specified length, K, occur in the signal. The denoising algorithm's success depends to some degree on K. If K is too small, the number of noise errors that can be corrected is less than the optimum number. If K is too large, the statistical accuracy of the frequency data is too low to make accurate denoising decisions. That denoising algorithm attempts to find the best K value in one of two ways: by utilizing an estimate for K based on statistical assumptions that are often, but not always, true, or by recording the frequencies for sequences having a number of different K values and utilizing different K values for different sequences in the received signal.
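The frequency table for sequences of length K can be illustrated with a minimal sketch. The function name sequence_counts and the sample signal are hypothetical, and the table in the co-pending application may be organized differently; the sketch only shows why the choice of K affects the statistics.

```python
from collections import Counter


def sequence_counts(signal, k):
    """Count how often each run of k consecutive symbols occurs.

    Runs are stored as tuples so they can serve as table keys.
    """
    return Counter(tuple(signal[i:i + k])
                   for i in range(len(signal) - k + 1))


if __name__ == "__main__":
    received = [0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1]  # assumed sample data
    for k in (1, 2, 4):
        table = sequence_counts(received, k)
        # Larger k: more distinct sequences, fewer occurrences of each.
        print(k, len(table), max(table.values()))
```

For an alphabet of M symbols there are M^K possible sequences of length K, while a signal of N symbols contains only N-K+1 of them. The count available for any one sequence therefore shrinks rapidly as K grows, which is the loss of statistical accuracy noted above.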