1. Field of the Invention
The present invention relates to the field of signal processing, and in particular to signal processing wherein analog waveforms are encoded as digital signals in such a manner that the digital signals can be transmitted over transmission channels of reduced bandwidth, and that the bandwidth of the signal can be reduced, while still obtaining high subjective signal quality. More particularly, the invention relates to the field of electronic signal processing wherein the times of occurrence of the extrema, i.e., the maxima and minima, of an analog waveform are encoded. The invention further relates to the processes of encoding, transmitting and decoding information that is supplied to the human sensory system or to mechanisms that are used to simulate the human sensory system. For example, the present invention finds application in voice coding, music coding and video coding, and may be implemented in a coder-decoder (CODEC) system. As an example, the present invention may be used to code voice information digitally at rates of 4.8 to 32 kilobits per second. Music information may be coded digitally at rates as low as 16 kilobits per second, and possibly even lower, with the present invention, and video information may be encoded at rates from 56 kilobits per second to 1.544 megabits per second.
2. Description of the Prior Art
During the past several years, a number of different systems for converting analog signals to digital signals have been introduced.
In general, such techniques are directed at maintaining a close approximation to the original analog signal, at least at the point where the analog signal is fed to the digitizing stage. Most of these schemes are based upon one of the following methods.
A first technique is known as pulse code modulation (PCM), wherein samples of the amplitude information of the waveform are taken, usually at regular time intervals. The number of samples per second is determined by the bandwidth of the input signal according to the Nyquist relationship, i.e., the sampling rate must be at least twice the frequency of the highest frequency component of the analog signal to be encoded. The accuracy of this process also depends on the resolution with which the amplitude of each sample is encoded. The higher the accuracy, the more bits of information are needed. In general, amplitudes are quantized by comparing each sample with a multitude of predetermined levels.
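The sampling and quantizing steps described above can be sketched as follows. This is a minimal illustration and not taken from any particular PCM system: the function names, the uniform mid-rise quantizer, and the 1 kHz tone sampled at 8 kHz with 8 bits per sample are all assumptions chosen for the example.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, n_samples, amplitude=1.0):
    """Take regularly spaced amplitude samples of a sine wave (hypothetical test input)."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

def pcm_encode(samples, n_bits, full_scale=1.0):
    """Compare each sample with 2**n_bits uniform levels and return integer codes."""
    levels = 2 ** n_bits
    step = 2.0 * full_scale / levels
    codes = []
    for x in samples:
        code = int(math.floor((x + full_scale) / step))
        codes.append(min(max(code, 0), levels - 1))  # clamp out-of-range samples
    return codes

def pcm_decode(codes, n_bits, full_scale=1.0):
    """Map each code back to the midpoint of its quantization interval."""
    step = 2.0 * full_scale / (2 ** n_bits)
    return [(c + 0.5) * step - full_scale for c in codes]

def pcm_bit_rate(sample_rate_hz, n_bits):
    """Bit rate of the resulting stream in bits per second."""
    return sample_rate_hz * n_bits

samples = sample_sine(1000, 8000, 64, amplitude=0.9)
codes = pcm_encode(samples, n_bits=8)
decoded = pcm_decode(codes, n_bits=8)
```

With these assumed parameters, the sampling rate of 8000 samples per second exceeds twice the 1000 Hz tone frequency, each reconstructed sample lies within half a quantization step of the original, and the stream occupies 8000 x 8 = 64000 bits per second.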
A second technique is known as delta modulation (ΔM). Delta modulation does not utilize discrete amplitude samples of a waveform. Instead, it relies on the continuous comparison of the input signal with a signal readily reconstructed from a digital format, which is usually applied to an integrating circuit. For example, in delta modulation, the present value of the input signal is typically compared to a signal which is related to the value of the previous sample, and a digital signal is formed which represents the difference. The output of a delta modulator provides a continuous bit stream having, e.g., a "1" if the reconstructed signal has an amplitude value lower than the input and otherwise a "0".
The accuracy of the delta modulation process again depends on the number of bits per second that are employed. In this case the bit rate will also determine the maximum bandwidth of the input. In ΔM, however, an unweighted code is used, i.e., a "block" of bits or a word, as in PCM, does not represent an amplitude sample. Rather, a one or a zero simply represents the result of the comparison performed by the delta modulator.
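The comparison loop of a delta modulator can be sketched as follows. This is a simplified discrete-time model under stated assumptions: the integrating circuit is replaced by a running sum with a fixed step size, and the function names, the step size and the test waveform are hypothetical.

```python
import math

def delta_modulate(signal, step):
    """Emit a 1 when the reconstructed estimate is below the input, otherwise a 0."""
    bits = []
    estimate = 0.0  # value held by the (modelled) integrator
    for x in signal:
        bit = 1 if estimate < x else 0
        bits.append(bit)
        estimate += step if bit else -step  # integrate the unweighted bit
    return bits

def delta_demodulate(bits, step):
    """Rebuild the staircase approximation by integrating the bit stream."""
    out = []
    estimate = 0.0
    for bit in bits:
        estimate += step if bit else -step
        out.append(estimate)
    return out

# Slowly varying input: the slope per sample (about 0.016) stays below the step.
signal = [0.5 * math.sin(2 * math.pi * n / 200) for n in range(400)]
bits = delta_modulate(signal, step=0.05)
staircase = delta_demodulate(bits, step=0.05)
```

Each bit carries only the result of one comparison, so the stream has no word structure. As long as the input slope per sample does not exceed the step, the staircase stays within one step of the input; a faster input would outrun the integrator, which is one way the bit rate limits the usable input bandwidth.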
The performance of both PCM and ΔM depends on the bit rate that is allowable. The use of high bit rates is expensive due to circuit complexity and also because channels must be of high quality to pass high bit rates. Furthermore, many times channels of the quality necessary to transmit high bit rates are simply not available. Many attempts have been made to provide the same performance at low bit rates that PCM and ΔM provide at higher bit rates.
If input signals to either PCM or ΔM vary in amplitude over only a limited dynamic range, performance will be good at relatively low bit rates, because, in terms of quantization noise, a signal of little variation can be represented well by few bits. In linear PCM or ΔM, a low level variation is compared with few amplitude levels and is therefore encoded with low accuracy. A high level variation is compared with many levels and is encoded with a much lower relative error. Improving the signal to quantization noise ratio for low level inputs would require an increase of the transmission bit rate.
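The dependence of accuracy on signal level in a linear quantizer can be checked numerically. The sketch below is illustrative only: the 8 bit uniform quantizer, the test tone and the two amplitudes are assumptions, and the function names are hypothetical.

```python
import math

def uniform_quantize(x, n_bits, full_scale=1.0):
    """Quantize one sample to the midpoint of one of 2**n_bits uniform intervals."""
    levels = 2 ** n_bits
    step = 2.0 * full_scale / levels
    k = min(max(math.floor((x + full_scale) / step), 0), levels - 1)
    return (k + 0.5) * step - full_scale

def sqnr_db(amplitude, n_bits, n_samples=1000):
    """Signal to quantization noise ratio, in dB, for a sine of the given amplitude."""
    signal = [amplitude * math.sin(2 * math.pi * 0.01237 * n) for n in range(n_samples)]
    noise = [x - uniform_quantize(x, n_bits) for x in signal]
    p_signal = sum(x * x for x in signal)
    p_noise = sum(e * e for e in noise)
    return 10.0 * math.log10(p_signal / p_noise)

high_level = sqnr_db(0.9, n_bits=8)   # spans most of the 256 levels
low_level = sqnr_db(0.01, n_bits=8)   # spans only a few levels
```

With the same number of bits, the low level input is compared with only a handful of levels and its signal to quantization noise ratio is correspondingly much poorer than that of the high level input.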
A well known solution, but one which has its own limitations and faults, is called companding. A non-linear compression circuit is used to raise low level intensities that are then compared with far more quantization levels. High input amplitudes are attenuated such that the number of comparative levels drops, thus equalizing the encoding resolution for both low and high intensities. The inverse of compression, known as expanding, is then carried out in the decoder.
Non-linear encoding techniques, such as A-law PCM, μ-law PCM and companded ΔM are also known, but these techniques still require relatively high bit rates to achieve a practical measure of encoding accuracy.
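As an illustration of the compression and expansion operations, the continuous μ-law characteristic can be written as a pair of inverse functions. This is a sketch only: the value μ = 255 is assumed here as a typical choice, inputs are taken to be normalized to the range -1 to 1, and the function names are hypothetical.

```python
import math

MU = 255.0  # assumed compression parameter

def mu_law_compress(x, mu=MU):
    """Raise low level amplitudes and attenuate high ones (x normalized to [-1, 1])."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress, as carried out in the decoder."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

low = mu_law_compress(0.01)    # a 1% input is raised to roughly 23% of full scale
high = mu_law_compress(1.0)    # full scale stays at full scale
restored = mu_law_expand(low)  # expanding recovers the original amplitude
```

After compression, a uniform quantizer spends many more of its levels on low amplitudes than it would on the uncompressed signal, which is the equalizing effect described above.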
A further step in bit rate reduction has been provided by techniques known as automatic gain control (AGC) and adaptive quantization. For PCM, these systems were introduced as adaptive PCM. The corresponding ΔM techniques are known as continuously variable slope ΔM (CVSD) and digitally controlled ΔM.
These techniques use varying quantization levels or varying step levels, determined by some measure of the energy in a signal variation at a particular time. This allows for more accurate quantization as the digitizer is adjusted for each type of signal level.
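A varying step level can be sketched with a delta modulator whose step size follows the recent bit pattern. The adaptation rule below (enlarge the step after three equal bits, otherwise let it decay) is a simplified assumption in the spirit of such systems and is not the rule of any specific CVSD implementation; all names and constants are hypothetical.

```python
def next_step(step, bits, min_step, max_step, grow=1.5, decay=0.9, run=3):
    """Assumed rule: grow the step after `run` equal bits, otherwise decay it."""
    recent = bits[-run:]
    if len(recent) == run and len(set(recent)) == 1:
        return min(step * grow, max_step)
    return max(step * decay, min_step)

def adaptive_dm_encode(signal, min_step=0.01, max_step=0.5):
    """Delta modulate with a step derived only from already transmitted bits."""
    bits, track = [], []
    estimate, step = 0.0, min_step
    for x in signal:
        bits.append(1 if estimate < x else 0)
        step = next_step(step, bits, min_step, max_step)
        estimate += step if bits[-1] else -step
        track.append(estimate)
    return bits, track

def adaptive_dm_decode(bits, min_step=0.01, max_step=0.5):
    """Repeat the same step adaptation from the received bits alone."""
    out, seen = [], []
    estimate, step = 0.0, min_step
    for bit in bits:
        seen.append(bit)
        step = next_step(step, seen, min_step, max_step)
        estimate += step if bit else -step
        out.append(estimate)
    return out

# Input that jumps from 0 to 1 after 20 samples.
signal = [0.0] * 20 + [1.0] * 60
bits, track = adaptive_dm_encode(signal)
decoded = adaptive_dm_decode(bits)
```

Because the step is derived from the bit stream itself, the decoder needs no side information. In this example the estimate reaches the new level within about ten samples, where a fixed step of 0.01 would need about a hundred. The samples spent growing the step also illustrate the adjustment time during which such a system is less effective.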
AGC and adaptive quantization methods suffer from several major drawbacks. For one, the adjustment of the system to the input takes time, during which the system is less effective. A second drawback involves the fact that the measure of energy that is used should be derived from only the signal. Often signals are presented with a high degree of interference or noise in which case the system may adjust itself to the energy in the interference or noise. The desired signal may then be attenuated.
Further data rate reduction for PCM and ΔM systems has been demonstrated by using predictive coding methods. In these techniques, the redundancy of certain waveforms, for example, the repetition of certain characteristics, is used to reduce the amount of information which must be transmitted. These methods are not widely applicable, and are utilized in more narrow fields of application.
Another technique that has been suggested employs dual channels for information transfer. U.S. Pat. No. 4,047,108 describes a system for low bit rate digital transmission of speech signals wherein frequency information is transferred via a first channel and amplitude information via a second. In this reference, delta modulators are used to digitize the speech information.
Another technique, which is the subject of U.S. patent application Ser. No. 372,538, filed Apr. 28, 1982, is extrema coding. Extrema coding exploits certain properties of the human perception system to achieve a substantial reduction of the data rate necessary to transmit information.
Extrema coding relies on the fact that only certain timing features, i.e., the extrema, of a stimulus waveform are required to reconstruct a waveform that can be supplied to the human sensory system in such a manner that, subjectively, no difference from the original signal is perceived by the human receiver.
By encoding only the information in these timing features, a large portion (up to 95%) of the information in the original waveform can be made redundant. Extrema coding techniques may reduce the data rate by a factor of from 2 to 20. Extrema coding, as such, is not an analog to digital conversion method in itself. Practical embodiments, however, may supply all relevant information about an analog signal in a binary format. This binary sequence may then be fully digitized, as will be explained in more detail below.
One way of digitizing, or synchronizing the extrema coded information to a digital signal, is disclosed in the above U.S. patent application. One method that is suggested therein uses a simple D type flip-flop to synchronize the extrema coded signal to a predetermined clock signal.
Although this technique is simple and supplies signals of superior intelligibility for speech processing purposes, the quality of signals that may be obtained using this technique at low clock rates is relatively low.
Extrema coding relies heavily on the presence of short distances between the transitions of the encoded binary signal as a result of the dominance of wide band noise originally present in the signal or added to the signal. These short distances cannot be encoded properly at low bit rates using the simple D-type flip-flop synchronizing technique. The erroneously encoded distances at low clock rates may cause subjective degradation of the original analog waveform, typically at bit rates below 24 kilobits per second.
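The loss of short distances under clocked synchronization can be illustrated with a small model. The sketch rests on assumptions that are not taken from the referenced application: the extrema coded signal is modelled as a binary level that is high while the waveform rises and low while it falls, so that its transitions coincide with the extrema, and the D-type flip-flop is modelled as latching that level once every clock period. All names are hypothetical.

```python
def extrema_to_binary(signal):
    """Binary level: 1 while rising, 0 while falling; transitions mark the extrema."""
    level = 0
    out = [level]
    for i in range(1, len(signal)):
        diff = signal[i] - signal[i - 1]
        if diff > 0:
            level = 1
        elif diff < 0:
            level = 0
        out.append(level)  # level is held where the waveform is flat
    return out

def d_flip_flop_sync(binary, clock_period):
    """Latch the input on every clock edge and hold it until the next edge."""
    out = []
    q = 0
    for i, d in enumerate(binary):
        if i % clock_period == 0:  # clock edge, in units of input samples
            q = d
        out.append(q)
    return out

def count_transitions(binary):
    """Number of level changes in a binary sequence."""
    return sum(1 for a, b in zip(binary, binary[1:]) if a != b)

# Rapidly alternating waveform, standing in for a noise dominated input.
waveform = [i % 2 for i in range(16)]
coded = extrema_to_binary(waveform)
fast_clock = d_flip_flop_sync(coded, clock_period=1)
slow_clock = d_flip_flop_sync(coded, clock_period=4)
```

With a clock period of one sample, all 15 transitions survive. With a clock period of four samples, the flip-flop latches the same level at every edge and every transition is lost, which corresponds to the erroneous encoding of short distances at low clock rates.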