This application is related to the art of information coding and more particularly to prediction-based coding of such information.
In the storage or transmission of information, it is frequently necessary to compress the data content of the information. Typically an information signal waveform (which may be a pure digital signal or a digitized analog signal) will consist of signed 16-bit numbers, and there will be significant sample-to-sample correlation. Accordingly, a compression utility for these signals will desirably take advantage of that correlation.
For signals which exhibit (or are expected to exhibit) such sample-to-sample correlation, compression is commonly achieved through prediction coding, where the goal is to use some portion of the preceding set of data to predict the next sample in the data stream. The predicted value is then compared with the actual value of that sample, and the difference between the predicted and actual values is determined. That difference, or "residual" as it is usually denoted, will ordinarily be much smaller in magnitude than the actual sample value, and is coded in place of that actual value.
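The residual idea described above can be illustrated with a minimal sketch. It assumes the simplest possible predictor, one that predicts each sample to be equal to the previous sample; this predictor, the function names, and the sample values are illustrative assumptions and are not the predictor of the invention.

```python
def encode_residuals(samples):
    """Replace each integer sample by its difference from the previous sample."""
    prev = 0  # assumed initial prediction before any sample has been seen
    residuals = []
    for s in samples:
        residuals.append(s - prev)  # residual = actual value - predicted value
        prev = s                    # the next prediction is the current sample
    return residuals


def decode_residuals(residuals):
    """Invert encode_residuals by re-running the same predictor."""
    prev = 0
    samples = []
    for r in residuals:
        s = prev + r                # actual value = predicted value + residual
        samples.append(s)
        prev = s
    return samples


samples = [1000, 1004, 1009, 1011, 1010, 1006]   # hypothetical correlated samples
residuals = encode_residuals(samples)            # [1000, 4, 5, 2, -1, -4]
restored = decode_residuals(residuals)           # identical to samples
```

After the first sample, every residual is far smaller in magnitude than the sample it stands for, which is what makes the residual sequence cheaper to code than the original samples.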
All types of compression fall into one of two categories: lossless and lossy. Lossless compression schemes enable all of the compressed data to be recovered after decompression. Lossy compression, on the other hand, connotes some loss of data between the original signal and the resultant of the compression/decompression process. In general, lossy compression methods are designed with a goal of making the loss of data largely immaterial to the receiving application. Lossy compression methods also generally provide a significantly greater compression ratio than can be obtained with lossless compression methods.
In many cases, linear prediction has been used in lossy speech coding. For example, linear predictive coding (LPC) and adaptive differential pulse code modulation (ADPCM) both have a prediction component, and also provide the foundation for other encoding techniques. Many speech coding standards are based on these two techniques. For example, Code Excited Linear Prediction (CELP) uses LPC and encodes residuals using a codebook of residuals from test speech samples. CCITT Standard G.721, used in digital telephone systems, builds on ADPCM (32 kb/s), while CCITT Standard G.728 (16 kb/s) uses a variant of CELP.
For music coding, on the other hand, the dominant form of coding has been based on perceptual and frequency domain characteristics, the most widely used method/standard being MPEG. Basically, perceptual coding exploits limitations of human hearing to remove inaudible components of audio signals. In addition, because the signal energy is concentrated in only certain areas of the frequency spectrum, these parts of the spectrum can be encoded with more resolution than the low-energy parts. Various transforms may be used to indicate what frequencies are contained in the signal, and their magnitudes. In the most recent MPEG version (MPEG-4, which is directed primarily to multimedia applications), either the discrete cosine transform (DCT) or the Fourier transform may be used. Since only significant frequencies are coded with perceptual coding (other frequencies being discarded), that coding results in lossy compression.
In general, the known speech coders do a poor job of encoding music and vice versa, although for certain applications, such as movie sound signals, it can be important to compress both types of signals well. There are other applications, including evidentiary matters, where any difference between an original and a reproduced signal is unacceptable, and thus lossy encoding cannot be used. Moreover, while a lay person may find music which has been subject to lossy encoding to be indistinguishable from the original, trained musicians may hear the differences between the original and compressed music, and find the lossy encoding to be unacceptable.
Accordingly, it is an object of the invention to provide a lossless coding methodology that effectively encodes both speech and music. To that end, a lossless encoding methodology is disclosed based on residual coding techniques and using a modified Least Mean Squares methodology to develop a predictor for a signal to be encoded, and a residual as the difference between the signal and its predicted value. After the residual for an input signal segment is obtained according to the method of the invention, that method is again applied to the residual value process to develop a second predictor, from which a second residual value is obtained. The method is then applied for at least one further iteration to the most recently obtained residual value process to develop a third predictor for the signal to be encoded. A single prediction value is then selected as a statistical representative of those multiple predictor values, or as a weighted combination of the multiple predictor values. The residual value to be used for encoding the input signal increment is determined as the difference between the signal value and the selected predictor value. The method of the invention encodes integer-valued digital signals by first obtaining an integer-valued predictor and then coding losslessly the prediction residuals, which are also integers.
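The iterated scheme summarized above can be sketched as follows, under several stated assumptions. The text does not specify the modification made to the Least Mean Squares method, so a standard normalized LMS update stands in for it here. The sketch also assumes that the k-th predictor for the signal is the running sum of the first k stage predictions, and that the median serves as the statistical representative. The class names, filter order, step size, and test signal are all hypothetical.

```python
import math
import statistics


class NLMSStage:
    """One adaptive linear predictor (normalized LMS, assumed as a stand-in)."""

    def __init__(self, order=4, mu=0.5, eps=1.0):
        self.w = [0.0] * order      # filter weights
        self.hist = [0.0] * order   # past inputs, most recent first
        self.mu = mu
        self.eps = eps

    def predict(self):
        return sum(w * h for w, h in zip(self.w, self.hist))

    def update(self, actual, predicted):
        err = actual - predicted
        norm = self.eps + sum(h * h for h in self.hist)
        gain = self.mu * err / norm
        self.w = [w + gain * h for w, h in zip(self.w, self.hist)]
        self.hist = [actual] + self.hist[:-1]


class CascadePredictor:
    """Three stages: stage 1 predicts the signal, each later stage predicts
    the residual of the stage before it."""

    def __init__(self, stages=3):
        self.stages = [NLMSStage() for _ in range(stages)]
        self.stage_preds = [0.0] * stages

    def predict(self):
        self.stage_preds = [s.predict() for s in self.stages]
        candidates, total = [], 0.0
        for p in self.stage_preds:
            total += p
            candidates.append(total)   # assumed k-th predictor of the signal
        # Round the selected value so the predictor is integer-valued.
        return int(round(statistics.median(candidates)))

    def update(self, sample):
        target = float(sample)
        for stage, p in zip(self.stages, self.stage_preds):
            stage.update(target, p)
            target -= p                # this stage's residual feeds the next


def encode(samples):
    pred, residuals = CascadePredictor(), []
    for x in samples:
        residuals.append(x - pred.predict())   # integer residual
        pred.update(x)
    return residuals


def decode(residuals):
    pred, samples = CascadePredictor(), []
    for r in residuals:
        x = r + pred.predict()     # same prediction the encoder formed
        samples.append(x)
        pred.update(x)
    return samples


signal = [int(round(1000 * math.sin(0.1 * n))) for n in range(200)]
residuals = encode(signal)
restored = decode(residuals)       # equals signal exactly
```

The reconstruction is exact because the decoder adapts its predictors from already-decoded samples only, and therefore forms the same integer prediction as the encoder at every step. This relies on the encoder and decoder performing identical floating-point arithmetic, which holds when both run the same code on the same platform.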