The data rate occupied by audio or other waveform data in digital-PCM (pulse code modulation) form is often too high for the transmission or storage medium used to convey waveforms. Methods of reducing waveform data rate are known in the prior art and may be classified into two types, lossy and lossless coding. Lossy coding discards or alters the waveform data in a way which is small in relation to the requirement of how the data is used, whereas lossless coding reduces the data rate by eliminating signal redundancies in coded form, but in a way that allows the exact original data to be recovered by a decoding process.
Such lossless coding methods based on the use of predictors are known in the prior art and are described, for example, in C. Cellier, P. Chenes & M. Rossi, “Lossless Audio Bit Rate Reduction”, Audio Engineering Society UK “Managing The Bit Budget” Conference proceedings, 16-17 May 1994, pp. 107-122; in R. C. Gonzalez & R. E. Woods, “Digital Image Processing”, Addison Wesley, Reading Mass., 1992, Chapter 6, esp. section 6.4.3, pp. 358-362; and in M. Rabbani & P. W. Jones, “Digital Image Compression Techniques”, SPIE Press, Bellingham, Wash., 1991.
PCM signals may be considered as integer-valued time series signals, where the integer is a multiple of the value of the least significant digit. The basic concept in prior art systems is to encode the integer PCM signal via a prediction filter whose quantizer comprises a rounding operation to the nearest integer, to form the quantized difference (termed here the prediction-encoded signal) between the actual signal and the predicted signal derived from the output of the quantizer, and then to transmit this encoded data efficiently, either by Huffman coding, by transmitting the number of zero MSBs (most significant bits) once only for a block of words, or by similar techniques for reducing the wordlengths of the individual samples of the encoded waveform. In such prior art systems, lossless decoding is done by using Huffman or other appropriate decoding to restore the wordlength of the encoded signal, passing the encoded data into a predictor filter identical to that used in encoding, adding the result to the encoded signal, and then restoring the original integer-valued signal by means of a second rounding quantization operation. The rounding operations may be omitted if the prediction filters have only integer coefficients.
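The encode and decode structure described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of any cited reference: the names `predict`, `encode` and `decode` and the coefficient values in `COEFFS` are hypothetical, the prediction is rounded before it is subtracted (which is equivalent for integer signals), and exact rational arithmetic is used so that the rounding is identical on both sides. The Huffman or wordlength-reduction stage is omitted.

```python
from fractions import Fraction
import math

# Hypothetical second-order predictor coefficients (illustrative only).
# COEFFS[0] weights the most recent sample, COEFFS[1] the one before it.
COEFFS = [Fraction(3, 2), Fraction(-5, 8)]

def predict(history):
    """Return the prediction rounded to the nearest integer (half rounds up).

    `history` holds the previous samples, newest last.
    """
    acc = sum(c * x for c, x in zip(COEFFS, reversed(history)))
    return math.floor(acc + Fraction(1, 2))

def encode(samples):
    """Replace each integer sample by its integer prediction residual."""
    history = [0] * len(COEFFS)
    residuals = []
    for x in samples:
        residuals.append(x - predict(history))
        history = history[1:] + [x]
    return residuals

def decode(residuals):
    """Rebuild the samples by running the same predictor on decoded output."""
    history = [0] * len(COEFFS)
    samples = []
    for e in residuals:
        x = e + predict(history)
        samples.append(x)
        history = history[1:] + [x]
    return samples

pcm = [0, 10, 19, 27, 33, 37, 38, 37]
assert decode(encode(pcm)) == pcm  # exact recovery
```

Recovery is exact only because `predict` returns the same integer in the encoder and the decoder for the same history.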
However, in many applications, prior art methods of lossless encoding and decoding of waveform data have considerable practical problems. This is particularly the case with high quality PCM audio data, especially when transmitted through media with limitations on the peak data rate at which data can be transferred, such as compact disc players or digital tape recorders.
By high quality audio we mean signals which in PCM form will typically require 16 or more bits, perhaps as many as 20 or 24 bits, for accurate representation of the digital words, and sampling rates of 44.1 kHz or higher. Lossless compression of audio data is especially useful when in addition the sampling rate is a high figure such as 96 kHz. Such high sampling rates are coming into use for the case where an extended audio bandwidth is required for premium quality of reproduction. When it is desired in addition to convey multichannel stereo or surround sound, one may need to convey to the user perhaps 5 or 6 channels of audio at a 96 kHz sampling rate with around 20 bit resolution, and the resulting data rates of around 11.5 Mbit/second are difficult to convey with adequate playing time via existing storage media such as high-density compact disc or optical storage media.
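The figure of around 11.5 Mbit/second follows directly from the channel count, sampling rate and wordlength quoted above. A short check, using a hypothetical helper named `pcm_bit_rate`:

```python
def pcm_bit_rate(channels, sample_rate_hz, bits_per_sample):
    """Raw (uncompressed) PCM data rate in bits per second."""
    return channels * sample_rate_hz * bits_per_sample

# 6 channels at 96 kHz with 20-bit words
surround = pcm_bit_rate(6, 96_000, 20)   # 11_520_000 bit/s, about 11.5 Mbit/s
# For comparison: 2 channels at 44.1 kHz with 16-bit words
stereo = pcm_bit_rate(2, 44_100, 16)     # 1_411_200 bit/s
```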
In any case, lossless coding and decoding of such high quality audio data allows the effective capacity of storage media such as hard disc in computer-based audio processing systems to be increased, as well as increasing the effective storage capacity of storage media such as compact disc, digital tape media and the like. In such applications, it is desirable that the decoding algorithms especially should be relatively simple to implement, because the number of players may well outnumber the number of recorders by a large factor, especially for compact disc type releases of audio music programme material. There is also a requirement that the encoding and decoding algorithms be transportable to many different digital signal processing platforms without too much difficulty of engineering implementation, since encoded recordings produced by any one of many record companies or other organisations would be expected to play back on players of many different users made by many different manufacturers.
In the prior art, the simplest and, in audio, most widely used form of lossless waveform coding is an integer prediction technique. This comprises transmitting not the PCM audio signal itself, but the difference between successive samples plus an initial sample, from which the original signal can be reconstructed by recovering each sample by adding the difference sample to the previously recovered sample. For typical audio signals, the difference signal will have lower energy than the original signal. A known and widely used prior-art extension of this integer prediction technique may instead transmit second or third differences of the signal along with two or three initial samples of the PCM signal. Using the symbol $z^{-1}$ to indicate a delay by one sample, this method transmits the result of passing the signal through an encoding filter of the form $(1 - z^{-1})^n$ for n = 0, 1, 2 or 3. The original signal can be recovered from the data by an inverse summation process. The value of n may be chosen adaptively, block by block of audio waveform samples, so as to minimise the energy of the transmitted signal at each moment, since low-energy waveform data can be transmitted at a lower data rate than higher-energy waveform data.
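The n-th difference scheme and its adaptive choice of n can be sketched as below. The function names `difference`, `integrate` and `best_order` are hypothetical. As a simplification, the sketch stores the leading value of each intermediate difference stage rather than n raw initial samples (the two carry the same information), and it assumes each block holds more than n samples.

```python
def difference(samples, n):
    """Apply the filter (1 - z^-1)^n to a block of integer samples.

    Returns (initial, data): `initial` holds the leading value of each
    difference stage, `data` holds the n-th differences.
    """
    initial, data = [], list(samples)
    for _ in range(n):
        initial.append(data[0])
        data = [b - a for a, b in zip(data, data[1:])]
    return initial, data

def integrate(initial, data):
    """Invert `difference` by n successive running sums."""
    data = list(data)
    for first in reversed(initial):
        out = [first]
        for d in data:
            out.append(out[-1] + d)
        data = out
    return data

def best_order(block, max_n=3):
    """Pick the n in 0..max_n whose difference signal has the least energy."""
    def energy(n):
        return sum(d * d for d in difference(block, n)[1])
    return min(range(max_n + 1), key=energy)

block = [0, 3, 8, 15, 24, 35, 48]
n = best_order(block)                      # 3 for this smooth block
initial, data = difference(block, n)
assert integrate(initial, data) == block   # exact recovery
```

Because only integer additions and subtractions are involved, no rounding step is needed and the recovery is exact on any platform.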
Integer-coefficient predictors are found to reduce the average data rate and content of most audio signals, but can actually increase the peak data rate required for transmission. This makes such predictors unsuitable for coping with media having peak data read or write rate limitations. Also, the optimal prediction filter for minimising data rate is well known, see J. I. Makhoul, “Linear Prediction: A Tutorial Review”, Proc. IEEE, vol. 63, pp. 561-580 (1975 April), to be one such that the frequency response of the difference between actual and predicted signal is approximately inverse to the spectrum of the waveform signal to be encoded, and for many signals, integer-coefficient prediction filters only very poorly approximate this requirement. Thus integer filters give a suboptimum average data rate as well. For encoding audio signals, these inefficiencies of integer predictors particularly affect such signals as speech sibilants, popular music with high treble energy, cymbal waveforms and suchlike.
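The increase in peak level can be seen on a worst-case high-treble signal. The sketch below uses an assumed test signal that alternates in sign on every sample (a tone at half the sampling rate) and hypothetical helpers `first_difference` and `peak`; each differencing stage doubles the peak amplitude, costing roughly one extra bit per sample.

```python
def first_difference(samples):
    """Apply (1 - z^-1), dropping the initial sample."""
    return [b - a for a, b in zip(samples, samples[1:])]

def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)

# +1000, -1000, +1000, ... : all energy at the highest representable frequency
treble = [1000 * (-1) ** t for t in range(8)]
d1 = first_difference(treble)   # alternates between -2000 and +2000
d2 = first_difference(d1)       # alternates between +4000 and -4000

assert peak(treble) == 1000
assert peak(d1) == 2000
assert peak(d2) == 4000
```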
Predictors involving non-integer coefficients can encode waveforms with much better reductions of both peak and average data rates, but unfortunately, these have the problem that an ideal implementation requires the use of infinite-precision arithmetic, which is not possible. In practice, one uses prediction filters incorporating rounding errors in their arithmetic, and in such a case, it is essential for lossless coding that the rounding errors in the predictors be absolutely identical in the encoder and the decoder. This requirement of identical rounding errors makes it very difficult to transport a decoding or encoding algorithm between different signal processing hardware, where slight differences in rounding errors are encountered. In applications where a wide variety of equipment designs may be used to encode or decode signals, it is practically necessary to use algorithms that are transportable between different DSP (digital signal processing) platforms which may not have identical rounding errors. Also, the need to control arithmetic rounding errors in predictors to be absolutely identical makes it very difficult to design alternative prediction filter architectures for particular applications when it is known that different encoders and decoders must work with each other.
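The sensitivity to rounding can be illustrated with a toy first-order predictor with coefficient 0.5. The two "platforms" below are hypothetical stand-ins for hardware whose arithmetic differs only in how the half-integer product is rounded; `predict_round`, `predict_trunc`, `encode` and `decode` are names invented for this sketch.

```python
def predict_round(prev):
    """Platform A: 0.5 * prev rounded to the nearest integer (half rounds up)."""
    return (prev + 1) >> 1

def predict_trunc(prev):
    """Platform B: 0.5 * prev rounded down (toward minus infinity)."""
    return prev >> 1

def encode(samples, predict):
    """Residual = sample minus the prediction from the previous sample."""
    prev, residuals = 0, []
    for x in samples:
        residuals.append(x - predict(prev))
        prev = x
    return residuals

def decode(residuals, predict):
    """Rebuild each sample from its residual and the previous decoded sample."""
    prev, samples = 0, []
    for e in residuals:
        prev = e + predict(prev)
        samples.append(prev)
    return samples

pcm = [7, 9, 12, 11, 5]
coded = encode(pcm, predict_round)

assert decode(coded, predict_round) == pcm   # matching rounding: lossless
assert decode(coded, predict_trunc) != pcm   # mismatched rounding: corrupted
```

Once one decoded sample is wrong, the error feeds back through the predictor into later samples, which is why the rounding must match exactly rather than approximately.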
In addition, existing non-integer lossless prediction algorithms add a quantization noise to the encoded signal that has a spectrum that is inverse to the frequency response of the difference between actual and predicted signal. For low-level waveform signals, the amplitude of this added quantization noise can dominate in the encoded signal, increasing its average amplitude and hence the encoded data rate unnecessarily.
Existing lossless prediction methods in addition only encode and decode single channels of waveform data separately from each other. In many applications, including stereo and multichannel audio, one wishes to encode two or more related waveform signals which quite often have a high degree of correlation. One wishes to have lossless coding which can take advantage of the redundancy due to such correlations to reduce the data rate further.