In compression schemes for data representing real-time signals, the data is compressed by an encoder, fed to a transmission medium and decompressed by a decoder. The term transmission medium here includes media such as disc and tape that play out at a later time, as well as near-instantaneous transmission links such as radio.
Such compression schemes fall into two broad categories: lossy and lossless, as discussed in reference [1] for audio signals. Both types of compression attempt to save data by exploiting the redundancy inherent in linear pulse code modulation (PCM); lossy compression in addition has the option to discard data, and thus not to reproduce the original exactly.
Well-known lossy compression systems for audio data include MPEG, Dolby Digital, PASC and ATRAC. All of these encode a linear PCM stream to a compressed stream that has a lower and constant data rate. Constancy of the encoded data rate is an important consideration for real-time transmission over channels such as radio links, or for linear media such as tape, where the read-out rate is the limiting factor. It is less important for storage on hard discs where the read-out rate is less of a bottleneck and the reason for the compression is simply to reduce the total amount of data.
A lossy compression scheme achieves a desired constant data rate by discarding data until the target rate is achieved. Lossless compression does not have this option, and as the redundancy in the original stream is variable (being, in general, greater during quiet passages than during loud passages) the data rate emerging from the lossless encoder is also variable. A variable-rate bitstream can be converted to a constant rate by stuffing, or padding, in which case the resulting constant data rate is equal to the peak data rate from the lossless encoder.
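By way of illustration only, the effect of stuffing can be sketched in Python as follows. The function name `pad_to_constant_rate` and the block sizes are hypothetical and are not taken from any particular encoder; the sketch simply assumes that the encoded stream is divided into blocks of equal duration and that each block is padded up to the size of the largest.

```python
def pad_to_constant_rate(block_bits):
    """Pad every encoded block up to the size of the largest one.

    block_bits: encoded size in bits of each equal-duration block
                (hypothetical input, not from a real encoder).
    Returns (padded_sizes, stuffing_bits).
    """
    peak = max(block_bits)
    padded = [peak] * len(block_bits)
    stuffing = sum(peak - b for b in block_bits)
    return padded, stuffing


# Quiet passages compress well, loud ones less so (illustrative figures only).
sizes = [3000, 4500, 12000, 5200]
padded, stuffing = pad_to_constant_rate(sizes)
```

With these figures the constant-rate stream carries 48000 bits, of which 23300 are stuffing, so the single loud block determines the rate for the whole stream.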
The concept of peak data rate requires some clarification. Many compression schemes make use of Huffman coding whereby a sample is encoded to a number of bits dependent on its magnitude. The encoding is optimised to minimise the average data rate resulting from an assumed probability distribution of the incoming samples, which typically models large samples as occurring with low probability. When a large sample does occur, the resulting number of bits will be above average and may well be greater than the number of bits used for the original PCM encoding. Hence stuffing the encoded stream to give a constant data rate equal to the instantaneous peak data rate is not sensible.
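The growth of codeword length with sample magnitude can be illustrated by the following sketch. It uses a Rice code as a simple stand-in for a Huffman table, which is an assumption made for brevity and is not stated above; the function name `rice_code_length` and the parameter `k` are likewise illustrative.

```python
def rice_code_length(sample, k):
    """Length in bits of a Rice codeword for a signed integer sample.

    The sample is folded to a non-negative integer
    (0, -1, 1, -2, ... -> 0, 1, 2, 3, ...), then sent as a unary
    quotient, one stop bit and k remainder bits.
    """
    folded = 2 * sample if sample >= 0 else -2 * sample - 1
    return (folded >> k) + 1 + k


PCM_BITS = 16  # word length of the original PCM samples
small = rice_code_length(3, k=4)      # typical low-level sample
large = rice_code_length(20000, k=4)  # rare large excursion
```

Under these assumptions the small sample costs 5 bits while the large one costs 2505 bits, far more than the 16 bits of the original PCM word. Practical code tables may bound the worst case more tightly, but the same asymmetry applies.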
A more sensible measure of peak data rate is given by averaging over a time-window, and the resulting peak rate is then dependent on the length of the window. For example, the SQAM test CD ["Sound Quality Assessment Material CD" EBU 1988 4222042] contains 16-bit audio sampled at 44.1 kHz. Using a particular encoder, the peak data rate over a time-window of 160 samples is 14.68 bits/sample, whereas over a time-window whose length is equal to the whole disc the peak data rate is only 4.68 bits/sample.
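The dependence of the peak rate on window length can be sketched as below. The function name `peak_rate` and the per-sample costs are hypothetical; the sketch assumes only that the encoded size of each sample is known and takes the largest mean over any run of consecutive samples of the given length.

```python
def peak_rate(bits_per_sample, window):
    """Largest mean bits/sample over any run of `window` consecutive samples."""
    if not 0 < window <= len(bits_per_sample):
        raise ValueError("window must be between 1 and the number of samples")
    running = sum(bits_per_sample[:window])
    best = running
    for i in range(window, len(bits_per_sample)):
        # Slide the window one sample to the right.
        running += bits_per_sample[i] - bits_per_sample[i - window]
        best = max(best, running)
    return best / window


# Illustrative encoded sizes: mostly cheap samples with one short expensive burst.
costs = [4, 4, 4, 20, 18, 4, 4, 4]
rates = {w: peak_rate(costs, w) for w in (1, 2, 4, 8)}
```

For these figures the peak falls from 20 bits/sample with a window of one sample to 7.75 bits/sample when the window spans the whole sequence, mirroring the trend of the SQAM figures quoted above.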
Many lossless encoders break the signal into frames of fixed length, typically about 1000 samples, for analysis. The frame then provides the natural time-window over which to measure the peak data rate. The encoded frames will contain variable amounts of data and each frame can be serialised and transmitted while the next frame is being encoded. The peak data rate required of the transmission medium then corresponds to the encoded frame with the largest amount of data.
In this case the peak data rate can be reduced by using longer frames. However, the use of longer frames incurs the disadvantages of more buffer memory being required in the decoder, and of an increased latency between the application of the data-stream to the decoder and the emergence of the first decoded sample. The object of the present invention is to reduce these disadvantages.
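The trade-off between frame length and peak rate may be sketched as follows. The function name `frame_tradeoff`, the default sample rate and the per-sample costs are illustrative assumptions. The size of the largest encoded frame and the frame duration are used here only as rough proxies for decoder buffer memory and latency respectively; an actual decoder may need more or less of either.

```python
def frame_tradeoff(bits_per_sample, frame_len, sample_rate_hz=44100):
    """Peak rate, largest frame size and frame duration for a given frame length.

    Any trailing partial frame is ignored to keep the sketch short.
    """
    n_frames = len(bits_per_sample) // frame_len
    if n_frames == 0:
        raise ValueError("frame_len exceeds the signal length")
    frame_bits = [
        sum(bits_per_sample[i * frame_len:(i + 1) * frame_len])
        for i in range(n_frames)
    ]
    largest = max(frame_bits)
    return {
        "peak_bits_per_sample": largest / frame_len,
        "largest_frame_bits": largest,  # rough proxy for buffer memory
        "frame_duration_ms": 1000.0 * frame_len / sample_rate_hz,  # rough proxy for latency
    }


# Illustrative encoded sizes: mostly cheap samples with one short expensive burst.
costs = [4, 4, 4, 20, 18, 4, 4, 4]
short_frames = frame_tradeoff(costs, 2)
long_frames = frame_tradeoff(costs, 8)
```

With these figures, lengthening the frame from 2 to 8 samples lowers the peak rate from 12 to 7.75 bits/sample, while the largest frame grows from 24 to 62 bits and the frame duration is four times as long.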