1. Field of the Invention
This invention relates to lossless audio codecs and more specifically to a lossless multi-channel audio codec using adaptive segmentation with random access point (RAP) capability and multiple prediction parameter set (MPPS) capability.
2. Description of the Related Art
Numbers of low bit-rate lossy audio coding systems are currently in use in a wide range of consumer and professional audio playback products and services. For example, Dolby AC3 (Dolby digital) audio coding system is a world-wide standard for encoding stereo and 5.1 channel audio sound tracks for Laser Disc, NTSC coded DVD video, and ATV, using bit rates up to 640 kbit/s. MPEG I and MPEG II audio coding standards are widely used for stereo and multi-channel sound track encoding for PAL encoded DVD video, terrestrial digital radio broadcasting in Europe and Satellite broadcasting in the US, at bit rates up to 768 kbit/s. DTS (Digital Theater Systems) Coherent Acoustics audio coding system is frequently used for studio quality 5.1 channel audio sound tracks for Compact Disc, DVD video, Satellite Broadcast in Europe and Laser Disc and bit rates up to 1536 kbit/s.
Recently, many consumers have shown interest in these so-called “lossless” codecs. “Lossless” codecs rely on algorithms which compress data without discarding any information and produce a decoded signal which is identical to the (digitized) source signal. This performance comes at a cost: such codecs typically require more bandwidth than lossy codecs, and compress the data to a lesser degree.
FIG. 1 is a block diagram representation of the operations involved in losslessly compressing a single audio channel. Although the channels in multi-channel audio are generally not independent, the dependence is often weak and difficult to take into account. Therefore, the channels are typically compressed separately. However, some coders will attempt to remove correlation by forming a simple residual signal and coding (Ch1, Ch1-CH2). More sophisticated approaches take, for example, several successive orthogonal projection steps over the channel dimension. All techniques are based on the principle of first removing redundancy from the signal and then coding the resulting signal with an efficient digital coding scheme. Lossless codecs include MPL (DVD Audio), Monkey's audio (computer applications), Apple lossless, Windows Media Pro lossless, AudioPak, DVD, LTAC, MUSICcompress, OggSquish, Philips, Shorten, Sonarc and WA. A review of many of these codecs is provided by Mat Hans, Ronald Schafer “Lossless Compression of Digital Audio” Hewlett Packard, 1999.
Framing 10 is introduced to provide for editability, the sheer volume of data prohibits repetitive decompression of the entire signal preceding the region to be edited. The audio signal is divided into independent frames of equal time duration. This duration should not be too short, since significant overhead may result from the header that is prefixed to each frame. Conversely, the frame duration should not be too long, since this would limit the temporal adaptivity and would make editing more difficult. In many applications, the frame size is constrained by the peak bit rate of the media on which the audio is transferred, the buffering capacity of the decoder and desirability to have each frame be independently decodable.
Intra-channel decorrelation 12 removes redundancy by decorrelating the audio samples in each channel within a frame. Most algorithms remove redundancy by some type of linear predictive modeling of the signal. In this approach, a linear predictor is applied to the audio samples in each frame resulting in a sequence of prediction error samples. A second, less common, approach is to obtain a low bit-rate quantized or lossy representation of the signal, and then losslessly compress the difference between the lossy version and the original version. Entropy coding 14 removes redundancy from the error from the residual signal without losing any information. Typical methods include Huffman coding, run length coding and Rice coding. The output is a compressed signal that can be losslessly reconstructed.
The existing DVD specification and the preliminary HD DVD specification set a hard limit on the size of one data access unit, which represents a part of the audio stream that once extracted can be fully decoded and the reconstructed audio samples sent to the output buffers. What this means for a lossless stream is that the amount of time that each access unit can represent has to be small enough that the worst case of peak bit rate, the encoded payload does not exceed the hard limit. The time duration must be also be reduced for increased sampling rates and increased number of channels, which increase the peak bit rate.
To ensure compatibility, these existing coders will have to set the duration of an entire frame to be short enough to not exceed the hard limit in a worst case channel/sampling frequency/bit width configuration. In most configurations, this will be overkill and may seriously degrade compression performance. Furthermore, this worst case approach does not scale well with additional channels.