1. Field of the Invention
This invention relates to lossless audio codecs and, more specifically, to a scalable lossless audio codec and authoring tool.
2. Description of the Related Art
A number of low bit-rate lossy audio coding systems are currently in use in a wide range of consumer and professional audio playback products and services. For example, the Dolby AC3 (Dolby Digital) audio coding system is a world-wide standard for encoding stereo and 5.1 channel audio sound tracks for Laser Disc, NTSC coded DVD video, and ATV, using bit rates up to 640 kbit/s. The MPEG I and MPEG II audio coding standards are widely used for stereo and multi-channel sound track encoding for PAL encoded DVD video, terrestrial digital radio broadcasting in Europe, and satellite broadcasting in the US, at bit rates up to 768 kbit/s. The DTS (Digital Theater Systems) Coherent Acoustics audio coding system is frequently used for studio quality 5.1 channel audio sound tracks for Compact Disc, DVD video, satellite broadcast in Europe, and Laser Disc, at bit rates up to 1536 kbit/s.
An improved codec offering 96 kHz bandwidth and 24 bit resolution is disclosed in U.S. Pat. No. 6,226,616 (also assigned to Digital Theater Systems, Inc.). That patent employs a core and extension methodology in which the traditional audio coding algorithm constitutes the ‘core’ audio coder and remains unaltered. The audio data necessary to represent higher audio frequencies (in the case of higher sampling rates) or higher sample resolution (in the case of larger word lengths), or both, is transmitted as an ‘extension’ stream. This allows audio content providers to include a single audio bit stream that is compatible with the different types of decoders resident in the consumer equipment base. The core stream is decoded by older decoders, which ignore the extension data, while newer decoders make use of both the core and extension data streams to give higher quality sound reproduction. However, although the system of U.S. Pat. No. 6,226,616 provides superior quality audio playback, it does not provide truly “lossless” encoding or decoding.
Recently, many consumers have shown interest in these so-called “lossless” codecs. “Lossless” codecs rely on algorithms which compress data without discarding any information. As such, they do not employ psychoacoustic effects such as “masking”. A lossless codec produces a decoded signal which is identical to the (digitized) source signal. This performance comes at a cost: such codecs typically require more bandwidth than lossy codecs, and compress the data to a lesser degree.
This limited compression can cause a problem when content is being authored to a disk, CD, DVD, etc., particularly in cases of highly uncorrelated source material or very large source bandwidth requirements. The optical properties of the media establish a peak bit rate for all content that cannot be exceeded. As shown in FIG. 1, a hard threshold 10, e.g., 9.6 Mbps for DVD audio, is typically established for audio so that the total bit rate does not exceed the media limit.
The audio and other data are laid out on the disk to satisfy the various media constraints and to ensure that all the data required to decode a given frame will be present in the audio decoder buffer. The buffer has the effect of smoothing the frame-to-frame encoded payload (bit rate) 12, which can fluctuate wildly from frame to frame, to create a buffered payload 14, i.e., the buffered average of the frame-to-frame encoded payload. If the buffered payload 14 of the lossless bitstream for a given channel exceeds the threshold at any point, the audio input files are altered to reduce their information content. The audio files may be altered by reducing the bit-depth of one or more channels, such as from 24-bit to 22-bit, by filtering a channel's frequency bandwidth to low-pass only, or by reducing the audio bandwidth, such as by filtering information above 40 kHz when sampling at 96 kHz. The altered audio input files are then re-encoded so that the payload 16 never exceeds the threshold 10. An example of this process is described in the SurCode MLP - Owner's Manual, pp. 20-23.
This process is very inefficient in both computation and time. Furthermore, although the audio encoder is still lossless, the amount of audio content delivered to the user has been reduced over the entire bitstream. Moreover, the alteration process is inexact: if too little information is removed the problem may still exist, and if too much information is removed audio data is needlessly discarded. In addition, the authoring process must be tailored to the specific optical properties of the media and the buffer size of the decoder.
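By way of illustration only, the buffered-payload check described above may be sketched as follows. The sketch assumes that the smoothing effect of the decoder buffer can be approximated by a trailing moving average over a fixed number of frames; the function names, the window length, and the sample payload values are hypothetical and are not taken from any particular authoring tool or decoder.

```python
from collections import deque


def buffered_payload(frame_bits, window):
    """Trailing moving average of per-frame payload sizes, in bits.

    Assumption: the decoder buffer is modeled as a simple average over
    the most recent `window` frames (fewer at the start of the stream).
    """
    recent = deque(maxlen=window)
    total = 0
    averaged = []
    for bits in frame_bits:
        if len(recent) == window:
            total -= recent[0]  # oldest frame is about to leave the window
        recent.append(bits)
        total += bits
        averaged.append(total / len(recent))
    return averaged


def frames_over_threshold(frame_bits, window, threshold_bits):
    """Indices of frames at which the buffered payload exceeds the threshold."""
    smoothed = buffered_payload(frame_bits, window)
    return [i for i, avg in enumerate(smoothed) if avg > threshold_bits]


# Hypothetical per-frame payloads (bits) and per-frame threshold (bits).
payload = [100, 300, 100, 500, 500, 100]
violations = frames_over_threshold(payload, window=2, threshold_bits=350)
```

In this model, any index returned by `frames_over_threshold` marks a point at which the prior-art approach would require the input files to be altered and the entire bitstream re-encoded.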