The present invention relates generally to digital audio broadcasting (DAB) and other techniques for transmitting and receiving multiple program information, and more particularly to techniques for providing multiple program decoding for DAB and other applications.
Perceptual audio coding devices, such as the perceptual audio coder (PAC) described in D. Sinha, J. D. Johnston, S. Dorward and S. R. Quackenbush, “The Perceptual Audio Coder,” in Digital Audio, Section 42, pp. 42-1 to 42-18, CRC Press, 1998, which is incorporated by reference herein, perform audio coding using a noise allocation strategy whereby for each audio frame the bit requirement is computed based on a psychoacoustic model. PACs and other audio coding devices incorporating similar compression techniques are inherently packet-oriented, i.e., audio information for a fixed interval (frame) of time is represented by a variable bit length packet. Each packet includes certain control information followed by a quantized spectral/subband description of the audio frame. For stereo signals, the packet may contain the spectral description of two or more audio channels separately or differentially, as a center channel and side channels (e.g., a left channel and a right channel).
PAC encoding as described in the above-cited reference may be viewed as a perceptually-driven adaptive filter bank or transform coding algorithm. It incorporates advanced signal processing and psychoacoustic modeling techniques to achieve a high level of signal compression. In brief, PAC encoding uses a signal adaptive switched filter bank which switches between a Modified Discrete Cosine Transform (MDCT) and a wavelet transform to obtain a compact description of the audio signal. The filter bank output is quantized using non-uniform vector quantizers. For the purpose of quantization, the filter bank outputs are grouped into so-called “codebands” so that quantizer parameters, e.g., quantizer step sizes, are independently chosen for each codeband. These step sizes are generated in accordance with a psychoacoustic model. Quantized coefficients are further compressed using an adaptive Huffman coding technique. PAC employs a total of 15 different codebooks, and for each codeband, the best codebook may be chosen independently. For stereo and multichannel audio material, sum/difference or other forms of multichannel combinations may be encoded.
PAC encoding formats the compressed audio information into a packetized bitstream using a block sampling algorithm. At a 44.1 kHz sampling rate, each packet corresponds to 1024 input samples from each channel, regardless of the number of channels. The Huffman encoded filter bank outputs, codebook selection, quantizers and channel combination information for one 1024 sample block are arranged in a single packet. Although the size of the packet corresponding to each 1024 input audio sample block is variable, a long-term constant average packet length may be maintained as will be described below.
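By way of illustration only, the relationship between the fixed 1024-sample block and the long-term average packet length can be worked out as follows. The 96 kb/s bit rate in the usage note is an assumed example value and is not taken from the description above; the function names are likewise hypothetical.

```python
SAMPLES_PER_PACKET = 1024  # block size per channel, as stated in the text


def packet_duration_s(sample_rate_hz: float) -> float:
    """Time interval covered by one packet, in seconds."""
    return SAMPLES_PER_PACKET / sample_rate_hz


def average_packet_bits(bit_rate_bps: float, sample_rate_hz: float) -> float:
    """Long-term average packet length implied by a constant bit rate.

    Individual packets may be longer or shorter; only the average is fixed.
    """
    return bit_rate_bps * packet_duration_s(sample_rate_hz)
```

At a 44.1 kHz sampling rate each packet spans roughly 23.2 ms, so an assumed 96 kb/s program would average about 2229 bits per packet.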
Depending on the application, various additional information may be added to the first frame or to every frame. For unreliable transmission channels, such as those in DAB applications, a header is added to each frame. This header contains critical PAC packet synchronization information for error recovery and may also contain other useful information such as sample rate, transmission bit rate, audio coding modes, etc. The critical control information is further protected by repeating it in two consecutive packets.
It is clear from the above description that the PAC bit demand is determined primarily by the quantizer step sizes, which are in turn set in accordance with the psychoacoustic model. However, due to the use of Huffman coding, it is generally not possible to predict the precise bit demand in advance, i.e., prior to the quantization and Huffman coding steps, and the bit demand varies from frame to frame. Conventional PAC encoders therefore utilize a buffering mechanism and a rate loop to meet long-term bit rate constraints. The size of the buffer in the buffering mechanism is determined by the allowable system delay.
In conventional single program PAC bit allocation, the encoder makes a request for allocating a certain number of bits for a particular audio frame to a buffer control mechanism. Depending upon the state of the buffer and the average bit rate, the buffer control mechanism then returns the maximum number of bits which can actually be allocated to the current frame. It should be noted that this bit assignment can be significantly lower than the initial bit allocation request. This indicates that it is not possible to encode the current frame at an accuracy level for perceptually transparent coding, i.e., as implied by the initial psychoacoustic model step sizes. It is the function of the rate loop to adjust the step sizes so that bit demand with the modified step sizes is below, and close to, the actual bit allocation. The rate loop operates based on psychoacoustic principles to minimize the perception of excess noise. However, a substantial amount of undercoding, i.e., a noise allocation higher than that suggested by the psychoacoustic model, may be necessary to meet the rate constraints. The undercoding can lead to audible artifacts in the decoded audio output and is particularly noticeable at lower bit rates and for certain types of signals.
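The request/grant exchange and the rate loop described above may be sketched as follows. This is a simplified model under stated assumptions, not the PAC algorithm itself: the class `BufferControl`, the function `toy_bit_demand` (a stand-in for quantization plus Huffman coding), and the uniform step-size growth factor are all hypothetical. In particular, an actual rate loop adjusts step sizes according to psychoacoustic principles rather than scaling every step size equally.

```python
import math


class BufferControl:
    """Toy encoder output buffer drained at a constant average rate."""

    def __init__(self, capacity_bits: int, avg_bits_per_frame: int):
        self.capacity = capacity_bits
        self.avg = avg_bits_per_frame
        self.fullness = 0  # bits currently waiting in the buffer

    def max_grant(self) -> int:
        # Largest frame that will not overflow the buffer after one
        # frame interval of draining at the average rate.
        return self.capacity - self.fullness + self.avg

    def allocate(self, requested_bits: int) -> int:
        """Return the number of bits actually granted to the frame."""
        return min(requested_bits, self.max_grant())

    def commit(self, used_bits: int) -> None:
        """Record the bits written for a frame and drain one frame's worth."""
        self.fullness = max(0, self.fullness + used_bits - self.avg)


def toy_bit_demand(coeffs, steps) -> int:
    """Hypothetical bit-demand model: larger step sizes need fewer bits."""
    return sum(math.ceil(math.log2(1 + abs(c) / s))
               for c, s in zip(coeffs, steps))


def rate_loop(coeffs, steps, granted_bits, growth=1.1, max_iter=200):
    """Coarsen step sizes until the demand fits within the granted bits."""
    steps = list(steps)
    for _ in range(max_iter):
        if toy_bit_demand(coeffs, steps) <= granted_bits:
            return steps
        steps = [s * growth for s in steps]  # uniform scaling (assumption)
    raise RuntimeError("rate loop did not converge")
```

Any step size returned by `rate_loop` that exceeds its initial value corresponds to the undercoding discussed above, i.e., more quantization noise than the psychoacoustic model initially allowed.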
Perceptual audio coding techniques such as PAC encoding are particularly attractive for FM band and AM band transmission applications such as in-band digital audio broadcasting (DAB) systems, which are also known as hybrid in-band on-channel (HIBOC), all-digital IBOC and in-band adjacent channel (IBAC)/in-band reserve channel (IBRC) DAB systems. Perceptual audio coding techniques are also well suited for use in other applications, such as satellite DAB systems and Internet DAB systems. Although PAC and other conventional audio coding and decoding techniques often provide adequate performance in single program DAB transmission applications, further improvements are needed for multiple program transmission applications, e.g., multiple-program DAB, satellite DAB, Internet DAB, and other types of multiple program transmission. More particularly, a need exists for improvements in decoding techniques for multiple program transmission applications.
The present invention provides methods and apparatus for decoding in multiple program transmission applications, such as multiple program DAB. In an illustrative embodiment of the invention, a multiple program decoder includes a deinterleaver for deinterleaving information corresponding to a set of frames, using a specified deinterleaving length. A given one of the frames includes information from each of at least a subset of the programs, and the frames are encoded using an outer code, e.g., a CRC code, RS code, BCH code or other type of linear block code, and an inner code, e.g., a convolutional code, turbo code or trellis coded modulation. The multiple program decoder includes an inner code decoder for decoding the inner code over one or more of the programs, and an outer code decoder for decoding the outer code for a selected one of the programs. The multiple program decoder also includes a program decoder, e.g., a PAC decoder or other suitable decoder, which decodes the selected program and generates an output signal which can be supplied to an output device, e.g., a speaker or a set of speakers.
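A minimal sketch of two stages of such a decoder, block deinterleaving followed by an outer-code check on one selected program, is given below. Several simplifications are assumed for illustration: the interleaver is a simple row/column block interleaver, the outer code is a CRC-32 appended to each program field, every program field has the same fixed payload length, and the inner code (e.g., a convolutional code) and the program decoder are omitted entirely. The frame layout and all function names are hypothetical.

```python
import zlib

CRC_LEN = 4  # bytes of CRC-32 appended to each program field (assumed layout)


def block_interleave(data: bytes, rows: int, cols: int) -> bytes:
    """Write the data row by row, read it out column by column."""
    if len(data) != rows * cols:
        raise ValueError("data length must equal rows * cols")
    return bytes(data[r * cols + c] for c in range(cols) for r in range(rows))


def block_deinterleave(data: bytes, rows: int, cols: int) -> bytes:
    """Inverse of block_interleave for the same rows and cols."""
    if len(data) != rows * cols:
        raise ValueError("data length must equal rows * cols")
    out = bytearray(rows * cols)
    i = 0
    for c in range(cols):
        for r in range(rows):
            out[r * cols + c] = data[i]
            i += 1
    return bytes(out)


def build_frame(payloads) -> bytes:
    """Concatenate one CRC-protected field per program into a frame."""
    return b"".join(p + zlib.crc32(p).to_bytes(CRC_LEN, "big")
                    for p in payloads)


def select_program(frame: bytes, index: int, payload_len: int):
    """Outer-code check for one program; returns None if the CRC fails."""
    field_len = payload_len + CRC_LEN
    field = frame[index * field_len:(index + 1) * field_len]
    payload, crc = field[:payload_len], field[payload_len:]
    if zlib.crc32(payload).to_bytes(CRC_LEN, "big") != crc:
        return None
    return payload
```

Because the deinterleaved frame carries a field for every program, selecting a different program in this sketch only changes the `index` passed to `select_program`; no further deinterleaving is needed.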
In accordance with the invention, the deinterleaving length of the deinterleaver and operating rate of the inner code decoder can be configured such that the multiple program decoder provides substantially instantaneous tuning within a given cluster of programs, or within a set of clusters each including multiple programs. For example, the deinterleaving length may correspond to a cluster of N programs, and the inner code decoder may decode the inner code over a single selected one of the N programs. As another example, the deinterleaving length may correspond to the cluster of N programs, and the inner code decoder may decode the inner code over all of the N programs in the cluster. In this case, separate outer code decoders and program decoders may be used to produce output signals for different ones of the N programs simultaneously. As a further example, the deinterleaving length may correspond to a set of K clusters, each including multiple programs, and the inner code decoder may decode the inner code over all of the programs in each of the K clusters. In this case, all of the programs in the K clusters are available for instantaneous tuning, and separate outer code decoders and program decoders may be used to produce output signals for any of the programs.
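The three example configurations can be summarized with a small data model. This is only one reading of the examples above, offered as an illustrative sketch: the `DecoderConfig` type, its fields, and both helper functions are hypothetical, and program identifiers are assumed to be integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecoderConfig:
    # Clusters spanned by the deinterleaving length; each cluster is a
    # tuple of program identifiers.
    clusters: tuple
    # True if the inner code is decoded over every program in the span,
    # False if only over the single selected program.
    inner_decodes_all: bool


def tunable_programs(cfg: DecoderConfig) -> set:
    """Programs covered by the deinterleaver, hence quickly tunable."""
    return {p for cluster in cfg.clusters for p in cluster}


def simultaneous_outputs(cfg: DecoderConfig, selected: int) -> set:
    """Programs for which output signals could be produced at once."""
    programs = tunable_programs(cfg)
    if selected not in programs:
        raise ValueError("selected program is outside the deinterleaved span")
    return programs if cfg.inner_decodes_all else {selected}
```

For instance, `DecoderConfig(((0, 1, 2, 3),), False)` models the first example (one cluster of N = 4 programs, inner decoding over the selected program only), while `DecoderConfig(((0, 1), (2, 3)), True)` models the third example with K = 2 clusters.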
The invention results in low tuning delay relative to conventional implementations without joint deinterleaving, e.g., implementations which deinterleave over one program such that the tuning delay corresponds approximately to the selected deinterleaving length. The invention may be implemented in numerous applications, such as simultaneous multiple program listening and/or recording, simultaneous delivery of audio and data, etc. Although well suited for use with jointly-coded audio programs, the invention does not require joint coding, and can operate with, e.g., independently-coded audio bitstreams. In addition, the invention can be applied to other types of digital information, including, for example, data, video and image information. Alternative embodiments of the invention can utilize other types of outer codes, other types of inner codes, other types of interleaving, e.g., block interleaving, convolutional interleaving or random interleaving, and a wide variety of different frame formats, including TDM, FDM or CDM frame formats, as well as combinations of these and other formats.