This invention relates in general to digital signal processing of audio signals, such as music signals. More particularly, the invention relates to the implementation of a high quality dual-channel digital audio encoder, based on the psychoacoustic model of the human auditory system, for digital storage or transmission.
In order to more efficiently broadcast or record audio signals, the amount of information required to represent the audio signals may be reduced. In the case of digital audio signals, the amount of digital information needed to accurately reproduce the original pulse code modulation (PCM) samples may be reduced by applying a digital compression algorithm, resulting in a digitally compressed representation of the original signal. The goal of the digital compression algorithm is to produce a digital representation of an audio signal which, when decoded and reproduced, sounds the same as the original signal, while using a minimum of digital information for the compressed or encoded representation.
Recently, the use of psychoacoustic models in the design of audio coders has led to high compression ratios while keeping audible degradation in the compressed signal to a minimum. A description of one such method can be found in the Advanced Television Systems Committee (ATSC) Standard document entitled “Digital Audio Compression (AC-3) Standard”, Document A/52, Dec. 20, 1995. In the basic approach, the time domain signal is first converted to the frequency domain using a bank of filters. Frequency domain masking of the human auditory system is then exploited to maximize perceived fidelity of the signal transmitted at a given bit-rate.
Further compression can be successfully obtained by use of a well known technique called coupling. Coupling takes advantage of the way the human ear determines directionality for very high frequency signals, in order to allow a reduction in the amount of data necessary to code an audio signal. At high audio frequencies (above approximately 2 kHz) the ear is physically unable to detect individual cycles of an audio waveform, and instead responds to the envelope of the waveform. Consequently, the coder combines the high frequency coefficients of the individual channels to form a common coupling channel. The original channels combined to form the said coupling channel are referred to as coupled channels.
A basic encoder can form the coupling channel by simply taking the average of all the individual channel coefficients. A more sophisticated encoder can alter the sign of individual channels before adding them into the sum so as to avoid phase cancellations.
The generated coupling channel is next sectioned into a number of frequency sub-bands. Frequency sub-bands are grouped together to form coupling bands. For each such band and each coupled channel a coupling co-ordinate is transmitted to the decoder. To obtain the high frequency coefficients in any coupling frequency band, for a particular coupled channel, from the coupling channel, the decoder multiplies the coupling channel coefficients in that coupling frequency band by the coupling co-ordinate of that channel for that particular coupling frequency band. For a dual channel implementation of such a decoder, a phase flag bit may also be provided for each coupled band of the coupling channel. A final step of phase-correction is then performed, by the decoder, in which the coefficients in each band are multiplied by the phase flag bit for that band.
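The decoder-side reconstruction described above can be sketched as follows. This is a minimal illustration only: the function name, the `band_edges` layout and the representation of the phase flag as +1 or -1 are assumptions made for the sketch and are not taken from the A/52 bitstream syntax.

```python
from typing import List, Sequence


def decouple_channel(
    coupling: Sequence[float],
    band_edges: Sequence[int],
    coords: Sequence[float],
    phase_flags: Sequence[int],
) -> List[float]:
    """Rebuild one coupled channel's coefficients from the coupling channel.

    band_edges (hypothetical layout) holds len(coords) + 1 coefficient
    indices; band n covers coupling[band_edges[n]:band_edges[n + 1]].
    phase_flags holds +1 or -1 per band; pass all +1 for a channel that
    needs no phase correction.
    """
    out: List[float] = []
    for band, (coord, flag) in enumerate(zip(coords, phase_flags)):
        start, end = band_edges[band], band_edges[band + 1]
        for k in range(start, end):
            # Scale by the coupling co-ordinate, then apply the phase flag.
            out.append(coupling[k] * coord * flag)
    return out


# Two bands of two coefficients each; the second band is phase-inverted.
restored = decouple_channel([1.0, 2.0, 3.0, 4.0], [0, 2, 4], [0.5, 2.0], [1, -1])
print(restored)  # [0.5, 1.0, -6.0, -8.0]
```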
As mentioned, a basic encoder can form the coupling channel by simple addition of all the individual channel coefficients, or a more sophisticated encoder may alter the sign of individual channels before addition in order to reduce phase cancellation effects. But no existing single coupling strategy for the entire frequency spectrum of the coupling channel is adequate to ensure minimal information loss due to phase cancellation. The criteria for combining sub-bands to form larger bands are not well understood and most developers usually use a pre-specified banding structure to group the sub-bands into bands. Phase flag computation is performed as an additional step, thereby requiring extra computation. The standard also does not outline any specific method for determination of the phase flag bits. Ad hoc methods do exist but, due to their very nature, do not guarantee any assured performance, and cannot be relied upon to provide minimum error between the original coefficients at the encoder and the reconstructed, phase corrected, coefficients at the decoder.
In accordance with the disclosed embodiments of the present invention, there is provided a method for computing coupling parameters in a digital audio encoder wherein frequency coefficient sub-bands in a coupling frequency range are arranged into coupling bands, including the steps of determining a power value for each sub-band in the frequency range, selecting a coupling scheme for each sub-band based on the corresponding power value, and grouping adjacent sub-bands having the same selected coupling scheme into coupling bands.
Preferably the first or second coupling scheme is selected for each sub-band according to whether the power value is positive or negative.
Preferably the power value comprises a correlation computation of frequency coefficients from first and second audio channels.
In one form of the invention each sub-band includes at least one frequency coefficient from each of first and second audio channels, wherein the power value for a particular sub-band is determined by summing the product of corresponding frequency coefficients from the first and second channels.
Advantageously, the method may include determining a phase reconstruction value for each coupling band on the basis of the coupling scheme selected for the sub-bands in that band.
In accordance with another embodiment of the invention, there is also provided a method for generating coupling parameters in a dual channel audio encoder wherein frequency coefficient sub-bands in a coupling frequency range are arranged into coupling bands, each sub-band including at least one frequency coefficient from each of first and second audio channels, including the steps of:
i) receiving frequency transform coefficients for the first and second audio channels;
ii) computing a power value for each sub-band;
iii) for each sub-band, selecting a coupling coefficient generation scheme from first and second schemes on the basis of the computed power value for that sub-band;
iv) for each sub-band, generating a coupling coefficient according to the selected one of the first and second schemes for that sub-band; and
v) forming bands from adjacent sub-bands having the same selected scheme.
Preferably the power value comprises a correlation computation of frequency transform coefficients of the first and second channels.
In a preferred implementation, the power value for a said sub-band is computed according to:
$$P = \sum_i a_i b_i$$
where
$P$ is the power value,
$a_i$ are frequency coefficients from the first audio channel,
$b_i$ are frequency coefficients from the second audio channel, and
index $i$ corresponds to frequency coefficients extending over the range of the said sub-band.
Preferably, if the power value, P, is greater than zero the coupling coefficients for the sub-band are generated according to:
$$c_i = (a_i + b_i)/2$$
where $c_i$ are the coupling coefficients; and if the power value, P, is less than zero the coupling coefficients for the sub-band are generated according to:
$$c_i = (a_i - b_i)/2.$$
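As an illustration of the power value and the two coupling coefficient generation schemes just described, the following minimal sketch computes both for a single sub-band. The function names are hypothetical, and the handling of P exactly equal to zero (treated here like the positive case) is an assumption of the sketch, since the text specifies only the positive and negative cases.

```python
from typing import List, Sequence, Tuple


def subband_power(a: Sequence[float], b: Sequence[float]) -> float:
    """Power value P: sum of products of corresponding coefficients."""
    return sum(x * y for x, y in zip(a, b))


def couple_subband(a: Sequence[float], b: Sequence[float]) -> Tuple[float, List[float]]:
    """Return (P, coupling coefficients) for one sub-band."""
    p = subband_power(a, b)
    if p < 0:
        # Channels are predominantly out of phase: use the difference scheme.
        c = [(x - y) / 2 for x, y in zip(a, b)]
    else:
        # Channels are predominantly in phase: use the sum scheme.
        # Assumption of this sketch: P == 0 also takes this branch.
        c = [(x + y) / 2 for x, y in zip(a, b)]
    return p, c


# Exactly anti-phase channels: a plain average would give [0.0, 0.0].
print(couple_subband([1.0, -2.0], [-1.0, 2.0]))  # (-5.0, [1.0, -2.0])
```

In the anti-phase example a simple average would cancel the sub-band entirely, whereas selecting the scheme from the sign of P preserves the coefficients of the first channel.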
The method may advantageously include generating a phase flag for the coupling band on the basis of the power value or coupling coefficient generation scheme for the sub-bands in that coupling band. In a particular form of the invention the phase flag for a particular coupling band is +1 if the power value, P, for each of the constituent sub-bands is greater than zero, and the phase flag is −1 if the power value, P, for each of the constituent sub-bands is less than zero.
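The band formation and phase flag rules can be sketched together, since both follow from the sign of the sub-band power values. The sketch below is illustrative only: the function names and the tuple layout are hypothetical, a power value of exactly zero is treated as positive by assumption, and any limit a bitstream format may place on the number of bands is ignored.

```python
from typing import List, Sequence, Tuple


def scheme(power: float) -> int:
    """+1 selects the sum scheme, -1 the difference scheme."""
    # Assumption of this sketch: P == 0 is treated like P > 0.
    return -1 if power < 0 else 1


def form_coupling_bands(powers: Sequence[float]) -> List[Tuple[int, int, int]]:
    """Group adjacent sub-bands sharing a scheme into bands.

    Returns one (first_subband, end_subband_exclusive, phase_flag) tuple
    per band, where phase_flag is the scheme shared by the whole band.
    """
    bands: List[Tuple[int, int, int]] = []
    start = 0
    for k in range(1, len(powers) + 1):
        # Close the current band at the end of the list or on a scheme change.
        if k == len(powers) or scheme(powers[k]) != scheme(powers[start]):
            bands.append((start, k, scheme(powers[start])))
            start = k
    return bands


print(form_coupling_bands([3.0, 1.5, -2.0, -0.1, 4.0]))
# [(0, 2, 1), (2, 4, -1), (4, 5, 1)]
```

Because every sub-band in a band shares one scheme by construction, the phase flag is read directly from the scheme of the first sub-band in the band, with no separate estimation step.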
Another embodiment of the invention also provides a dual channel digital audio encoding apparatus wherein audio data frequency coefficients for first and second channels in a coupling frequency range are arranged in a sequence of sub-bands, comprising:
a correlation computation processor for computing a power value for each sub-band;
a coupling coefficient generator for generating a sequence of sub-band coupling coefficients using a coupling coefficient generation scheme selected for each sub-band on the basis of the corresponding power value for that sub-band; and
a band structure processor for arranging adjacent coupling coefficient sub-bands in the sequence generated with the same generation scheme into bands.
The audio encoding apparatus preferably includes a phase estimation processor for generating a phase flag for each band on the basis of the generation scheme used for the constituent sub-bands of the corresponding band. The apparatus may further include a coupling coordinate generator for generating coupling coordinates for the first and second channels for each said band.
A sub-band based coupling channel generation strategy can ensure minimal information loss due to phase cancellation of the two coupled channels. The sub-band based coupling strategy may be considered as a generalisation of a channel based strategy. If the coupling strategy remains the same for all sub-bands, the sub-band based coupling strategy is equivalent to the channel based coupling strategy.
Even when exact phase cancellation does not occur, error due to phase cancellation can be noticeably large in cases where an appreciable phase lag exists between the two channels. The methods of the preferred embodiments of this invention use power calculations to minimize signal information loss due to coupling. Further, the coupling strategy of the sub-bands is used to determine the banding structure. This approach reduces the phase estimation for a band to a simple decision, requiring no extra computation.
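One way to see why the sign of the power value is a reasonable selection criterion is to expand the energy of the coupling coefficients in a sub-band under each scheme. The identity below is offered as an illustrative check rather than as part of the described method; it uses coupling coefficients $c_i = (a_i \pm b_i)/2$ and the power value $P = \sum_i a_i b_i$.

```latex
\sum_i c_i^2
  = \sum_i \left( \frac{a_i \pm b_i}{2} \right)^2
  = \frac{1}{4} \left( \sum_i a_i^2 + \sum_i b_i^2 \pm 2P \right)
```

The sum scheme therefore retains more energy in the coupling channel when P is positive, and the difference scheme retains more when P is negative, so choosing the scheme from the sign of P always keeps the larger of the two energies.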