This invention relates to perceptually-based coding of audio signals, such as monophonic, stereophonic, or multichannel audio signals, speech, music, or other material intended to be perceived by the human ear.
Demands in the commercial market for increased quality in the reproduction of audio signals have led to investigations of digital techniques which promise the possibility of preserving much of the original signal quality. However, a straightforward application of conventional digital coding would lead to excessive data rates, so acceptable techniques of data compression are needed.
One signal compression technique, referred to as perceptual coding, employs the idea of distortion or noise masking in which the distortion or noise is masked by the input signal. The masking occurs because of the inability of the human perceptual mechanism to distinguish two signal components (one belonging to the signal and one belonging to the noise) in the same spectral, temporal, or spatial locality under some conditions. An important effect of this limitation is that the perceptibility (or loudness) of noise (e.g., quantizing noise) can be zero even if the objectively measured local signal-to-noise ratio is low. Additional details concerning perceptual coding techniques may be found in N. Jayant et al., “Signal Compression Based on Models of Human Perception,” Proceedings of the IEEE, Vol. 81, No. 10, October 1993.
U.S. Pat. No. 5,341,457 discloses a perceptual coding technique in which a perceptual audio encoder is used to convert the audio signal (or a function thereof) into a measure of predictability (e.g., a spectral flatness measure) and then into a tonality metric from which a noise to mask ratio can be calculated, using knowledge provided by controlled subjective testing of the masking properties of tones and noise. Other techniques calculate the tonality metric from a loudness or loudness uncertainty calculation. These known perceptual coding techniques are computationally inefficient, provide incorrect noise to mask ratios for some kinds of audio signals, or both.
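For illustration only, a spectral flatness measure of the kind mentioned above may be sketched as the ratio of the geometric mean to the arithmetic mean of a power spectrum. The function names `power_spectrum` and `spectral_flatness`, the naive discrete Fourier transform, and the small constant `eps` are assumptions made for this sketch; they are not taken from U.S. Pat. No. 5,341,457, and the sketch does not show how that patent converts the measure into a tonality metric.

```python
import cmath
import math


def power_spectrum(x):
    """Naive O(N^2) DFT power spectrum over bins 1 .. N/2 - 1 (illustrative only)."""
    n = len(x)
    spec = []
    for k in range(1, n // 2):
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spec.append(abs(acc) ** 2)
    return spec


def spectral_flatness(power, eps=1e-12):
    """Geometric mean divided by arithmetic mean of a power spectrum.

    Values near 1 indicate a flat, noise-like spectrum; values near 0
    indicate a peaky, tone-like spectrum. `eps` is an assumed guard
    against taking the logarithm of zero.
    """
    log_mean = sum(math.log(p + eps) for p in power) / len(power)
    arith_mean = sum(power) / len(power) + eps
    return math.exp(log_mean) / arith_mean


# Hypothetical test signals: a sinusoid centered on DFT bin 5 and a unit impulse.
N = 64
tone = [math.sin(2 * math.pi * 5 * t / N) for t in range(N)]
impulse = [1.0] + [0.0] * (N - 1)

sfm_tone = spectral_flatness(power_spectrum(tone))        # close to 0
sfm_impulse = spectral_flatness(power_spectrum(impulse))  # close to 1
```

In this sketch the sinusoid concentrates its energy in a single bin and yields a flatness near zero, whereas the impulse has a flat spectrum and yields a flatness near one.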
Accordingly, it is desirable to provide a perceptual coding technique that reduces the complexity of the required computations while increasing the accuracy of the resulting noise to mask ratios.
The inventor has determined that accurate perceptual coding does not require a measure of tonality. Rather, perceptual coding is accomplished by measuring the envelope roughness of the filtered audio signal, which may be directly converted to the noise to mask threshold needed to calculate the perceptual threshold or “just noticeable difference”. Thus, the present invention does not require any complex calculations to determine tonality, either by a measure of predictability or by the calculation of a loudness or loudness uncertainty. Instead, the envelope roughness of the signal is simply reduced directly to the noise to mask ratio.
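By way of a hedged illustration, one possible way to quantify the roughness of an envelope is sketched below. The passage above does not define a roughness metric, so the sketch assumes one: the standard deviation of the envelope divided by its mean. It further assumes that the filtered band is already available as a complex (analytic) signal whose magnitude is the envelope. The function name `envelope_roughness` and the test signals are hypothetical, and the mapping from roughness to a noise to mask ratio is not shown.

```python
import cmath
import math


def envelope_roughness(analytic_band):
    """Assumed roughness metric: standard deviation of the envelope over its mean.

    `analytic_band` is a sequence of complex samples representing one
    filtered band; the envelope is taken as the magnitude of each sample.
    """
    env = [abs(z) for z in analytic_band]
    mean = sum(env) / len(env)
    if mean == 0.0:
        return 0.0  # silent band: treat as perfectly smooth
    var = sum((e - mean) ** 2 for e in env) / len(env)
    return math.sqrt(var) / mean


# Hypothetical band signals: a single complex tone, and two nearby tones that beat.
N = 256
tone = [cmath.exp(2j * math.pi * 10 * t / N) for t in range(N)]
beating = [
    cmath.exp(2j * math.pi * 10 * t / N) + cmath.exp(2j * math.pi * 13 * t / N)
    for t in range(N)
]

rough_tone = envelope_roughness(tone)        # envelope is constant
rough_beating = envelope_roughness(beating)  # envelope fluctuates
```

Under these assumptions a single tone has a constant envelope and therefore negligible roughness, while two closely spaced components produce a fluctuating envelope and a markedly larger value, without any explicit tonality calculation.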