Perceptual coders work on the principle of exploiting perceptually relevant information (“PRI”) to reduce the data rate of encoded audio material. Perceptually irrelevant information, that is, information that would not be heard by an individual, is discarded in order to reduce the data rate while maintaining the listening quality of the encoded audio. These “lossy” perceptual audio encoders are based on a psychoacoustic model of an ideal listener, a “golden ears” standard of normal hearing. To this end, audio files are intended to be encoded once and then decoded using a generic decoder to make them suitable for consumption by all. Indeed, this paradigm forms the basis of MP3 encoding and other similar encoding formats, which revolutionized music file sharing in the 1990s by significantly reducing audio file sizes, ultimately leading to the success of music streaming services today.
PRI estimation generally consists of transforming a sampled window of an audio signal into the frequency domain, for instance by using a fast Fourier transform. Masking thresholds are then obtained using psychoacoustic rules: critical band analysis is performed, noise-like or tone-like regions of the audio signal are determined, thresholding rules for the signal are applied, and absolute hearing thresholds are subsequently accounted for. For instance, as part of this masking threshold process, quieter sounds within a similar frequency range to loud sounds are disregarded (e.g., they fall into the quantization noise when there is bit reduction), as are quieter sounds immediately following loud sounds within a similar frequency range. Additionally, sounds occurring below the absolute hearing threshold are removed. Following this, the number of bits required to quantize the spectrum without introducing perceptible quantization error is determined. The result is approximately a tenfold reduction in file size.
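By way of illustration only, the estimation steps described above may be sketched as follows. This is a minimal sketch under stated assumptions and not the psychoacoustic model of any particular encoder: it models simultaneous masking only (no temporal masking and no tonality analysis), uses the commonly cited Terhardt approximation of the threshold in quiet and the Zwicker and Terhardt Bark approximation, and applies the rule of thumb of roughly 6.02 dB per bit. The 96 dB SPL playback calibration, the 16 dB masking offset, the spreading slopes, and all function names are hypothetical choices made for the example.

```python
import cmath
import math


def power_spectrum_db(samples, ref_db=96.0):
    """Hann-windowed DFT levels in dB (naive O(N^2) DFT, kept simple for clarity).

    Assumption: a full-scale sine is played back at ref_db dB SPL.
    """
    n = len(samples)
    windowed = [s * 0.5 * (1 - math.cos(2 * math.pi * i / n))
                for i, s in enumerate(samples)]
    levels = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        power = (abs(acc) / (n / 4)) ** 2  # bin-centred full-scale sine -> 1.0
        levels.append(10 * math.log10(power + 1e-12) + ref_db)
    return levels


def hz_to_bark(f):
    """Critical-band rate (Zwicker and Terhardt approximation)."""
    return 13 * math.atan(0.00076 * f) + 3.5 * math.atan((f / 7500.0) ** 2)


def absolute_threshold_db(f):
    """Threshold in quiet in dB SPL (Terhardt approximation)."""
    khz = max(f, 20.0) / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


def masking_threshold_db(levels_db, freqs_hz,
                         offset_db=16.0, up_slope=10.0, down_slope=27.0):
    """Per-bin threshold: the largest of the threshold in quiet and every
    masker's level, lowered by an offset and a triangular Bark-domain spread.
    Offset and slopes (dB per Bark) are illustrative assumptions."""
    barks = [hz_to_bark(f) for f in freqs_hz]
    thresholds = []
    for zj, fj in zip(barks, freqs_hz):
        t = absolute_threshold_db(fj)
        for zi, li in zip(barks, levels_db):
            dz = zj - zi  # positive when the masked bin lies above the masker
            slope = up_slope if dz >= 0 else down_slope
            t = max(t, li - offset_db - slope * abs(dz))
        thresholds.append(t)
    return thresholds


def bits_per_bin(levels_db, thresholds_db):
    """Bits needed to keep quantization noise under the threshold (~6.02 dB/bit)."""
    return [max(0.0, (l - t) / 6.02) for l, t in zip(levels_db, thresholds_db)]


def analyse(samples, fs):
    levels = power_spectrum_db(samples)
    freqs = [k * fs / len(samples) for k in range(len(levels))]
    return bits_per_bin(levels, masking_threshold_db(levels, freqs))


def tone(freq_hz, amplitude, n, fs):
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / fs) for i in range(n)]


FS, N = 16000, 256
loud = tone(1000.0, 1.0, N, FS)      # strong masker
quiet = tone(1250.0, 0.001, N, FS)   # 60 dB weaker neighbouring tone
mixture = [a + b for a, b in zip(loud, quiet)]
quiet_bin = round(1250.0 * N / FS)

bits_alone = analyse(quiet, FS)[quiet_bin]
bits_mixed = analyse(mixture, FS)[quiet_bin]
print("bits for quiet tone alone:", round(bits_alone, 2))
print("bits for quiet tone next to loud tone:", round(bits_mixed, 2))
```

In this example the quiet tone requires bits when presented alone, but is allotted none once the louder neighbouring tone raises the masking threshold above its level, which corresponds to the discarding of quieter sounds near loud sounds described above.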
However, the “golden ears” standard, although appropriate for generic dissemination of audio information, fails to take into account the individual hearing capabilities of a listener. Indeed, there are clear, discernible trends of hearing loss with increasing age (see FIG. 1). Although hearing loss typically begins at higher frequencies, listeners who are aware that they have hearing loss do not typically complain about the absence of high frequency sounds. Instead, they report difficulties listening in a noisy environment and in perceiving details in a complex mixture of sounds. In essence, for hearing-impaired (HI) individuals, intense sounds more readily mask information with energy at other frequencies: music that was once clear and rich in detail becomes muddled. As hearing deteriorates, the signal-conditioning capabilities of the ear begin to break down, and thus HI listeners need to expend more mental effort to make sense of sounds of interest in complex acoustic scenes (or miss the information entirely). A raised threshold in an audiogram is not merely a reduction in aural sensitivity, but a result of the malfunction of some deeper processes within the auditory system, which has implications beyond the detection of faint sounds. Consequently, the perceptually relevant information rate in bits/s, i.e. PRI, that is perceived by a listener with impaired hearing is reduced relative to that of a normal hearing person, due to higher thresholds and greater masking from other components of an audio signal within a given time frame.
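The reduction in PRI described above may be illustrated with a simple per-band calculation. The following is a hedged sketch only: the band levels, the two audiograms, the spreading slopes (a shallower slope standing in for the broader masking of an impaired ear), the 16 dB masking offset, and the function name are all hypothetical values chosen for the example, and the roughly 6.02 dB per bit rule is used as a rough proxy for perceptually relevant information per analysis frame.

```python
def perceptual_bits(band_levels_db, quiet_threshold_db, slope_db_per_band,
                    offset_db=16.0):
    """Sum, over bands, of the bits by which each band exceeds its threshold.

    The threshold of a band is the larger of the listener's threshold in quiet
    and the masking spread from every other band (hypothetical symmetric
    triangular spread, in dB per band of separation).
    """
    total = 0.0
    n = len(band_levels_db)
    for j in range(n):
        threshold = quiet_threshold_db[j]
        for i in range(n):
            if i != j:
                spread = band_levels_db[i] - offset_db - slope_db_per_band * abs(j - i)
                threshold = max(threshold, spread)
        total += max(0.0, (band_levels_db[j] - threshold) / 6.02)  # ~6.02 dB per bit
    return total


# Hypothetical band levels of one audio frame, low to high frequency (dB SPL).
frame_levels = [70.0, 40.0, 55.0, 35.0, 60.0, 30.0]

# Hypothetical listeners: flat normal thresholds versus a sloping high-frequency
# loss combined with broader masking (shallower spreading slope).
normal_thresholds = [5.0] * 6
impaired_thresholds = [10.0, 15.0, 25.0, 35.0, 45.0, 55.0]

bits_normal = perceptual_bits(frame_levels, normal_thresholds, slope_db_per_band=15.0)
bits_impaired = perceptual_bits(frame_levels, impaired_thresholds, slope_db_per_band=8.0)

print("normal hearing:", round(bits_normal, 2), "bits per frame")
print("impaired hearing:", round(bits_impaired, 2), "bits per frame")
```

Under these assumed values the impaired listener receives fewer bits from the same frame, both because the weaker high-frequency bands fall below the raised thresholds and because the stronger bands mask their neighbours over a wider range.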
However, PRI loss may be partially reversed through the use of digital signal processing (DSP) techniques that reduce masking within an audio signal, such as the multiband compressive systems commonly used in hearing aids. Moreover, these systems could be more accurately and efficiently parameterized according to how much perceptual information is actually transferred to the HI listener, which would be an improvement over the fitting techniques currently employed in sound augmentation/personalization algorithms.
Accordingly, it is the object of this invention to provide an improved listening experience on an audio device through better-parameterized DSP.