The pure tone audiogram is an individual representation of the minimum audible stimulus threshold. It represents the minimum sound intensity at a particular frequency that a person is able to detect. As such, the audiogram is easy to understand, and recognised as the global standard for diagnosing hearing loss.
Traditional sound personalization methods often rely on linear filtering techniques such as equalization (EQ), which apply compensatory frequency gain according to a user's hearing profile. For example, U.S. Pat. No. 9,748,914B2 discloses a method and apparatus for processing an audio signal, based on boosting or attenuating an input signal at one or more frequencies. Likewise, U.S. Pat. No. 9,680,438B2 describes a method for modifying audio signals in accordance with the hearing capabilities of an individual who is listening to audio signals played by a music player; that application, too, relies throughout on equalizing techniques. This form of intervention is only applicable to conductive hearing loss, a condition caused by poor energy transfer to the inner ear, specifically deficient conduction of sound energy anywhere along the route through the outer ear, tympanic membrane (eardrum), or middle ear (ossicles). This type of hearing loss is relatively rare and more readily treatable than sensorineural hearing loss, which originates in the inner ear. In addition, the human auditory system has been shown to be highly non-linear, and hearing impairment therefore cannot be modelled simply as a filter.
Non-linear amplification is a form of dynamics compression, present, e.g., in conventional hearing aids. Conventional hearing aids are designed for use in real-world situations where a wide dynamic range of sounds is relevant to the user, e.g. the user wants to make sense of sonic information such as a loud-voiced person speaking in front of them, while at the same time being able to detect the faint sound of a car approaching from a distance while walking down the street. For this reason, the primary function of a hearing aid is to employ wide dynamic range compression (WDC), where the faintest sounds are amplified considerably but high-intensity sounds are not. Audio content consumed on mobile devices has very different signal statistics from the sounds that someone will encounter in their daily life, and so a different processing strategy is required to provide the listener with a beneficial sound personalization experience.
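The contrast between linear gain and wide dynamic range compression can be illustrated with a minimal sketch of a static gain rule. The function names, the threshold, the compression ratio, and the gain values below are all hypothetical and chosen only for illustration; they are not taken from any hearing aid or from the cited references.

```python
def linear_gain_db(level_db, gain_db=30.0):
    """Linear (EQ-style) gain: the same gain is applied at every input level."""
    return gain_db


def wdc_gain_db(level_db, threshold_db=45.0, ratio=3.0, gain_below_db=30.0):
    """Static gain of a simple compressor (all parameter values are assumptions).

    Below the threshold the full gain is applied. Above it, the output level
    rises by only 1/ratio dB per dB of input, so the gain shrinks as the
    input gets louder.
    """
    if level_db <= threshold_db:
        return gain_below_db
    return gain_below_db - (level_db - threshold_db) * (1.0 - 1.0 / ratio)


# Compare output levels for a faint and a loud input (levels in dB).
for level in (30.0, 90.0):
    out_linear = level + linear_gain_db(level)
    out_wdc = level + wdc_gain_db(level)
    print(f"input {level:5.1f} dB -> linear {out_linear:5.1f} dB, WDC {out_wdc:5.1f} dB")
```

Under these assumed settings, the faint 30 dB input receives the full 30 dB of gain from the compressor, whereas the loud 90 dB input receives essentially none, so a 60 dB input range is mapped onto roughly 30 dB at the output. The linear rule shifts both levels by the same amount and leaves the range unchanged.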
The theoretical maximum dynamic range of 16-bit CD-quality audio is approximately 96 dB, designed to cover most of the perceptually relevant intensity range of healthy human hearing. However, this range is rarely achieved in reality due to inefficiencies in the digital-to-analogue conversion process. Trends in the techniques employed in sound recording, production, and distribution mean that, in actuality, almost all digital content consumed by the end user has significantly less dynamic range.
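The figure of approximately 96 dB follows from the ratio between the largest and smallest amplitude steps that 16 bits can represent, expressed in decibels. A minimal sketch of that arithmetic, with a hypothetical helper name and ignoring dither and noise shaping:

```python
import math


def pcm_dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM: 20 * log10(2 ** bits)."""
    return 20.0 * math.log10(2 ** bits)


print(round(pcm_dynamic_range_db(16), 1))  # 96.3
```

Each additional bit contributes roughly 6 dB to this theoretical limit.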
For example, orchestral music, while often cited for its relatively wide dynamic range, typically contains all sonic content within just 40 dB, while rock music typically falls within 20 dB across most of the frequency spectrum. Speech content consumed on mobile platforms, such as voice communications, podcasts, and radio, is similarly dynamic-range-compressed.
Kirchberger and Russell (2016) tested the impact of conventional hearing aid processing on the perceived quality of such audio content, and concluded that it had a negative effect on the quality of the experience as perceived by hearing-impaired listeners. The result is in line with expectations, because the signal statistics of the types of audio content likely to be consumed on mobile devices are so different from those of the sounds for which a conventional hearing aid is designed.
Given that an EQ is not suitable for the task of sound personalization based on the hearing profile of an individual, and given that conventional hearing aid processing provides no benefit to hearing-impaired listeners when consuming recorded audio content, there is a clear requirement for a novel, targeted class of audio processing. Accordingly, it is the object of the present invention to provide a better quality of experience to (hearing-impaired) users when consuming recorded audio content.