An audio signal can be described by its spectral balance or frequency response. When it is played on a playback device, the audio signal has an associated sound pressure level, or “SPL”. These two properties of an audio signal are logically independent: assuming a linear, time-invariant reproduction system, changing an audio signal's sound pressure level should not affect any objective measurement of the spectral balance of that signal.
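The independence of level and objectively measured spectral balance can be illustrated with a minimal sketch. It assumes an ideal broadband gain as the only level change, and the frame length, tone bins, amplitudes, and helper names (`bin_magnitude`, `balance_db`, `apply_gain_db`) are hypothetical choices made for illustration only.

```python
import cmath
import math

N = 64  # hypothetical frame length in samples


def bin_magnitude(samples, k):
    """Magnitude of DFT bin k of a real-valued frame."""
    n = len(samples)
    return abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                   for t, x in enumerate(samples)))


def balance_db(samples, low_bin, high_bin):
    """Level of the high-bin component relative to the low-bin one, in dB."""
    return 20 * math.log10(bin_magnitude(samples, high_bin) /
                           bin_magnitude(samples, low_bin))


def apply_gain_db(samples, gain_db):
    """Scale every sample by the same broadband gain (a linear, time-invariant change)."""
    scale = 10 ** (gain_db / 20)
    return [scale * x for x in samples]


# Test signal: a low tone (bin 4) plus a quieter high tone (bin 16).
signal = [math.sin(2 * math.pi * 4 * t / N)
          + 0.25 * math.sin(2 * math.pi * 16 * t / N)
          for t in range(N)]
quiet = apply_gain_db(signal, -20.0)  # same signal played 20 dB lower

loud_balance = balance_db(signal, 4, 16)
quiet_balance = balance_db(quiet, 4, 16)
print(round(loud_balance, 2), round(quiet_balance, 2))
```

Both measurements give the same ratio between the two components, since a broadband gain scales every frequency bin by the same factor. Any change in balance that a listener reports at the lower level is therefore perceptual rather than a property of the signal.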
However, from a subjective, psychoacoustic perspective, we observe that a change in sound pressure level yields significant changes in the perceived spectral balance of the signal. This is because the sensitivity of the human ear to differences in sound pressure level changes as a function of frequency. For example, when we lower the sound pressure level of an audio signal, the perceived loudness of low frequencies decreases at a much higher rate than that of mid-range frequencies.
This phenomenon may be described by equal loudness curves. FIG. 1 shows equal loudness curves defined by ISO standard 226 (2003). Loudness is measured in units of phons, where 1 phon is defined as 1 decibel (dB) of sound pressure level (SPL) at a frequency of 1000 Hz (1 kHz). Each curve in FIG. 1 represents the SPL required to provide a consistent loudness level across frequency, as would be perceived by an ‘average’ individual. FIG. 1 illustrates six such curves that model perceived loudness levels from the human hearing threshold up to 100 phons in 20-phon increments. Note that, in accordance with the definition of the phon, 20 phons of loudness require 20 dB of SPL at 1 kHz, 40 phons of loudness require 40 dB of SPL at 1 kHz, and so on.
Loudness perception can also vary between people due to environmental and physical attributes such as age-related hearing loss, also known as presbycusis. The increased attenuation with age for an ‘average’ person is shown in FIG. 2, which is adapted from data contained in ISO standard 7029 (2000). The baseline attenuation is the hearing of an average twenty-year-old individual, represented by a straight line at 0 dB attenuation. As can be seen from FIG. 2, an average thirty-year-old person has only slightly worse hearing than a twenty-year-old at frequencies above approximately 1800 Hz. By contrast, an average sixty-year-old person has markedly decreased hearing (over 20 dB hearing loss) for frequencies above 1000 Hz. Thus, presbycusis is especially problematic in the higher audible frequencies, and is highly age-dependent.
Often, a listener will attempt to counteract a perceived loss in balance in high and low frequencies by applying an equalization function (“EQ”) to their audio output. In the past, this EQ function was often applied using a graphic equalizer that boosted low and high frequencies, yielding the shape of a smile on octave-band-spaced sliders. While the “smiley-face” EQ does a good job of filling out the perceived spectrum at lower listening levels, it is generally applied independently of sound pressure level. Therefore, at higher sound pressure levels, the resulting equalized sound track can be perceived as being too bass-heavy at low frequencies and too shrill at higher frequencies.
Finally, audio that has been aggressively compressed using perceptual coding techniques for low bit rates (e.g., MP3) may be perceived to be less bright or muffled as a result of the encoding process. This is often because the higher frequencies have been filtered out to save bandwidth. Applying a high-frequency EQ will not help in this situation since the audio content is simply not present in the higher frequency bands.
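The limitation of high-frequency EQ on band-limited material can be sketched as follows. This is a simplified model under stated assumptions: the band centres, amplitudes, 11 kHz cutoff, and the helpers `band_limit` and `high_shelf_boost` are hypothetical and do not describe any particular codec or equalizer.

```python
# Hypothetical band centres (Hz) mapped to linear amplitudes; illustrative only.
original = {125: 1.0, 1000: 0.8, 8000: 0.5, 16000: 0.3}


def band_limit(bands, cutoff_hz):
    """Model an encoder that discards everything above cutoff_hz (assumed behaviour)."""
    return {f: (a if f <= cutoff_hz else 0.0) for f, a in bands.items()}


def high_shelf_boost(bands, corner_hz, gain_db):
    """Crude EQ model: scale every band at or above corner_hz by gain_db."""
    scale = 10 ** (gain_db / 20)
    return {f: (a * scale if f >= corner_hz else a) for f, a in bands.items()}


encoded = band_limit(original, cutoff_hz=11000)        # 16 kHz band is removed
boosted = high_shelf_boost(encoded, corner_hz=8000, gain_db=12.0)
print(boosted)
```

In this model the 8 kHz band is raised by the boost, but the 16 kHz band stays at zero: a gain applied to content that is no longer present cannot restore it.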
The above-mentioned problems relating to the perceived spectral balance of an audio signal played at lower levels can be summarized as follows:
The sensitivity of the human ear to differences in sound pressure level changes as a function of frequency, yielding a perceived spectral imbalance at lower listening levels.
Age-related hearing loss yields a perception of quieter high frequency content.
While application of a “smiley-face” EQ curve can help correct the perceived spectral balance at lower listening levels, it may also over-compensate at higher listening levels (when less compensation is required).
Lower bit-rate perceptual audio coding can yield the perception of muffled audio.
Applying any kind of high-frequency EQ may not be capable of brightening low-bit-rate encoded material.