Loudness control systems may be designed to generate an output audio signal with a uniform loudness level from an input audio signal with varying loudness levels. These systems may be used in applications such as audio broadcast chains and in audio playback devices where multiple content sources of varying loudness levels are available. An example goal of the loudness control system may be to automatically provide an output signal with a uniform average loudness level, eliminating the need for a listener to continually adjust the volume control of their playback device.
Related to loudness control systems are automatic gain control (AGC) and dynamic range control (DRC) systems. AGC systems were a precursor to modern loudness control systems and have a long history in communication and broadcast applications, where many early designs were implemented as analog circuits. AGC systems may operate by multiplying an input signal with a time-varying gain signal, where the gain signal is controlled such that an objective measure of the output signal is normalized to a predetermined target level. Objective measures such as root-mean-square (RMS), peak, amplitude, or energy may be used. One drawback of existing AGC designs is that the perceived loudness of the output signal may remain unpredictable. This is because perceived loudness is a subjective, psychoacoustic quantity that correlates only roughly with objective measures such as RMS, peak, amplitude, or energy levels. Thus, while an AGC may adequately control the RMS value of an output signal, it does not necessarily make the perceived loudness uniform.
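As an illustration of the AGC principle described above, the following is a minimal sketch of a block-based AGC that uses RMS as the objective measure. The function name `rms_agc`, the block size, the smoothing coefficient, and the gain cap are all assumptions chosen for illustration, not parameters of any particular existing design. Note that the sketch normalizes only the RMS value and makes no attempt to estimate perceived loudness.

```python
import math

def rms_agc(samples, target_rms=0.1, block_size=256, smoothing=0.9, max_gain=20.0):
    """Illustrative block-based AGC: drive each block's RMS toward target_rms."""
    gain = 1.0
    out = []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        # Objective measure of the input block (RMS).
        rms = math.sqrt(sum(x * x for x in block) / len(block))
        # Gain that would bring this block to the target, capped at max_gain.
        # For a silent block, hold the current gain instead of dividing by zero.
        desired = min(target_rms / rms, max_gain) if rms > 0.0 else gain
        # One-pole smoothing so the gain signal varies slowly over time.
        gain = smoothing * gain + (1.0 - smoothing) * desired
        # Multiply the input by the time-varying gain.
        out.extend(x * gain for x in block)
    return out

# Example: a sine wave with amplitude 0.5 and a period of 64 samples.
signal = [0.5 * math.sin(2.0 * math.pi * n / 64) for n in range(256 * 200)]
controlled = rms_agc(signal)
```

Once the smoothed gain has settled, the RMS of the output blocks approaches `target_rms`, regardless of how loud the signal is perceived to be.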
DRC systems are also related to loudness control systems, but with a slightly different goal. A DRC system assumes that the long-term average level of a signal is already normalized to an expected level and attempts to modify only the short-term dynamics. A DRC system may compress the dynamics so that loud events are attenuated and quiet events are amplified. This differs from the goal of a loudness control system to normalize the average loudness level of a signal while preserving the short-term signal dynamics.
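The compression behavior described above can be sketched as a static gain curve in the decibel domain. This is a simplified illustration under stated assumptions: the function name `drc_gain_db`, the reference level of -20 dB, and the 2:1 ratio are hypothetical, and a practical DRC would also apply attack and release smoothing to the gain.

```python
def drc_gain_db(level_db, reference_db=-20.0, ratio=2.0):
    """Illustrative static DRC curve: compress levels toward reference_db.

    Levels above the reference receive negative gain (attenuation) and
    levels below it receive positive gain (amplification).
    """
    return (reference_db - level_db) * (1.0 - 1.0 / ratio)

def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)

# Short-term levels of a loud event, a nominal event, and a quiet event.
for level in (-10.0, -20.0, -30.0):
    print(level, drc_gain_db(level), db_to_linear(drc_gain_db(level)))
```

With these assumed settings, an event 10 dB above the reference is attenuated by 5 dB and an event 10 dB below it is amplified by 5 dB, so the short-term dynamics are halved while the long-term average level is left where it was assumed to be.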
Modern loudness control systems attempt to improve upon AGC and DRC designs by incorporating knowledge from the fields of psychoacoustics and loudness perception. Loudness control systems may operate by estimating the perceived loudness of an input signal and controlling the time-varying gain such that the average loudness level of the output signal may be normalized to a predetermined target loudness level.
A problem with existing loudness control systems is that they make no distinction between desired content and unwanted noise, so all low-level audio content above a predetermined threshold is amplified. A common problematic signal for existing loudness control systems is speech with moderate background noise. If there is a long pause in the speech, the loudness control system may begin to amplify the background noise. The resulting reduction of the signal-to-noise ratio (SNR) may be objectionable to some listeners. It would be desirable for the loudness control system to avoid relative amplification of noise levels, thus preserving the SNR of the input signal.
Another challenging scenario for loudness control systems is maintaining a uniform average loudness level without adversely limiting intra-content short-term signal dynamics. A system that reacts quickly to loudness changes may consistently achieve a desired target level, but at the expense of reduced short-term signal dynamics. On the other hand, a system that reacts slowly to loudness changes may not effectively control the loudness level, or may exhibit noticeable artifacts such as ramping during large changes in the input signal loudness level. Large long-term loudness changes are most common during inter-content transitions, such as a program transition or a content source change. It would be desirable to address both inter- and intra-content fluctuations differently within a loudness control system such that intra-content short-term signal dynamics are preserved while large inter-content loudness transitions are quickly controlled.
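The trade-off between fast and slow reaction can be illustrated with a one-pole smoother applied to a sequence of loudness estimates. This is only a sketch under stated assumptions: the smoother, the function name `smooth_level`, the 100 Hz update rate, and the two time constants are hypothetical choices for illustration and are not taken from any particular loudness control system.

```python
import math

def smooth_level(levels_db, time_constant_s, update_rate_hz=100.0):
    """Illustrative one-pole smoother over loudness estimates given in dB."""
    alpha = math.exp(-1.0 / (time_constant_s * update_rate_hz))
    state = levels_db[0]
    out = []
    for level in levels_db:
        state = alpha * state + (1.0 - alpha) * level
        out.append(state)
    return out

# A 20 dB jump in input level after 1 second, as at a program transition.
step = [-30.0] * 100 + [-10.0] * 300
fast = smooth_level(step, time_constant_s=0.1)  # reacts quickly
slow = smooth_level(step, time_constant_s=3.0)  # reacts slowly

# Tracked level one second after the jump.
print(fast[199], slow[199])
```

One second after the jump, the fast tracker has essentially reached the new level, so a gain derived from it would also follow short-term dynamics within a program. The slow tracker is still more than 14 dB short of the new level, which corresponds to the audible ramping described above.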