Ducking is an audio effect commonly used when mixing different types of sounds from multiple sources. In ducking, the level of one audio signal is reduced by the presence of another signal: one track (the ducked track, or the slave) is made quieter so that another track (the ducking track, or the master) can be heard. This is typically achieved by lowering, or “ducking,” the volume of the secondary (slave) audio track when the primary (master) track starts, and raising the volume again when the primary track finishes. An example use of this effect is a voice-over in which a professional speaker reads the translation of original dialogue in a foreign language; ducking becomes active as soon as the translation starts. The ducking effect can also be applied in more sophisticated ways, in which a signal's volume is subtly lowered in response to another signal's presence.
Most audio systems perform audio ducking by attenuating the volume of the slave track without considering how loud the slave track already is. This is problematic when the slave is either too quiet or too loud in relation to the master. For a slave that is already too quiet, lowering its volume may make the slave nearly or completely inaudible. On the other hand, lowering the volume of a slave that is far too loud may not be enough to bring the slave's volume down to a desired level in relation to the master.
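The arithmetic behind this problem can be illustrated with a minimal sketch. The function names, the 12 dB ducking amount, and the example levels are all hypothetical values chosen for illustration; they are not taken from any particular audio system.

```python
def duck_fixed(slave_db: float, duck_db: float) -> float:
    """Apply a fixed attenuation to the slave, ignoring how loud it already is."""
    return slave_db - duck_db


def gap_below_master(master_db: float, slave_db: float, duck_db: float) -> float:
    """Return how many dB the ducked slave sits below the master.

    A negative result means the slave is still louder than the master.
    """
    return master_db - duck_fixed(slave_db, duck_db)


# Hypothetical levels: master at -20 dB, fixed 12 dB duck applied to every slave.
quiet_slave_gap = gap_below_master(-20.0, -40.0, 12.0)  # slave was already quiet
loud_slave_gap = gap_below_master(-20.0, -3.0, 12.0)    # slave was far too loud
```

With these assumed numbers, the quiet slave ends up 32 dB below the master and is likely to be nearly inaudible, while the loud slave remains 5 dB above the master even after ducking.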
A common way to perform machine- or software-assisted ducking is to use an audio compressor to attenuate the volume of the slaves. The user sets a threshold level above which the compressor starts attenuating the incoming signal. Once the incoming signal falls below that threshold, the attenuation is removed. How quickly the compressor reacts to a signal rising above the threshold, and how quickly it returns to the original signal's level once the signal falls below the threshold, are determined by “attack” and “release” controls. Adjusting these controls requires manual intervention that can be difficult to master and is highly signal dependent. For example, quick transients in the signal that briefly rise above the threshold might trigger the attenuation even though they are too brief to be perceived as loud by human ears. A user of such an audio system must therefore painstakingly tweak the audio ducking parameters in order to arrive at the desired loudness levels for the master and the slave.
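The threshold, attack, and release behavior described above can be sketched roughly as follows. This is a simplified illustration, not the design of any specific compressor: it assumes the master track serves as the control signal, compares each sample's instantaneous magnitude to the threshold (practical designs usually use a level detector), and uses one-pole smoothing for attack and release. All names and default values are hypothetical.

```python
import math


def smoothing_coeff(time_s: float, sample_rate: float) -> float:
    """One-pole smoothing coefficient for a given time constant in seconds."""
    return math.exp(-1.0 / (time_s * sample_rate))


def duck_gains(master, sample_rate, threshold=0.1, duck_db=-12.0,
               attack_s=0.01, release_s=0.2):
    """Return one linear gain per sample, to be multiplied into the slave track.

    While the master's sample magnitude exceeds the threshold, the gain moves
    toward duck_db at the attack rate; otherwise it recovers toward 0 dB at
    the release rate.
    """
    attack = smoothing_coeff(attack_s, sample_rate)
    release = smoothing_coeff(release_s, sample_rate)
    gain_db = 0.0
    gains = []
    for sample in master:
        target_db = duck_db if abs(sample) > threshold else 0.0
        # Gain is falling toward the ducked level: use attack; else release.
        coeff = attack if target_db < gain_db else release
        gain_db = coeff * gain_db + (1.0 - coeff) * target_db
        gains.append(10.0 ** (gain_db / 20.0))
    return gains


# Hypothetical control signal at 1 kHz: silence, a one-sample transient,
# silence again, then one second of sustained signal.
master = [0.0] * 100 + [1.0] + [0.0] * 100 + [1.0] * 1000
gains = duck_gains(master, sample_rate=1000)
```

In this sketch the single-sample transient at index 100 already pulls the gain below unity, which is the signal-dependent behavior that makes the attack and release settings difficult to tune by hand.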
What is needed is an audio system that intelligently ducks the loudness of each slave when performing audio ducking operations. Such a system should not be affected by short-term transients in audio signals, and should attenuate audio signals based on differences in loudness that are perceptible to human ears rather than on imperceptible changes in audio volume.
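As a rough illustration of why a windowed level measurement is less sensitive to transients than an instantaneous one, the sketch below compares the peak and the root-mean-square (RMS) level of a short window containing a single spike. RMS is used here only as a crude stand-in; it is not a perceptual loudness model, and the window contents and threshold are hypothetical.

```python
import math


def peak_level(window):
    """Largest absolute sample value in the window."""
    return max(abs(s) for s in window)


def rms_level(window):
    """Root-mean-square level of the window."""
    return math.sqrt(sum(s * s for s in window) / len(window))


# Hypothetical window: 999 quiet samples plus one full-scale transient.
window = [0.01] * 999 + [1.0]
threshold = 0.1

peak_triggers = peak_level(window) > threshold  # the spike alone crosses it
rms_triggers = rms_level(window) > threshold    # the averaged level does not
```

Under these assumptions the single spike crosses the threshold on a per-sample basis, while the averaged level of the same window stays well below it.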