The present disclosure relates to editing audio signals.
Audio signals including audio data can be provided by a multitude of audio sources. Examples include audio signals from an FM radio receiver, a compact disc drive playing an audio CD, a microphone, or audio circuitry of a personal computer (e.g., during playback of an audio file).
When audio signals are provided using microphones, one or more of the microphones are usually associated with particular audio signals, e.g., a musician playing an instrument in an orchestra or a person singing in a band. Additionally, the number of microphones used to capture particular audio signals can be high. In an orchestral or band setting, it is not uncommon to collect audio signals from thirty or more sources using microphones. For example, a drum set alone may require five or more microphones. Individual groups of instruments can have one or more microphones in common (e.g., in an orchestral setting). Additionally, single instruments are often exclusively associated with one or more microphones.
Audio sources, regardless of the way the audio signals are provided (i.e., whether providing signals using microphones or not), provide signals including audio data identifying different audio properties. Examples of audio properties include signal intensity, signal kind (e.g., stereo, mono), stereo width, and phase (or phase correlation, e.g., of a stereo signal).
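As an illustration only, two of the properties named above can be estimated from raw sample values. The sketch below assumes that signal intensity is measured as a root-mean-square (RMS) level and that phase correlation is the normalized cross-correlation of the left and right channels. Neither definition is taken from the present disclosure, and the function names `rms` and `phase_correlation` are hypothetical.

```python
import math


def rms(samples):
    """Assumed intensity measure: root-mean-square level of a block of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def phase_correlation(left, right):
    """Assumed phase measure: normalized correlation of two channels.

    Values near +1 indicate in-phase channels and values near -1 indicate
    inverted channels. Returns 0.0 if either channel is silent.
    """
    num = sum(l * r for l, r in zip(left, right))
    den = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return num / den if den else 0.0


# Hypothetical stereo block: the right channel is an inverted copy of the left.
left = [1.0, -1.0, 1.0, -1.0]
right = [-x for x in left]
print(rms(left), phase_correlation(left, right))
```

A real meter would typically operate on short, windowed blocks of a longer signal; the whole-block computation here is a simplification.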
The process of modifying the properties of multiple audio signals in relation to each other, in relation to other audio signals, or combining audio signals is referred to as mixing. A device for such a purpose is referred to as a mixer or an audio mixer. A particular state of the mixer denoting the relationship of multiple audio signals is typically referred to as a mix.
Masking is a psychoacoustic phenomenon where perception of one audio signal is reduced or prevented because of the presence of another audio signal. Masking can depend both on the intensity of the audio signals relative to each other and the frequencies of the audio signals relative to each other. Thus, an audio signal at a particular frequency and intensity can be masked by another audio signal at the same frequency but higher intensity. For example, a particular narration signal can be mixed with a background music signal. However, when the two signals are mixed, the background music can mask regions of the narration.
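The same-frequency, higher-intensity case described above can be sketched as a simple per-band comparison. This is a deliberately crude illustration, not a psychoacoustic model: it assumes each signal has already been reduced to one level in decibels per frequency band, and it flags a band as potentially masked when the masker's level exceeds the target's level by more than a margin. The function name `masked_bands`, the `margin_db` parameter, and the example levels are all hypothetical.

```python
def masked_bands(target_db, masker_db, margin_db=0.0):
    """Return the bands in which the masker is louder than the target.

    target_db and masker_db map a band's center frequency (Hz) to its
    level (dB). A band is flagged when the masker exceeds the target by
    more than margin_db. This is a simplifying assumption, not a model
    of human hearing.
    """
    return [
        band
        for band in target_db
        if band in masker_db and masker_db[band] - target_db[band] > margin_db
    ]


# Hypothetical per-band levels for a narration and a background music signal.
narration = {250: -18.0, 1000: -12.0, 4000: -30.0}
music = {250: -10.0, 1000: -20.0, 4000: -25.0}
print(masked_bands(narration, music))
```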
One technique for reducing masking is side-chain compression, also referred to as “ducking”. In side-chain compression, a primary audio signal is provided as a side-chain input to a compressor. If the intensity of the primary audio signal exceeds a specified threshold intensity, the compressor attenuates a secondary signal, typically by an amount proportional to the amount by which the threshold is exceeded, for as long as the primary signal exceeds the threshold. Side-chain compression is, therefore, generally based only on the overall intensity of the primary signal across all frequencies, without consideration of the audio properties of the secondary signal.
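A minimal sketch of this behavior follows. It assumes sample-by-sample level detection with no attack or release smoothing, levels expressed in decibels relative to full scale, and a gain reduction that grows linearly (in dB) with the amount by which the primary signal exceeds the threshold. The function name `duck` and the parameters `threshold_db` and `slope` are hypothetical and chosen only for illustration; practical compressors smooth the detected level over time.

```python
import math


def duck(primary, secondary, threshold_db=-20.0, slope=0.5):
    """Attenuate `secondary` while `primary` exceeds the threshold.

    The gain depends only on the broadband level of `primary`. The
    content of `secondary` is never inspected, which is the limitation
    noted in the text.
    """
    if len(primary) != len(secondary):
        raise ValueError("signals must have the same length")
    out = []
    for p, s in zip(primary, secondary):
        # Instantaneous level of the side-chain input in dB (floored to avoid log(0)).
        level_db = 20.0 * math.log10(max(abs(p), 1e-12))
        # Amount by which the threshold is exceeded; zero when below the threshold.
        excess_db = max(0.0, level_db - threshold_db)
        # Attenuation proportional to the excess (assumed proportionality constant).
        gain = 10.0 ** (-slope * excess_db / 20.0)
        out.append(s * gain)
    return out


# Hypothetical signals: narration is the primary, background music the secondary.
narration = [0.0, 0.05, 1.0]
music = [1.0, 1.0, 1.0]
print(duck(narration, music))
```

With the default settings, the music is left unchanged while the narration stays below the threshold of -20 dB and is attenuated by half of the excess (in dB) once the narration rises above it.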