Audio extraction problems are conventionally addressed using techniques such as single-microphone stationary noise suppression or multi-microphone setups that gradually remove undesired signals from a target signal. The extracted desired signal is then used for speech enhancement, speech recognition, audio transcription, and many other audio-based applications.
However, for these conventional methodologies to succeed, the sensors must be placed in a specific static arrangement and assumptions about input signal-to-noise ratios must hold. In many new use cases in the wearable, automotive, and other cutting-edge spaces, static sensor placement may not be possible because the sensors are distributed and/or their positions relative to the target sources are spatially uncertain. In such cases, removing undesired signals from a target signal is beyond the abilities of typical audio extraction techniques.