Research into signal processing systems, methods and algorithms for suppressing or removing noise components of a noise-infected target signal, such as a voice or speech signal, has been ongoing for decades. Important objectives of these efforts have been, and still are, to improve the perceived sound quality and/or speech intelligibility for the listener. In voice communication apparatuses and systems it is known to represent a noisy speech signal in a time-frequency domain, e.g. as multiple sub-band signals. In many cases it is desirable to apply frequency-dependent, complex-valued beamforming coefficients, which perform a linear combination in the complex domain of the sub-band signals derived from first and second digital audio signals, before a noise-reduced output signal is reconstructed as a real-valued time domain signal. This is carried out to attenuate the undesired noise signal components that may be present in the target signal. The frequency-dependent beamformer coefficient values are sometimes derived from an estimate of the time-frequency dependent ratio of target signal to noise signal.
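The linear combination described above can be illustrated with a minimal sketch for a single time frame. It is not taken from any particular system: the function name `apply_beamformer`, the two-microphone input layout, and the convention of conjugating the coefficients (output = conj(w1)·x1 + conj(w2)·x2 per sub-band) are assumptions made for illustration only. The sub-band analysis and the reconstruction of the real-valued time domain signal are omitted.

```python
from typing import List, Sequence


def apply_beamformer(
    x1: Sequence[complex],
    x2: Sequence[complex],
    w1: Sequence[complex],
    w2: Sequence[complex],
) -> List[complex]:
    """Combine two sets of sub-band samples with per-band complex coefficients.

    x1, x2: complex sub-band samples of one time frame, one value per band,
            derived from the first and second digital audio signals.
    w1, w2: complex beamforming coefficients, one value per band.
    The conjugation convention used here is an assumption of this sketch.
    """
    if not (len(x1) == len(x2) == len(w1) == len(w2)):
        raise ValueError("all inputs must have one value per sub-band")
    return [
        complex(a).conjugate() * p + complex(b).conjugate() * q
        for a, b, p, q in zip(w1, w2, x1, x2)
    ]


# Contrived three-band example: the noise reaches the two microphones with
# opposite sign, so equal real-valued coefficients cancel it.
target = [1 + 2j, -0.5 + 0.25j, 3 - 1j]
noise = [0.25 - 0.5j, 1 + 1j, -2 + 0.5j]
mic1 = [s + n for s, n in zip(target, noise)]
mic2 = [s - n for s, n in zip(target, noise)]
weights = [0.5, 0.5, 0.5]

output = apply_beamformer(mic1, mic2, weights, weights)
```

In this artificial case the output equals the target sub-band samples because the noise terms cancel. In practice the coefficients would differ per band and per frame, for example when they are derived from an estimated target-to-noise ratio.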
Voice activation is receiving more attention today than ever. Portable devices such as mobile telephones, smartphones, audio-enabled tablets, audio/video conferencing systems, hands-free systems, television sets, and more, have gained so much signal processing power that voice-activated convenience functions such as hands-free operation can be included in many of these devices. Voice activation and speaker identification systems may rely on the recognition of a target or trigger word, phrase or utterance in an incoming sound signal. Voice activation systems are generally demanding in terms of signal processing resources and accordingly also in terms of power or energy consumption. Devices that have relatively limited power or energy available, such as numerous types of portable battery-powered communication devices, therefore typically present a problem for the integration of suitable voice-activated control systems and methods.
The perceived quality and intelligibility of the noise-reduced output signals produced by a microphone array processing system are of great importance for the convenience of human users and for the accuracy and success of automatic speech recognition systems, automatic voice-activated systems, speaker verification or identification systems, etc.