In many late-model consumer electronics devices, such as desktop computers, laptop computers, smartphones, tablet computers, and intelligent personal assistant devices (e.g., intelligent loudspeakers), there are multiple sound pickup channels in the form of two or more acoustic microphones. The microphones produce mixed audio signals, in that the signals contain sounds from diverse sources in the acoustic environment. An extreme example is a group conference call in which there are two or more talkers in a room, along with background noise (e.g., air conditioner noise) and a media content playback device. The media content playback device has a loudspeaker that is producing, for example, the voice of a far-end talker during a telephony call, or music or dialog from a podcast, and this playback is also picked up by the microphones (which may be in the same housing as the media content playback device). In some instances, there may be several intelligent loudspeaker units in the same room, which may lead to further acoustic coupling because the playback from one loudspeaker unit is picked up by the microphones of another loudspeaker unit. There are thus a variety of acoustic environment conditions that disturb an otherwise clean speech signal picked up by a microphone (for example, the voice of a near-end talker in the room). This hinders real-time applications, such as voice trigger phrase detection, hands-free telephony, and automatic speech recognition, that may be performed upon the speech signal.