Conventional voice-enabled devices are limited to recognizing a single speaker. When multiple speakers are present, conventional devices and techniques recognize only the first speaker's voice and reject or ignore the second speaker's voice as an undesired signal.