Dialog systems are continually evolving to handle less constrained spoken input, interpret user intent, and engage in natural dialog to accomplish complex tasks. Addressee detection is used in spoken dialog systems to determine whether user speech is directed at the system. In single-user human-computer (H-C) contexts, the alternate addressee may be the user (self-talk) or others in the environment who are not interacting with the system. When multiple users interact jointly with a system (H-H-C dialog), addressee detection becomes even more challenging. Human-human (H-H) conversation about the shared task may contain the same keywords a system would listen for. When system-addressed utterances contain more than commands or keywords, their word sequences can begin to resemble those in H-H speech. Other cues, such as gaze, may also become less reliable, for example when users look at the system display while talking with each other.