Conventional voice interaction systems may interrupt a user during a conversation, which is considered inappropriate among humans. In other situations, voice interaction systems may speak (or provide a voice output) during a busy driving situation. Essentially, such voice interaction systems speak when it is not desirable, when they are not supposed to, or in a completely inappropriate tone. In the automotive setting, such “inconsiderate behavior” of the voice interaction system leads drivers to turn these systems off or to ignore the output prompts because they are not relevant to the context. Furthermore, such voice interaction systems appear robotic and artificial (or non-human), and are therefore not enjoyable to use. This is unfortunate, as voice interaction systems are created to mimic human-to-human interaction but are unsuccessful at it.
Conventional voice interaction systems may not adjust to the user's context. For example, a voice interaction system in a vehicle does not change its tone or withhold speech altogether when the driver is in a busy driving situation. In conventional voice interaction systems, the same voice (speech, rate, tone, etc.) is used across all voice interaction use cases; it is generally the same for unimportant and important audio output information and the same for timely and less timely information. Moreover, conventional voice interaction systems are not context aware; such systems therefore do not understand when to interject in a conversation that the driver may be having with a passenger, or when to avoid disturbing that conversation.