Current cognitive computing systems can include a virtual agent. The virtual agent can appear to a user on a display, for example, as a lifelike human image (e.g., a 3D animation of a human) and/or an avatar. Virtual agents can be shown on a display of a computing device (e.g., computer, tablet, smart phone, etc.) and can talk to the user via an audio output of the computing device. Typically, if the virtual agent outputs utterances to the user, the virtual agent can appear to the user as if it is talking (e.g., lips are moving, face is moving) and corresponding behavior (e.g., facial expressions, body gestures and/or voice fluctuations) can be displayed/output.
In current systems that include virtual agents displayed to a user, the behavior of the virtual agent is typically pre-assigned to the virtual agent's utterances. For example, for a virtual agent that utters “how may I help you”, a smiling behavior may be assigned to that utterance. The behavior is typically specified in accordance with standard Behavior Markup Language (BML).
One difficulty with assigning behaviors to utterances is that the same utterance can mean different things to a user depending upon the context of the conversation.
Another difficulty with assigning behavior to utterances is that the behavior may be inappropriate for the context of the conversation. For example, at the start of a dialogue the virtual agent may utter “how may I help you?” to a user with a smiling face. However, if during the dialogue the user expresses dissatisfaction, current virtual agents typically still utter “how may I help you?” with a smiling face, when a more serious, less smiling expression is likely more appropriate.
Another difficulty with current systems and methods for non-verbal output of a virtual agent is that if the virtual agent outputs an utterance to a user that does not have corresponding behavior specified, the virtual agent's behavior may not match the utterance.
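By way of illustration only, the pre-assignment approach described above can be sketched as a static lookup from utterance to behavior. The table contents, the function name `behavior_for`, the `user_is_dissatisfied` flag and the second test utterance are hypothetical and are not taken from any particular system; the sketch only shows why such a lookup can ignore conversational context and can return no behavior for an unlisted utterance.

```python
from typing import Optional

# Hypothetical, hand-authored table: each utterance has exactly one fixed behavior.
PREASSIGNED_BEHAVIOR = {
    "how may I help you?": "smile",
}


def behavior_for(utterance: str, user_is_dissatisfied: bool = False) -> Optional[str]:
    """Return the pre-assigned behavior for an utterance, or None if none exists.

    The user_is_dissatisfied flag is accepted only to make the limitation
    visible: a static lookup never consults the state of the conversation.
    """
    return PREASSIGNED_BEHAVIOR.get(utterance)


if __name__ == "__main__":
    # Same behavior at the start of the dialogue and after the user complains.
    print(behavior_for("how may I help you?"))
    print(behavior_for("how may I help you?", user_is_dissatisfied=True))
    # An utterance with no specified behavior yields nothing to display.
    print(behavior_for("I am sorry to hear that."))
```

In this sketch the smiling behavior is returned whether or not the user has expressed dissatisfaction, and an utterance absent from the table has no behavior at all.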
Another difficulty with current systems is that they typically determine the content of the user's utterance based solely on the words in the utterance. For example, a user may utter “everything is just perfect.” Analysis of the content of this statement based on the words alone can result in a conclusion that the user is happy. However, the user's tone and facial expression may indicate sarcasm, or the state of the conversation may show a failure in managing the conversation or a different emotional or satisfaction state for the user. In such cases, the user can actually mean the exact opposite of the words they are uttering. Thus, current methods for determining a meaning of the user's utterance can be flawed.
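The limitation of word-only analysis can be illustrated with a minimal sketch, assuming a simple keyword lexicon. The word lists and the function name `words_only_sentiment` are hypothetical; actual systems may use larger lexicons or statistical models, but any analysis restricted to the words shares the same blind spot.

```python
# Hypothetical word lists, chosen only for illustration.
POSITIVE_WORDS = {"perfect", "great", "good"}
NEGATIVE_WORDS = {"terrible", "bad", "broken"}


def words_only_sentiment(utterance: str) -> str:
    """Classify an utterance using only its words.

    Tone of voice, facial expression and conversation state are not inputs,
    so a sarcastic utterance is scored exactly like a sincere one.
    """
    words = [w.strip(".,!?").lower() for w in utterance.split()]
    score = sum(w in POSITIVE_WORDS for w in words) - sum(
        w in NEGATIVE_WORDS for w in words
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    # Scored as positive even if the user said it sarcastically.
    print(words_only_sentiment("everything is just perfect."))
```

Because the function receives only text, there is no way for it to distinguish a satisfied user from a sarcastic one uttering the same words.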
Therefore, it can be desirable for a virtual agent to have behavior that corresponds to a natural language conversation between a user and a virtual agent, and that can change based on content, emotion, mood and/or personality of a user and/or virtual agent during the dialogue.