1. Technical Field
Within the human-computer interaction community there is a growing consensus that traditional WIMP (windows, icons, menus, and pointer) interfaces need to become more flexible, adaptive, and human-oriented. Simultaneously, technologies such as speech recognition, text-to-speech, video input, and advances in computer graphics are providing increasingly rich tools to construct such user interfaces. These trends are driving growing interest in agent- or character-based user interfaces exhibiting quasi-human appearance and behavior.
2. Background Discussion
One aspect of developing such a capability is the ability of the system to recognize the emotional state and personality of the user and respond appropriately. Research has shown that users respond emotionally to their computers. Emotion and personality are of interest to us primarily because of the ways in which they influence behavior, and precisely because those behaviors are communicative: in human dialogues they establish a channel of social interaction that is crucial to the smoothness and effectiveness of the conversation. In order to be an effective communicant, a computer character needs to respond appropriately to these signals from the user and should produce its own emotional signals that reinforce, rather than confuse, its intended communication.
There are two crucial issues on the path to what has been termed "affective computing":
(1) providing a mechanism to infer the likely emotional state and personality of the user, and
(2) providing a mechanism to generate behavior in an agent (e.g., speech and gesture) consistent with a desired personality and emotional state.
A Command and Control Agent:
Imagine a diagnostic session where a user is having trouble printing and an automated, speech-enabled agent is providing assistance. The agent asks a few informational questions and then makes a suggestion: "Please try the following. Go to your printer and make sure all cables are plugged in properly and the printer is turned on and is online." The user checks this and returns, replying "No dice, it still doesn't print." Due to the failure of the speech recognition system to recognize "dice", the agent responds "I'm sorry, I did not understand. Please repeat yourself." The user responds, in a somewhat faster and louder tone, "I said it didn't work! What should I try next?" The agent, noting the speed, volume, intonation, and wording of the utterance, now assigns an increased probability to the user being upset, and holds a slightly increased belief that the person has a dominant personality. In response, the agent could decide either to be extremely submissive and apologetic for its failings so far, or to respond in kind in a terse, confident fashion. The agent chooses the second path. "OK, I'm doing the best I can. Try switching the printer off and back on, and try printing again," it replies, in a somewhat less courteous manner than the previous suggestion.
This dialogue is an example of a command and control interface, in that at each stage there are relatively few alternatives that the agent (or speech recognizer) needs to consider. In the scenario we are considering, at any point the agent need only consider responses to the previous question, as well as a few generic responses (e.g. quit). As we will see, the recognition and generation of alternative phrasings for these speech acts will provide the basis for an affective infrastructure for the agent.
A goal of the present invention is an architecture which is appropriate for a broad range of tasks that are amenable to such command and control interfaces. Such an architecture would not attempt to manipulate the probabilistic characteristics of the language model used by the speech recognition engine, but rather would interpret the various possible rephrasings of a fixed set of alternatives in terms of emotion and personality.
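The idea of interpreting rephrasings of a fixed set of alternatives can be illustrated with a minimal sketch. It assumes a single speech act ("the fix did not work") with two recognizable paraphrases and a two-state emotion variable; every name and probability below is hypothetical and is not taken from this specification. Both paraphrases select the same alternative, so only the belief about the user's emotional state changes.

```python
# Hypothetical numbers throughout: none of these probabilities come from the source.

# Prior belief over the user's emotional state.
PRIOR = {"calm": 0.8, "upset": 0.2}

# P(paraphrase | emotion) for one fixed speech act ("the fix did not work").
# Each row sums to 1 over the paraphrases the recognizer listens for.
LIKELIHOOD = {
    "calm":  {"no, it still doesn't print": 0.7, "i said it didn't work": 0.3},
    "upset": {"no, it still doesn't print": 0.2, "i said it didn't work": 0.8},
}


def update_belief(prior, likelihood, paraphrase):
    """Return P(emotion | paraphrase) by Bayes' rule."""
    unnormalized = {e: prior[e] * likelihood[e][paraphrase] for e in prior}
    total = sum(unnormalized.values())
    return {e: p / total for e, p in unnormalized.items()}


posterior = update_belief(PRIOR, LIKELIHOOD, "i said it didn't work")
print(posterior)
```

With these assumed numbers, hearing the terser paraphrase raises the belief that the user is upset above its prior value, while the selected alternative itself is unchanged.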
Bayesian Networks Employed in Carrying Out the Invention:
The advent of artificial intelligence within computer science has brought an abundance of decision-support systems. Decision-support systems are computer systems in which decisions, typically rendered by humans, are recommended and sometimes made. In creating decision-support systems, computer scientists seek to provide decisions with the greatest possible accuracy. Thus, computer scientists strive to create decision-support systems that are equivalent to or more accurate than a human expert. Applications of decision-support systems include medical diagnosis, troubleshooting computer networks, or other systems wherein a decision is based upon identifiable criteria.
One of the most promising new areas for research in decision-support systems is Bayesian networks. A Bayesian network is a representation of the probabilistic relationships among distinctions about the world. Each distinction, sometimes called a variable, can take on one of a mutually exclusive and exhaustive set of possible states. A Bayesian network is expressed as a directed acyclic graph where the variables correspond to nodes and the relationships between the nodes correspond to arcs. A simple example of a Bayesian network can have three variables, $X_1$, $X_2$, and $X_3$, which are represented by three respective nodes with arcs connecting the nodes to reflect the various causal relationships. Associated with each variable in a Bayesian network is a set of probability distributions. Using conditional probability notation, the set of probability distributions for a variable can be denoted by $p(x_i \mid \Pi_i, \zeta)$, where "$p$" refers to the probability distribution, "$\Pi_i$" denotes the parents of variable $X_i$, and "$\zeta$" denotes the knowledge of the expert. The Greek letter "$\zeta$" indicates that the Bayesian network reflects the knowledge of an expert in a given field. Thus, this expression reads as follows: the probability distribution for variable $X_i$ given the parents of $X_i$ and the knowledge of the expert. In this example, $X_1$ is the parent of $X_2$. The probability distributions specify the strength of the relationships between variables. For instance, if $X_1$ has two states (true and false), then associated with $X_1$ is a single probability distribution $p(x_1 \mid \zeta)$ and associated with $X_2$ are two probability distributions, $p(x_2 \mid x_1 = t, \zeta)$ and $p(x_2 \mid x_1 = f, \zeta)$.
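The per-variable distributions described above can be sketched for the two-node fragment in which X1 is the parent of X2. This is a minimal illustration only: the probability values are hypothetical, the expert knowledge is left implicit, and the third variable is omitted because its arcs are not stated in the text.

```python
# Hypothetical probabilities for the two-node fragment X1 -> X2.
P_X1 = {True: 0.3, False: 0.7}          # single distribution p(x1)
P_X2_GIVEN_X1 = {                       # one distribution per state of the parent
    True:  {True: 0.9, False: 0.1},     # p(x2 | x1 = t)
    False: {True: 0.2, False: 0.8},     # p(x2 | x1 = f)
}


def joint(x1, x2):
    """p(x1, x2) = p(x1) * p(x2 | x1), the factorization the graph implies."""
    return P_X1[x1] * P_X2_GIVEN_X1[x1][x2]


def marginal_x2(x2):
    """p(x2), obtained by summing the joint over the parent's states."""
    return sum(joint(x1, x2) for x1 in (True, False))


def posterior_x1(x1, x2):
    """p(x1 | x2) by Bayes' rule."""
    return joint(x1, x2) / marginal_x2(x2)


print(marginal_x2(True))
print(posterior_x1(True, True))
```

Storing one row per parent state mirrors the statement that X2 carries two distributions when its parent X1 has two states.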
The arcs in a Bayesian network convey dependence between nodes. When an arc points from one node to another, the probability distribution of the node at the head of the arc depends upon the value of the node at its tail. In this case, the nodes are said to be conditionally dependent. Missing arcs in a Bayesian network convey conditional independencies. For example, two nodes may be conditionally independent given another node. However, two variables indirectly connected through intermediate variables are dependent when the values ("states") of the intermediate variables are not known. Therefore, in such a case, the two nodes are dependent while the value of the other node is unknown and become conditionally independent once it is known.
In other words, sets of variables X and Y are said to be conditionally independent, given a set of variables Z, if the probability distribution for X given Z does not depend on Y. If Z is empty, however, X and Y are said to be "independent" as opposed to conditionally independent. If X and Y are not conditionally independent, given Z, then X and Y are said to be conditionally dependent given Z.
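Written symbolically, the definition in the preceding paragraph can be stated as below. The $\perp$ shorthand is a common convention and an addition here, not notation used in this specification; the condition on $p(Y, Z)$ is the usual assumption that the conditioning event is possible, and `\text` assumes the amsmath package.

```latex
X \perp Y \mid Z
\quad\Longleftrightarrow\quad
p(X \mid Y, Z) = p(X \mid Z)
\qquad \text{whenever } p(Y, Z) > 0,
\qquad\text{and, for empty } Z,\qquad
X \perp Y
\quad\Longleftrightarrow\quad
p(X \mid Y) = p(X).
```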
The variables used for each node may be of different types. Specifically, variables may be of two types: discrete or continuous. A discrete variable is a variable that has a finite or countable number of states, whereas a continuous variable is a variable that has an uncountably infinite number of states. All discrete variables considered in this specification have a finite number of states. An example of a discrete variable is a Boolean variable. Such a variable can assume only one of two states: "true" or "false." An example of a continuous variable is a variable that may assume any real value between -1 and 1. Discrete variables have an associated probability distribution. Continuous variables, however, have an associated probability density function ("density"). Where an event is a set of possible outcomes, the density $p(x)$ for a variable "$x$" and events "$a$" and "$b$" is defined as:
$$p(x) = \lim_{a \to b} \frac{p(a \leq x \leq b)}{|a - b|}$$
where $p(a \leq x \leq b)$ is the probability that $x$ lies between $a$ and $b$. Conventional systems for generating Bayesian networks cannot use continuous variables in their nodes.
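The relationship between interval probabilities and a density can be checked numerically. The sketch below assumes a hypothetical continuous variable on [-1, 1] with density 0.75 * (1 - x^2); this distribution is chosen only for illustration and does not appear in the specification. Dividing the probability of a shrinking interval around a point by the interval's width approaches the density at that point.

```python
def cdf(x):
    """Cumulative distribution for the assumed density 0.75 * (1 - x**2) on [-1, 1]."""
    return 0.75 * (x - x**3 / 3) + 0.5


def interval_probability(a, b):
    """p(a <= x <= b) for a <= b, both inside [-1, 1]."""
    return cdf(b) - cdf(a)


def density_estimate(x, half_width):
    """Interval probability divided by interval width, centred on x.

    The interval [x - half_width, x + half_width] must stay inside [-1, 1].
    """
    a, b = x - half_width, x + half_width
    return interval_probability(a, b) / abs(b - a)


# The ratio approaches the assumed density at x = 0 as the interval shrinks.
for h in (0.5, 0.1, 0.001):
    print(h, density_estimate(0.0, h))
```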
A Bayesian network could be constructed for troubleshooting automobile problems. Such a Bayesian network would contain many variables or nodes relating to whether an automobile will work properly, and arcs connecting the causally related nodes. A few examples of the relationships between the variables follow. For the radio to work properly, there must be battery power. Battery power, in turn, depends upon the battery working properly and a charge. The battery working properly depends upon the battery age. The charge of the battery depends upon the alternator working properly and the fan belt being intact. The battery age variable, whose values lie from zero to infinity, is an example of a continuous variable that can take on an infinite number of values. However, the battery variable reflecting the correct operation of the battery is a discrete variable, being either true or false.
Such an automobile troubleshooting Bayesian network also provides a number of examples of conditional independence and conditional dependence. The nodes "operation of the lights" and "battery power" are dependent, and the nodes "operation of the lights" and "operation of the radio" are conditionally independent given battery power. However, when the battery power is not known, the operation of the radio and the operation of the lights are dependent. The concepts of conditional dependence and conditional independence can be expressed using conditional probability notation. For example, the operation of the lights is conditionally dependent on battery power and conditionally independent of the radio given the battery power. Therefore, the probability of the lights working properly given both the battery power and the radio is equivalent to the probability of the lights working properly given the battery power alone: $p(\text{Lights} \mid \text{Battery Power}, \text{Radio}) = p(\text{Lights} \mid \text{Battery Power})$. An example of a conditional dependence relationship is that the probability of the lights working properly given the battery power is not equivalent to the probability of the lights working properly given no information. That is, $p(\text{Lights} \mid \text{Battery Power}) \neq p(\text{Lights})$.
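These independence statements can be verified on a small numerical model. The sketch below assumes three Boolean nodes with arcs from battery power to the lights and from battery power to the radio; all probabilities are hypothetical, and the helper names `prob` and `cond` are introduced only for this illustration.

```python
from itertools import product

# Hypothetical numbers: Battery Power -> Lights, Battery Power -> Radio.
P_B = {True: 0.9, False: 0.1}
P_L_GIVEN_B = {True: 0.95, False: 0.0}   # P(Lights = True | Battery Power)
P_R_GIVEN_B = {True: 0.90, False: 0.0}   # P(Radio = True | Battery Power)
NAMES = ("battery", "lights", "radio")


def bernoulli(p_true, value):
    return p_true if value else 1.0 - p_true


# Joint distribution over (battery, lights, radio) from the network factorization.
JOINT = {
    (b, l, r): P_B[b] * bernoulli(P_L_GIVEN_B[b], l) * bernoulli(P_R_GIVEN_B[b], r)
    for b, l, r in product((True, False), repeat=3)
}


def prob(**fixed):
    """Sum the joint over every assignment consistent with the fixed values."""
    return sum(p for assignment, p in JOINT.items()
               if all(assignment[NAMES.index(k)] == v for k, v in fixed.items()))


def cond(target, **given):
    """P(target | given); target is a one-item dict such as {'lights': True}."""
    return prob(**target, **given) / prob(**given)


def lights_independent_of_radio_given_battery(tol=1e-12):
    for b, r in product((True, False), repeat=2):
        if prob(battery=b, radio=r) == 0.0:
            continue  # conditioning event is impossible; nothing to compare
        lhs = cond({"lights": True}, battery=b, radio=r)
        rhs = cond({"lights": True}, battery=b)
        if abs(lhs - rhs) > tol:
            return False
    return True


print(lights_independent_of_radio_given_battery())
print(prob(lights=True), cond({"lights": True}, radio=True))
```

In this assumed model the first check holds, while the probability of the lights working changes once the radio is observed without knowledge of the battery power, which is the dependence described above.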
There are two conventional approaches for constructing Bayesian networks. Using the first approach ("the knowledge-based approach"), a person known as a knowledge engineer interviews an expert in a given field to elicit the expert's knowledge of that field. The knowledge engineer and expert first determine the distinctions of the world that are important for decision making in the field of the expert. These distinctions correspond to the variables of the domain of the Bayesian network. The "domain" of a Bayesian network is the set of all variables in the Bayesian network. The knowledge engineer and the expert next determine the dependencies among the variables (the arcs) and the probability distributions that quantify the strengths of the dependencies.
In the second approach ("the data-based approach"), the knowledge engineer and the expert first determine the variables of the domain. Next, data is accumulated for those variables, and an algorithm is applied that creates a Bayesian network from this data. The accumulated data comes from real-world instances of the domain, that is, real-world instances of actions and observations in a given field. The current invention can utilize Bayesian networks constructed by either or both of these approaches.
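One small piece of the data-based approach can be sketched as follows. The example assumes the structure (battery to lights) has already been fixed and estimates only the conditional probabilities from counted cases with add-alpha smoothing; the data, the smoothing choice, and the function name are hypothetical, and the specification does not say which learning algorithm is used.

```python
from collections import Counter

# Hypothetical observations: (battery_ok, lights_ok) pairs from real-world cases.
CASES = [
    (True, True), (True, True), (True, True), (True, False),
    (False, False), (False, False),
]


def estimate_cpt(cases, alpha=1.0):
    """Estimate P(lights_ok | battery_ok) from counts with add-alpha smoothing.

    The structure (battery -> lights) is fixed in advance; only the
    probabilities are learned here.
    """
    pair_counts = Counter(cases)
    parent_counts = Counter(parent for parent, _ in cases)
    cpt = {}
    for parent in (True, False):
        denom = parent_counts[parent] + 2 * alpha   # two child states
        cpt[parent] = {
            child: (pair_counts[(parent, child)] + alpha) / denom
            for child in (True, False)
        }
    return cpt


cpt = estimate_cpt(CASES)
print(cpt)
```

Smoothing keeps an unobserved combination, such as working lights with a failed battery, from being assigned a probability of exactly zero on so few cases.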
After the Bayesian network has been created, the Bayesian network becomes the engine for a decision-support system. The Bayesian network is converted into a computer-readable form, such as a file, and input into a computer system. Then, the computer system uses the Bayesian network to determine the probabilities of variable states given observations, determine the benefits of performing tests, and ultimately recommend or render a decision. Consider an example where a decision-support system uses the automobile troubleshooting Bayesian network of the foregoing example to troubleshoot automobile problems. If the engine of an automobile does not start, the decision-support system can calculate the probabilities of all states for all variables in the network. Furthermore, it can request an observation of whether there is gas, whether the fuel pump is in working order (possibly by performing a test), whether the fuel line is obstructed, whether the distributor is working, and whether the spark plugs are working. While the observations and tests are being performed, the Bayesian network assists in determining which variable should be observed next, by identifying the variable that will do the most to reduce the uncertainty (modulo cost) regarding the variables of concern.
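Choosing the next variable to observe can be sketched as an expected-information calculation. The example assumes a single fault variable with three candidate causes and three Boolean observations; the probabilities, costs, and names are hypothetical. Dividing expected entropy reduction by cost is only one possible reading of "modulo cost", which the text does not define.

```python
import math

# Hypothetical model: one fault variable and three Boolean observations.
PRIOR = {"no_gas": 0.5, "fuel_pump": 0.3, "spark_plugs": 0.2}

# P(observation reports a problem | fault), plus a cost per observation.
TESTS = {
    "check_gas_gauge": {"cost": 1.0,
        "p_problem": {"no_gas": 0.95, "fuel_pump": 0.05, "spark_plugs": 0.05}},
    "test_fuel_pump": {"cost": 5.0,
        "p_problem": {"no_gas": 0.10, "fuel_pump": 0.90, "spark_plugs": 0.05}},
    "inspect_plugs": {"cost": 3.0,
        "p_problem": {"no_gas": 0.05, "fuel_pump": 0.05, "spark_plugs": 0.90}},
}


def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def posterior(prior, p_problem, outcome):
    """Return (P(fault | outcome), P(outcome)) by Bayes' rule."""
    like = {f: (p_problem[f] if outcome else 1 - p_problem[f]) for f in prior}
    p_outcome = sum(prior[f] * like[f] for f in prior)
    return {f: prior[f] * like[f] / p_outcome for f in prior}, p_outcome


def expected_gain(prior, p_problem):
    """Expected reduction in entropy of the fault variable from one observation."""
    expected_posterior_entropy = 0.0
    for outcome in (True, False):
        post, p_outcome = posterior(prior, p_problem, outcome)
        expected_posterior_entropy += p_outcome * entropy(post)
    return entropy(prior) - expected_posterior_entropy


def next_observation(prior, tests):
    """Pick the observation with the largest expected gain per unit cost."""
    return max(tests, key=lambda name:
               expected_gain(prior, tests[name]["p_problem"]) / tests[name]["cost"])


print(next_observation(PRIOR, TESTS))
```

After an observation is made, the returned posterior would replace the prior and the selection would be repeated, matching the loop of observing and re-evaluating described above.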
Such Bayesian networks are examples of the broader class of stochastic models, characterized by using probabilities to link various causal relationships, with which the present invention may be carried out.