1. Field of Invention
The present invention relates generally to interactive interfaces and, more particularly, to a dynamic interactive voice interface.
2. Related Art
Voice communication devices, such as telephones, have traditionally been used for voice communication alone or for accessing information through touch-tone dialing. With advances in communications technology, various types of information can now be accessed using voice recognition systems that translate spoken utterances into system commands for data retrieval. Voice recognition systems typically include interactive voice interfaces.
An interactive voice interface (also referred to as voice user interface or VUI) is a type of interface that is voice driven. Using particular voice commands, a person can interact with the voice interface in order to browse the content of a web site or access information stored in a database, for example. A VUI provides the communication means between a user and a voice recognition system. A voice recognition system recognizes a user utterance or user request and attempts to service the request.
Many VUIs are implemented to provide the user with prompts or interactive voice menus that assist the user in communicating with the voice recognition system. In interacting with users, many current VUIs are rigid, monotonous, repetitious, and basically inhuman. Inasmuch as spoken discourse is a dynamic process, current VUIs fail to capture the essence of natural conversation. Developing personified, natural language VUIs is an art and a science in itself.
Studies have shown that user interaction with technology is fundamentally social. Thus, social rules should desirably be applied to computer voices. Designers of current VUIs have not fully addressed the social issues surrounding human-computer interaction. Therefore, current VUIs lack the artistic touches that go along with voice acting, voice directing, and audio engineering, all of which should be considered while developing and implementing a VUI. For example, current VUIs do not have a well-defined human personality that can interact with a user in a natural conversational style and adapt to the user's needs and environment.
Furthermore, spoken discourse is a collaborative process that changes as the conversation unfolds, based on the shared knowledge of the participants. Unfortunately, current VUIs are not implemented to remember past interactions with the user and to modify their behavior accordingly, as would be expected in natural spoken language. For example, the conversational style between two people typically becomes less formal as they become more familiar with each other during the conversation. Current VUIs, however, fail to adapt their conversational style in a natural way. A VUI typically continues to repeat the same prompts over and over again, regardless of the number of times a particular user has interacted with the system. This can be impersonal, unhelpful, and irritating.
People interact more positively with a person who communicates in a way that does not offend others. The same behavior is expected, on both a conscious and a subconscious level, of voices associated with computer applications. Marketing research has shown that more user-friendly interactive systems lead to greater buying intentions and higher quality reviews. Thus, it is desirable to have a voice user interface system that can incorporate human personality and provide intelligent responses that help a user access needed information. Further, it is desirable for a VUI to develop a more human conversational style and to adapt to changes in a user's speech and experience over time.
The following references provide more detailed information on the topic of human-computer interaction and computer-generated speech:
1. H. H. Clark, Arenas of Language Use (1992).
2. L. Karttunen & S. Peters, “Conventional Implications of Montague Grammar,” Berkeley Linguistic Society, 1, 266-278 (1975).
3. D. K. Lewis, Convention: A Philosophical Study (1969).
4. C. Nass & K. M. Lee, “Does computer-generated speech manifest personality? An experimental test of similarity-attraction and consistency,” Journal of Experimental Psychology: Applied (in press).
5. C. Nass et al., “Are Respondents Polite to Computers? Social Desirability and Direct Responses to Computers,” Journal of Applied Social Psychology, 29(5), 1093-1110 (1999).
6. B. Reeves & C. Nass, The Media Equation (1996).
7. S. Schiffer, Meaning (1972).
8. R. C. Stalnaker, “Assertion,” in P. Cole (ed.), Syntax and Semantics, vol. 9, Pragmatics, 315-332 (1978).