The present invention relates generally to systems and methods for synchronizing interactions between multi-modal user interfaces (UI) and applications and, more particularly, to systems and methods for managing information exchanges between mono-mode applications having different modalities and between different modes of a multi-modal application.
The computing world is presently evolving towards an era where billions of interconnected pervasive clients communicate with powerful information servers. Indeed, the coming millennium will be characterized by the availability of multiple information devices that make ubiquitous information access an accepted fact of life. The evolution of the computer world towards billions of pervasive devices interconnected via the Internet, wireless networks or spontaneous networks (such as Bluetooth and Jini) will revolutionize the principles underlying man-machine interaction. This evolution will mean that soon, personal information devices will offer ubiquitous access, bringing with them the ability to create, manipulate and exchange any information anywhere and anytime using interaction modalities most suited to the user's current needs and abilities. Such devices will include familiar access devices such as conventional telephones, cell phones, smart phones, pocket organizers, PDAs and PCs, which vary widely in the interface peripherals they use to communicate with the user. At the same time, as this evolution progresses, users will demand a consistent look, sound and feel in the user experience provided by these various information devices.
The increasing availability of information, along with the rise in the computational power available to each user to manipulate this information, brings with it a concomitant need to increase the bandwidth of man-machine communication. Users will come to demand multi-modal interaction in order to maximize their interaction with information devices in hands-free, eyes-free environments. In addition, the availability of a plethora of information devices will encourage multiple parallel interactions with electronic information akin to what users expect today in the world of traditional human-intermediated information interchange. Realizing these goals will require fundamental changes in the user interface; without such changes, users will be unable to access, act on, and transform information independently of the access device.
Information being manipulated via such devices might be located on the local device or accessible from a remote server via the network using open, interoperable protocols and standards. Usage of such open standards also leads to a seamless integration across multiple networks and multiple information sources such as an individual's personal information, corporate information available on private networks, and public information accessible via the global Internet. This availability of a unified information source will define productivity applications and tools of the future. Indeed, users will increasingly interact with electronic information, as opposed to interacting with platform-specific software applications as is currently done in the world of the desktop PC.
Information-centric computing carried out over a plethora of multi-modal information devices will be essentially conversational in nature and will foster an explosion of conversational devices and applications. This trend towards pervasive computing goes hand-in-hand with the miniaturization of the devices and the dramatic increases in their capabilities.
With the pervasiveness of computing causing information appliances to merge into the user's environment, the user's mental model of these devices is likely to undergo a drastic shift. Today, users regard computing as an activity that is performed at a single device like the PC. As information appliances abound, user interaction with these multiple devices needs to be grounded on a different set of abstractions. The most intuitive and effective user model for such interaction will be based on what users are already familiar with in today's world of human-intermediated information interchange, where information transactions are modeled as a conversation amongst the various participants. It is to be noted that the term “conversation” is used to mean more than speech interaction. Indeed, the term “conversation” is used to encompass all forms of information interchange, where such interchange is typically embodied by one participant posing a request that is fulfilled by one or more participants in the conversational interchange.
Because such conversational interactions will include devices with varying I/O capabilities, ranging from the ubiquitous telephone characterized by speech-only access to personal organizers with limited visual displays, traditional GUI-based desktop PC clients will be at a significant disadvantage; the user interface presented by such software maps poorly if at all to the more varied and constrained interaction environments presented by information appliances. Moreover, pervasive clients are more often deployed in mobile environments where hands-free or eyes-free interactions are desirable. Accordingly, conversational computing will become indispensable in the near future. Conversational computing is inherently multi-modal and often expected to be distributed over a network.
Thus, conversational computing also defines an inflection point in personal information processing and is likely to lead to a revolution in all aspects of computing more significant than what was observed in the transition from mainframe-based computing to graphical workstations in the mid-1980s.
The ability to access information via a multiplicity of appliances, each designed to suit the user's specific needs and abilities at any given time, necessarily means that these interactions will exploit all available input and output modalities to maximize the bandwidth of man-machine communication.
Accordingly, a system and method that provides coordinated, synchronized, multi-modal user interaction for user interfaces that work across this multiplicity of information appliances is highly desirable. Indeed, such a system and method should allow a user to interact in parallel with the same information via a multiplicity of appliances and user interfaces, with a unified, synchronized view of information across the various appliances that the user deploys to interact with information.