1. Technical Field
The present disclosure relates to mobile search and more specifically to multimodal interfaces for managing N-best lists of recognition results.
2. Introduction
While numerous research prototypes have been built over the years, building multimodal interfaces remains a complex and highly specialized task. Typically these systems involve a graphical user interface working in concert with a variety of input and output processing components, such as speech recognition, gesture recognition, natural language understanding, multimodal presentation planning, dialog management, and multimodal integration or fusion. A significant source of complexity in authoring these systems is that communication among components is not standardized and often relies on ad hoc or proprietary protocols. This makes it difficult or impossible to combine components from different vendors or research sites in a plug-and-play fashion, and limits the ability of authors to rapidly assemble components to prototype multimodal systems.
The W3C EMMA (Extensible MultiModal Annotation) standard is one approach to addressing this problem: it provides a standardized XML representation language for encapsulating and annotating inputs to spoken and multimodal interactive systems. Certain applications on mobile computing platforms could benefit from such standardized representations of spoken and multimodal inputs.
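For illustration, EMMA represents an N-best list of competing recognition hypotheses with the emma:one-of container, each candidate carried in an emma:interpretation element annotated with attributes such as emma:confidence and emma:tokens. The sketch below is illustrative only; the application payload element and the token strings are hypothetical, not taken from the disclosure:

```xml
<emma:emma version="1.0"
           xmlns:emma="http://www.w3.org/2003/04/emma">
  <!-- N-best list of speech recognition results, best hypothesis first -->
  <emma:one-of id="nbest1" emma:medium="acoustic" emma:mode="voice">
    <emma:interpretation id="int1"
                         emma:confidence="0.75"
                         emma:tokens="pizza restaurants near chelsea">
      <!-- hypothetical application-specific payload -->
      <query>pizza restaurants near chelsea</query>
    </emma:interpretation>
    <emma:interpretation id="int2"
                         emma:confidence="0.60"
                         emma:tokens="pizza restaurants near shelby">
      <query>pizza restaurants near shelby</query>
    </emma:interpretation>
  </emma:one-of>
</emma:emma>
```

Because every component consumes and produces the same annotated container, a graphical interface can render the candidate interpretations for user selection without any component-specific protocol.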