1. Technical Field
This invention relates to browsing network-based electronic content and more particularly to a method and apparatus for coupling a visual browser to a voice browser.
2. Description of the Related Art
Visual Browsers are applications which facilitate visual access to network-based electronic content provided in a computer communications network. One type of Visual Browser, the Web Browser, is useful for locating and displaying network-based electronic content formatted using HyperText Markup Language (“HTML”). Two popular Web Browsers are Netscape® Navigator® and Microsoft® Internet Explorer®. Notably, the term “Visual Browser” denotes that the browser can display graphics, text or a combination of graphics and text. In addition, most Visual Browsers can present multimedia information, including sound and video, although some Visual Browsers may require plug-ins in order to support particular multimedia information formats.
Whereas typical Visual Browsers operate in the desktop environment, compressed HTML (“C-HTML”) Visual Browsers have emerged for processing HTML formatted documents in low-bandwidth environments. Specifically, C-HTML formatted documents are HTML formatted documents which have been compressed prior to transmission. C-HTML compliant Visual Browsers can decompress C-HTML formatted documents prior to displaying the same. Exemplary C-HTML Visual Browsers have been implemented for the QNX® Neutrino® operating system manufactured by QNX Software Systems, Ltd. of Kanata, Ontario.
A Voice Browser, unlike a Visual Browser, does not permit a user to interact with network-based electronic content visually. Rather, a Voice Browser, which can operate in conjunction with a Speech Recognition Engine and Speech Synthesis Engine, can permit the user to interact with network-based electronic content audibly. That is, the user can provide voice commands to navigate from one network-based electronic document to another. Likewise, network-based electronic content can be presented to the user audibly, typically in the form of synthesized speech. Thus, Voice Browsers can provide voice access and interactive voice response to network-based electronic content and applications, for instance by telephone, personal digital assistant, or desktop computer.
Significantly, Voice Browsers can be configured to interact with network-based electronic content encoded in VoiceXML. VoiceXML is a markup language for distributed voice applications based on Extensible Markup Language (“XML”), much as HTML is a markup language for distributed visual applications. VoiceXML is designed for creating audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and Dual Tone Multifrequency (“DTMF”) key input, recording of spoken input, telephony, and mixed-initiative conversations. Version 1.0 of the VoiceXML specification has been published by the VoiceXML Forum in the document Linda Boyer, Peter Danielsen, Jim Ferrans, Gerald Karam, David Ladd, Bruce Lucas and Kenneth Rehor, Voice eXtensible Markup Language (VoiceXML™) version 1.0, (W3C May 2000), incorporated herein by reference. Additionally, the VoiceXML Forum has submitted Version 1.0 of the VoiceXML specification to the World Wide Web Consortium, which has accepted it as a proposed industry standard.
Notably, the capabilities of Visual Browsers have not been combined with the capabilities of Voice Browsers such that a user of both can interact with network-based electronic content concurrently. That is, to date no solution has been provided which permits a user to interact with network-based visual content in a Visual Browser while also interacting with network-based audio content in a Voice Browser. Present efforts to provide a browser which can interact with network-based visual and audio content have been confined to the coding of speech synthesis functionality into an existing Visual Browser to produce a speech-aware Visual Browser. In addition, new speech-related markup tags for Visual Browsers have been proposed in order to provide speech functionality to a Visual Browser.
Still, these solutions require the implementer to develop a speech-aware function set for handling network-based speech content and to integrate the same directly in the source code of the Visual Browser. In consequence, the development of speech-related functionality is tightly linked to the development of the remaining functionality of the Visual Browser. Moreover, the tight integration between the Visual Browser and the speech-aware functionality precludes the user from using a separate, more robust and efficient Voice Browser having a set of functions useful for interacting with network-based speech content. Hence, what is needed is a method and apparatus for coupling a visual browser to a voice browser so that the combination of the visual browser and the voice browser can perform concurrent visual and voice browsing of network-based electronic content.