The World Wide Web (Web) provides a vast collection of documents that can be accessed via the Internet. Many of the documents on the Web include hyperlinks that allow the user to jump to other points within the document, to other documents, and to other resources. A common access method for Web documents is a computer that provides a visual display of the document and accepts input from the user through a keyboard and a pointing device such as a mouse. The user may follow the hyperlinks by selecting them with the pointing device.
Other methods may be used to provide access to Web documents. In particular, voice recognition may be used as an input in lieu of or in addition to a keyboard or pointing device. Voice recognition may allow effective interaction with display-based Web documents where the mouse and keyboard are missing or inconvenient. This may be useful to people with visual impairments or to people who need Web access while keeping their hands and eyes free for other tasks.
Voice recognition may require identifying utterances captured from the user by using a speech recognition grammar that defines the valid utterances. The fixed commands of the browser such as “Home” and “Back” are readily identified for inclusion in the speech recognition grammar. The grammar for selecting hyperlinks is not as readily defined, because it depends on the content of each document. Speech recognition for selecting hyperlinks in Web documents may also differ from other speech recognition requirements because the utterances may be single words or short phrases spoken without a larger context. In addition, some hyperlinks in Web documents may be represented by images or icons rather than text, which leaves no displayed text for the user to speak.
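The distinction between fixed commands and document-dependent hyperlink phrases can be illustrated with a minimal sketch. It is not the method of any particular browser. It assumes the grammar is simply a mapping from a lowercased phrase to an action, and that hyperlink phrases are taken from anchor text or, for image links, from the image's alt text. The names FIXED_COMMANDS, LinkTextCollector, and build_grammar are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser

# Fixed browser commands named in the text; known before any document loads.
FIXED_COMMANDS = {"home", "back"}


class LinkTextCollector(HTMLParser):
    """Collects a speakable phrase (anchor text or image alt text) per link."""

    def __init__(self):
        super().__init__()
        self.links = []    # (phrase, href) pairs in document order
        self._href = None  # href of the anchor currently open, if any
        self._parts = []   # text fragments seen inside the open anchor

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self._href = attrs["href"]
            self._parts = []
        elif tag == "img" and self._href is not None and attrs.get("alt"):
            # Assumption: alt text stands in for an image-only link.
            self._parts.append(attrs["alt"])

    def handle_data(self, data):
        if self._href is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Normalize case and whitespace so matching is less brittle.
            phrase = " ".join(" ".join(self._parts).lower().split())
            self.links.append((phrase, self._href))
            self._href = None


def build_grammar(html):
    """Return (grammar, unnamed): phrase -> action, plus links with no phrase."""
    collector = LinkTextCollector()
    collector.feed(html)
    collector.close()
    grammar = {cmd: ("command", cmd) for cmd in FIXED_COMMANDS}
    unnamed = []
    for phrase, href in collector.links:
        if phrase:
            # First link wins if two links share a phrase.
            grammar.setdefault(phrase, ("link", href))
        else:
            unnamed.append(href)  # icon or image link with nothing to speak
    return grammar, unnamed
```

In this sketch, a link whose only content is an image without alt text ends up in the unnamed list, which corresponds to the image and icon hyperlinks mentioned above. Choosing what the user should say for such links is left open here.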
The World Wide Web Consortium (W3C) is developing a Voice Extensible Markup Language (VoiceXML) to permit authoring of Web documents intended for use with a Voice Browser that provides an aural presentation and accepts spoken input. VoiceXML documents provide information specifically designed to define the permissible spoken input to be included in the speech recognition grammar.
An extremely large number of Web documents have been authored without consideration of the requirements for selecting hyperlinks by spoken input. It would be desirable to enable a Web browser to respond to spoken utterances to select hyperlinks in Web documents that have not been authored to define the permissible spoken input.