The World Wide Web, or simply “the Web”, comprises a large and continuously growing number of Web pages that are accessible over a publicly available network. Clients request information, e.g., Web pages, from Web servers using the Hypertext Transfer Protocol (“HTTP”). HTTP is a protocol that gives users access to files including text, graphics, images, and sound. Web pages themselves are typically written in a page description language known as the Hypertext Markup Language (“HTML”). HTML provides document formatting, allowing the developer to specify the look of the document and/or links to other servers in the network. A Uniform Resource Locator (“URL”) defines the path to a Web site hosted by a particular Web server.
The pages of Web sites are typically accessed using an HTML-compatible browser (e.g., Netscape Navigator or Internet Explorer) executing on a client machine. The browser specifies a link to a Web server and a particular Web page using a URL. When the user of the browser specifies a link via a URL, the client issues a request to a domain name server (“DNS”) to map the hostname in the URL to a particular network IP address at which the server is located. The DNS returns a list of one or more IP addresses that can respond to the request. Using one of the IP addresses, the browser establishes a connection to a Web server. If the Web server is available, it returns a document or other object, typically formatted according to HTML.
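By way of illustration only, the sequence described above (extracting the hostname from a URL, mapping it to one or more IP addresses, and forming a request for the page) may be sketched in Python as follows. The function names `resolve_url` and `build_request` are hypothetical and introduced solely for this example; the `resolver` parameter defaults to the standard library's `socket.getaddrinfo` and is made replaceable so that the lookup step can be exercised without network access.

```python
import socket
from urllib.parse import urlsplit


def resolve_url(url, resolver=socket.getaddrinfo):
    """Split a URL and map its hostname to a list of candidate IP addresses.

    Returns (host, port, target, addresses). `resolver` must accept the same
    arguments as socket.getaddrinfo; it is injectable for offline testing.
    """
    parts = urlsplit(url)
    host = parts.hostname
    if host is None:
        raise ValueError("URL has no hostname: %r" % url)

    # Use the explicit port if given, else the scheme's conventional default.
    port = parts.port or (443 if parts.scheme == "https" else 80)

    # The request target is the path plus any query string.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query

    # Name lookup: may yield several addresses for the same hostname.
    addresses = []
    for info in resolver(host, port, type=socket.SOCK_STREAM):
        ip = info[4][0]  # sockaddr is the fifth field; its first item is the IP
        if ip not in addresses:
            addresses.append(ip)
    return host, port, target, addresses


def build_request(host, target):
    """Form a minimal HTTP/1.1 GET request for the given host and target."""
    lines = [
        "GET %s HTTP/1.1" % target,
        "Host: %s" % host,
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

A client would then open a connection to one of the returned addresses, for example with `socket.create_connection((addresses[0], port))`, and send the bytes produced by `build_request`.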
As Web browsers become the primary interface for access to many network and server services, Web applications in the future will need to interact with many different types of client machines including, for example, conventional personal computers and recently developed “thin” clients. Thin clients can range from 60-inch TV screens to handheld mobile devices. This large range of devices creates a need to customize the display of Web page information based upon the characteristics of the graphical user interface (“GUI”) of the client device requesting such information. Conventional technology often uses HTML pages or scripts that are customized to handle the GUI and navigation requirements of each of these clients.
Client devices may have different display capabilities: a display may be monochrome or color, and displays may differ in color palette, resolution, and size. Such devices may also have multiple peripheral devices that may be used to provide input signals or commands (e.g., mouse and keyboard, touch sensor, remote control for a TV set-top box). Furthermore, the browsers executing on such client devices can support different languages (e.g., HTML, dynamic HTML, XML, Java, JavaScript). These differences may cause the experience of browsing the same Web page to differ dramatically, depending on the type of client device employed.
The inability to adjust the display of Web pages based upon a client's capabilities and environment may cause certain drawbacks. For example, a Web site may simply be incapable of servicing a particular set of clients, or may make the Web browsing experience confusing or unsatisfactory in some way. Even if the developers of a Web site have made an effort to accommodate a range of client devices, the code for the Web site may need to be duplicated for each client environment. Duplicated code consequently increases the maintenance cost for the Web site. In addition, entirely different URLs may be needed to access the Web pages formatted for specific types of client devices.
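The duplication problem noted above may be illustrated, purely as a sketch, by a server-side script that keeps one hand-written template per class of client device. The template strings, the `UA_RULES` table, and the User-Agent substrings used to classify clients are all assumptions made for this example and are not drawn from any particular product.

```python
from html import escape

# Hypothetical per-device templates. The same content is laid out three
# times, so any change to the page must be repeated in every entry.
TEMPLATES = {
    "desktop": (
        "<html><head><title>{title}</title></head>"
        "<body><h1>{title}</h1><p>{body}</p></body></html>"
    ),
    "tv": (
        "<html><head><title>{title}</title></head>"
        "<body><font size=\"7\">{title}</font><p>{body}</p></body></html>"
    ),
    "wap": (
        "<wml><card id=\"main\" title=\"{title}\"><p>{body}</p></card></wml>"
    ),
}

# Assumed (substring, client kind) rules; real User-Agent strings vary.
UA_RULES = [
    ("wap", "wap"),
    ("webtv", "tv"),
]


def classify_client(user_agent):
    """Guess the client kind from its User-Agent header; default to desktop."""
    ua = user_agent.lower()
    for needle, kind in UA_RULES:
        if needle in ua:
            return kind
    return "desktop"


def render_page(user_agent, title, body):
    """Fill in the template that matches the requesting client."""
    template = TEMPLATES[classify_client(user_agent)]
    return template.format(title=escape(title), body=escape(body))
```

Each new class of device adds another entry to `TEMPLATES`, which is the source of the maintenance cost described above.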
Content from Web pages has generally been inaccessible to those users not having a personal computer or other hardware device similarly capable of displaying Web content. Even if a user possesses such a personal computer or other device, the user needs to have access to a connection to the Internet. In addition, those users having poor vision or reading skills are likely to experience difficulties in reading text-based Web pages.
For these reasons, efforts have been made to develop Web browsers that facilitate non-visual access to Web pages for users who wish to access Web-based information or services through a telephone. Such non-visual Web browsers, also called “voice browsers” or “speech browsers”, present audio output to a user by converting the text of Web pages to speech and by playing pre-recorded audio files from the Web. A voice browser also permits a user to navigate between Web pages by following hypertext links, as well as to choose from a number of pre-defined links, or “bookmarks”, to selected Web pages. In addition, certain voice browsers permit users to pause and resume the audio output by the browser.
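As a rough sketch of the first step such a voice browser might perform, the following Python example uses the standard library's `html.parser` module to separate the speakable text of a page from its hypertext links. The class name `SpeechExtractor` and the helper `extract_speech` are hypothetical; the text-to-speech conversion itself is outside the scope of the example.

```python
from html.parser import HTMLParser


class SpeechExtractor(HTMLParser):
    """Collect the readable text of an HTML page and the links it contains."""

    def __init__(self):
        super().__init__()
        self.text_chunks = []   # text to hand to a speech synthesizer
        self.links = []         # (spoken label, href) pairs for navigation
        self._href = None       # href of the <a> element being read, if any
        self._label = []        # text seen so far inside that <a> element
        self._skip = 0          # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._label = []

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
        elif tag == "a" and self._href is not None:
            self.links.append((" ".join(self._label), self._href))
            self._href = None

    def handle_data(self, data):
        if self._skip:
            return  # script and style content is not read aloud
        words = " ".join(data.split())  # collapse runs of whitespace
        if not words:
            return
        self.text_chunks.append(words)
        if self._href is not None:
            self._label.append(words)


def extract_speech(html_text):
    """Return (text, links) for a page given as a string of HTML."""
    parser = SpeechExtractor()
    parser.feed(html_text)
    parser.close()  # flush any buffered trailing text
    return " ".join(parser.text_chunks), parser.links
```

The returned text could be passed to a speech synthesizer, while the list of links could be offered to the caller as spoken navigation choices.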
The Voice eXtensible Markup Language (“VoiceXML”) is a markup language developed specifically for voice applications usable over the Web, and is described at http://www.voicexml.org. VoiceXML defines an audio interface through which users may interact with Web content, similar to the manner in which HTML specifies the visual presentation of such content. In this regard, VoiceXML includes intrinsic constructs for tasks such as dialogue flow, grammars, call transfers, and embedding audio files.
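To give a sense of what such constructs look like, the following Python sketch assembles a small VoiceXML menu dialog using the standard library's `xml.etree.ElementTree` module. The function name `build_menu_dialog`, the target document names, and the choice of VoiceXML version 2.0 with its W3C namespace are assumptions made for illustration; the `menu`, `prompt`, `audio`, and `choice` elements are part of the VoiceXML language.

```python
import xml.etree.ElementTree as ET


def build_menu_dialog(prompt_text, choices, audio_url=None):
    """Build a minimal VoiceXML menu and return it as a string.

    `choices` is a list of (spoken phrase, target URL) pairs. If `audio_url`
    is given, a pre-recorded audio file is played after the spoken prompt.
    """
    vxml = ET.Element(
        "vxml",
        {"version": "2.0", "xmlns": "http://www.w3.org/2001/vxml"},
    )
    menu = ET.SubElement(vxml, "menu")

    # The prompt is rendered to the caller as synthesized speech.
    prompt = ET.SubElement(menu, "prompt")
    prompt.text = prompt_text
    if audio_url:
        ET.SubElement(prompt, "audio", {"src": audio_url})

    # Each choice pairs a phrase the caller may say with the next document.
    for phrase, target in choices:
        choice = ET.SubElement(menu, "choice", {"next": target})
        choice.text = phrase

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_menu_dialog(
        "Say news or weather.",
        [("news", "news.vxml"), ("weather", "weather.vxml")],
    ))
```

Here the menu plays the role that a list of hyperlinks plays on a visual page: saying one of the phrases moves the dialog to the corresponding document.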
Unfortunately, the VoiceXML standard generally contemplates that VoiceXML-compliant voice browsers interact exclusively with Web content that is in a special VoiceXML format. This has limited the utility of existing VoiceXML-compliant voice browsers, since a relatively small percentage of Web sites include content that is formatted in accordance with VoiceXML.
In addition to the large number of HTML-based Web sites, Web sites serving content conforming to standards applicable to particular types of user devices are becoming increasingly prevalent. For example, the Wireless Markup Language (“WML”) of the Wireless Application Protocol (“WAP”) (see, e.g., http://www.wapforum.org/) provides a standard for developing content applicable to wireless devices such as mobile telephones, pagers, and personal digital assistants. Some other standards for Web content include the Handheld Device Markup Language (“HDML”), and the relatively new Japanese standard Compact HTML.