The amount of information available over communication networks is large and growing at a fast rate. The most popular of such networks is the Internet, which is a network of linked computers around the world. Much of the popularity of the Internet may be attributed to the World Wide Web (WWW) portion of the Internet. The WWW is a portion of the Internet in which information is typically passed between server computers and client computers using the Hypertext Transfer Protocol (HTTP). A server stores information and serves (i.e. sends) the information to a client in response to a request from the client. The clients execute computer software programs, often called browsers, which aid in the requesting and displaying of information. Examples of WWW browsers are Netscape Navigator, available from Netscape Communications, Inc., and the Internet Explorer, available from Microsoft Corp.
Servers, and the information stored therein, are identified through Uniform Resource Locators (URLs). URLs are described in detail in Berners-Lee, T., et al., Uniform Resource Locators, RFC 1738, Network Working Group, 1994, which is incorporated herein by reference. For example, the URL http://www.hostname.com/document1.html identifies the document "document1.html" at host server "www.hostname.com". Thus, a request for information from a host server by a client generally includes a URL. The information passed from a server to a client is generally called a document. Such documents are generally defined in terms of a document language, such as Hypertext Markup Language (HTML). Upon request from a client, a server sends an HTML document to the client. HTML documents contain information that is interpreted by the browser so that a representation can be shown to a user at a computer display screen. An HTML document may contain information such as text, logical structure commands, hypertext links, and user input commands. If the user selects (for example, by a mouse click) a hypertext link from the display, the browser will request another document from a server.
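By way of illustration only, the decomposition of the example URL into a host server and a document can be sketched with Python's standard `urllib.parse` module. The helper name `describe_url` and the dictionary keys are assumptions introduced for this sketch and do not appear in RFC 1738.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the scheme, host server, and document it names.

    Hypothetical helper for illustration; only urlsplit is a real library call.
    """
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,               # protocol used to reach the server
        "host": parts.hostname,               # host server that stores the document
        "document": parts.path.lstrip("/"),   # document requested from that server
    }


info = describe_url("http://www.hostname.com/document1.html")
print(info)
```

A client would send the request to the host named in `info["host"]` and ask for the document named in `info["document"]`.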
Currently, most WWW browsers are based upon textual and graphical user interfaces. Thus, documents are presented as images on a computer screen. Such images include, for example, text, graphics, hypertext links, and user input dialog boxes. Most user interaction with the WWW is through a graphical user interface. Although audio data is capable of being received and played back at a user computer (e.g. a .wav or .au file), such receipt of audio data is secondary to the graphical interface of the WWW. Thus, with most WWW browsers, audio data may be sent as a result of a user request, but there is no means for a user to interact with the WWW using an audio interface.
An audio browsing system is disclosed in U.S. patent application Ser. No. 08/635,601, assigned to AT&T Corp. and entitled Method and Apparatus for Information Retrieval Using Audio Interface, filed on Apr. 22, 1996, incorporated herein by reference (hereinafter referred to as the "AT&T audio browser patent"). The disclosed audio browsing system allows a user to access documents on a server computer connected to the Internet using an audio interface device.
In one embodiment disclosed in the AT&T audio browser patent, an audio interface device accesses a centralized audio browser that is executed on an audio browsing adjunct. The audio browser receives documents from server computers that can be coupled to the Internet. The documents may include specialized instructions that enable them to be used with the audio interface device. The specialized instructions typically are similar to HTML. The specialized instructions may cause the browser to generate audio output from written text, or accept an input from the user through DTMF tones or automated speech recognition.
A problem that arises with an audio browsing system that includes a centralized browser is that the input of user data often requires a complex sequence of events involving the user and the browser. These events include, for example: a) prompting the user for input; b) enumerating the input choices; c) prompting the user for additional input; and d) informing the user that a previous input was wrong or inconsistent. We have found that it is desirable to program and customize the centralized browser in order to define the allowed sequences of events that can occur when the user interacts with the browser. However, when programming and customizing the browser, it is important to minimize certain performance problems that result from both inadvertently erroneous and malicious programming.
One such problem is that a browser that has been customized can become unresponsive if the customization contains, for example, an infinite loop. In addition to reducing the performance of the browser, to the detriment of other activity being performed by the browser, such a loop could allow a telephone call to extend over more time, disadvantageously adding to the cost of the call while at the same time potentially denying other callers access to the browser.
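One generic way to bound the effect of such a loop is to give each customization a fixed step budget and to abort it when the budget is exhausted. The following is a minimal sketch under stated assumptions: the instruction format (`"say"` and `"goto"` tuples), the function `run_with_budget`, and the exception `StepBudgetExceeded` are all hypothetical and are not taken from the AT&T audio browser patent.

```python
class StepBudgetExceeded(Exception):
    """Raised when a customization runs for more steps than allowed."""


def run_with_budget(program, max_steps=1000):
    """Run a list of (op, arg) instructions under a hard step limit.

    Hypothetical ops: ("say", text) records a prompt to be spoken;
    ("goto", index) jumps to another instruction.
    """
    spoken = []
    pc = 0      # index of the next instruction to execute
    steps = 0   # number of instructions executed so far
    while pc < len(program):
        if steps >= max_steps:
            raise StepBudgetExceeded(f"aborted after {steps} steps")
        op, arg = program[pc]
        steps += 1
        if op == "say":
            spoken.append(arg)
            pc += 1
        elif op == "goto":
            pc = arg
        else:
            raise ValueError(f"unknown op: {op!r}")
    return spoken
```

With this guard, a customization such as `[("goto", 0)]` is stopped after `max_steps` instructions instead of occupying the browser, and the telephone call, indefinitely.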
Another problem is that a "denial of service" attack is easier for an attacker to execute if the browser is customized in a way that allows a caller to keep the call connected without offering any input.
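A common mitigation, shown here only as a sketch, is to disconnect a caller after a fixed number of consecutive prompts that receive no input. The function `handle_call`, the parameter `max_silent_prompts`, and the use of `None` to represent a prompt that timed out without input are assumptions made for this example.

```python
def handle_call(responses, max_silent_prompts=3):
    """Process caller responses, ending the call after repeated silence.

    `responses` is a sequence in which None means the caller gave no input
    before the prompt timed out (a hypothetical convention for this sketch).
    Returns the accepted inputs and how the call ended.
    """
    silent = 0
    accepted = []
    for response in responses:
        if response is None:
            silent += 1
            if silent >= max_silent_prompts:
                return accepted, "disconnected"
        else:
            silent = 0  # any real input resets the silence counter
            accepted.append(response)
    return accepted, "completed"
```

Resetting the counter on each real input lets a slow but genuine caller continue, while a caller who offers no input at all is released after a bounded time.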
Some of these performance problems are less important in the context of non-centralized browsers, because non-centralized browsers that have been poorly customized typically affect only the computer that is executing the browser and the computer's telephone lines, and therefore programming errors are effectively quarantined.
However, in the centralized browser embodiment of the audio browsing system disclosed in the AT&T audio browser patent, and in any centralized browser, when the audio browsing adjunct that is executing the centralized browser incurs performance problems, the negative effects of the problems are exacerbated. In an audio browsing system, multiple users access the same audio browsing adjunct through multiple audio interface devices and thus many users are negatively affected when the audio browsing adjunct incurs performance problems. Therefore, it is desirable in an audio browsing system to minimize performance problems.
Another problem with most known browsers is that data entered on the browser at the client computer is typically sent to the server where verification and validation of the data is performed. For example, if a user enters data through a keyboard into a computerized fill-in form on a browser, that data is typically sent to the Internet server where it is verified that the form was properly filled out (i.e., all required information has been entered, the required number of digits have been entered, etc.). If the form was not properly filled out, the server typically sends an error message to the client, and the user will attempt to correct the errors.
In an audio browser system, however, the data entered by the user is frequently in the form of speech. The speech is converted to voice data or voice files using speech recognition. Obtaining data through speech recognition is less accurate than obtaining it through keyboard entry, so data entered using speech recognition requires even more verification and validation. Further, voice files converted from speech are typically large relative to data entered from a keyboard, which makes it difficult to send voice files frequently from the audio browsing adjunct to the Internet server. Therefore, it is desirable in an audio browser system to perform as much verification and validation of entered data as possible at the browser, so that the number of times the voice data is sent to the Internet server is minimized.
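The idea of validating entered data at the browser before anything is sent to the Internet server can be sketched as follows. This is an illustrative example only: the functions `validate_digit_field` and `collect_field`, the fixed-length digit field, and the representation of recognition results as text strings are assumptions, not details from the source.

```python
import string


def validate_digit_field(recognized_text, required_length):
    """Check locally that the recognized input has the required number of digits.

    Returns (digits, None) on success or (None, error_message) on failure.
    """
    digits = "".join(ch for ch in recognized_text if ch in string.digits)
    if len(digits) != required_length:
        return None, f"expected {required_length} digits, got {len(digits)}"
    return digits, None


def collect_field(recognition_results, required_length):
    """Re-prompt locally until an input validates; nothing is sent meanwhile.

    `recognition_results` stands in for successive speech recognition outputs.
    Returns the validated value (or None) and the number of re-prompts used.
    """
    reprompts = 0
    for recognized_text in recognition_results:
        value, error = validate_digit_field(recognized_text, required_length)
        if error is None:
            return value, reprompts
        reprompts += 1  # the browser would inform the user and prompt again
    return None, reprompts


value, reprompts = collect_field(["555 12", "555 1234"], required_length=7)
print(value, reprompts)
```

In this sketch only the final, validated value would be forwarded to the server, so a misrecognized entry costs a local re-prompt rather than a round trip.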
Based on the foregoing, there is a need for an audio browser system in which performance problems of the audio browsing adjunct executing the browser are minimized, and in which entered data is typically verified and validated at the browser instead of at the Internet server.