The present invention relates generally to a computer system which handles incoming telephone calls and allows callers to access information via the computer system without the need for a live operator. The invention relates more specifically to a voice processing system, method and computer program product which allows the caller to access Internet World Wide Web pages by using only the caller's telephone.
In the past couple of years there has been an explosive growth in the use of the globally-linked network of computers known as the Internet, and in particular of the World Wide Web (WWW), which is one of the facilities provided on top of the Internet. The WWW comprises many pages or files of information, distributed across many different server computer systems. Information stored on such pages can be, for example, details of a company's organization, contact data, product data and company news. This information can be presented to the user's computer system ("client computer system") using a combination of text, graphics, audio data and video data. Each page is identified by a Uniform Resource Locator (URL). The URL denotes both the server machine, and the particular file or page on that machine. There may be many pages or URLs resident on a single server.
In order to use the WWW, a client computer system runs a piece of software known as a graphical Web browser, such as WebExplorer (provided as part of the OS/2 operating system from IBM Corporation), or the Navigator program available from Netscape Communications Corporation. "WebExplorer", "OS/2" and "IBM" are trademarks of the International Business Machines Corporation, while "Navigator" and "Netscape" are trademarks of the Netscape Communications Corporation. The client computer system interacts with the browser to select a particular URL, which in turn causes the browser to send a request for that URL or page to the server identified in the URL. Typically the server responds to the request by retrieving the requested page, and transmitting the data for that page back to the requesting client computer system (the client/server interaction is performed in accordance with the hypertext transfer protocol ("HTTP")). This page is then displayed to the user on the client screen. The client may also cause the server to launch an application, for example to search for WWW pages relating to particular topics.
Most WWW pages are formatted in accordance with a computer program written in a language known as HTML (hypertext mark-up language). This program contains the data to be displayed via the client's graphical browser as well as formatting commands which tell the browser how to display the data. Thus a typical Web page includes text together with embedded formatting commands, referred to as tags, which can be used to control the font size, the font style (for example, whether italic or bold), how to lay out the text, and so on. A Web browser "parses" the HTML script in order to display the text in accordance with the specified format. HTML tags are also used to indicate how graphics, audio and video are manifested to the user via the client's browser.
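The parsing step described above can be illustrated with a minimal sketch. It uses the `HTMLParser` class from Python's standard library to separate the formatting tags from the text they enclose. The class name `TagAndTextCollector`, the function `parse_page` and the sample page are assumptions made for illustration only; they are not part of any browser discussed here.

```python
from html.parser import HTMLParser


class TagAndTextCollector(HTMLParser):
    """Records formatting tags and displayable text separately."""

    def __init__(self):
        super().__init__()
        self.tags = []  # names of start tags, in document order
        self.text = []  # non-empty runs of text, in document order

    def handle_starttag(self, tag, attrs):
        # Called once for each opening tag such as <b> or <h1>.
        self.tags.append(tag)

    def handle_data(self, data):
        # Called for the text found between tags.
        stripped = data.strip()
        if stripped:
            self.text.append(stripped)


def parse_page(html):
    """Return (tags, text) for one HTML page held in a string."""
    collector = TagAndTextCollector()
    collector.feed(html)
    collector.close()
    return collector.tags, collector.text


# Hypothetical sample page, invented for this sketch.
SAMPLE_PAGE = "<h1>Company News</h1><p>Our <b>new</b> product ships today.</p>"
tags, text = parse_page(SAMPLE_PAGE)
print(tags)
print(text)
```

A graphical browser would go on to apply each tag as a display instruction; the sketch stops at the point where tags and text have been told apart.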
Most Web pages also contain one or more references to other Web pages, which need not be on the same server as the original page. Such references may generally be activated by the user selecting particular locations on the screen, typically by (double) clicking a mouse control button. These references or locations are known as hyperlinks, and are typically flagged by the browser in a particular manner (for example, any text associated with a hyperlink may be in a different colour). If a user selects the hyperlink, then the referenced page is retrieved and replaces the currently displayed page.
Further information about HTML and the WWW can be found in "World Wide Web and HTML" by Douglas McArthur, pp. 18-26 in Dr. Dobb's Journal, December 1994, and in "The HTML SourceBook" by Ian Graham (John Wiley, New York, 1995).
Another common way of allowing people to access information automatically is to let them use their telephones to call in to a company's voice processing system (VPS) to obtain information in audio-only form (without the need for a computer). The VPS automatically handles the call and presents the caller with a menu of possible information which the caller can access, all under control of the central control system's computer. An example of such a VPS is the IBM AIX DirectTalk/6000 software package running on IBM's RISC System/6000 computer system ("AIX DirectTalk/6000" and "RISC System/6000" are trademarks of the International Business Machines Corporation).
If suppliers of information over the WWW were also to supply the same information via telephone-based systems, the reach of the information would be greatly expanded to people who do not have computers and instead call the supplier on a standard telephone to obtain the information. A known "voice browser" system developed by NetPhonic Communications, Inc., called "Web On Call", provides this ability ("NetPhonic Communications" and "Web On Call" are trademarks of NetPhonic Communications, Inc.).
With "Web On Call", a telephone-based VPS automatically answers incoming calls by running a voice application which instructs the VPS as to how to deal with the incoming calls and provide the caller with the appropriate information. This voice application accesses Web pages which are provided as data for the voice application, thus allowing the caller to have access to information contained in the Web pages.
For example, the HTML program of a Web page is modified by the programmer so that some basic voice application commands are added into the HTML program as extra control tags. These control tags are ignored by a graphical browser when a user is accessing the Web page via a client computer system. However, when a user does not have a computer system, and instead calls in using a telephone, these control tags have meaning to the "voice browser" voice application, which processes the control tags and is controlled accordingly. For example, the control tags tell the voice application whether to read or ignore certain text contained in the HTML program.
When text is to be read to the caller, a pre-recorded voice segment is retrieved from memory, under the control of the voice browser's voice application, and presented to the caller.
With the "Web On Call" software product, the voice application is provided separately from the HTML Web page. The HTML Web page has only basic commands included therein, such as commands to fetch a voice segment. That is, all of the voice application's structural intelligence is in the voice application itself, with only basic data provided in the HTML document.
If the information supplier wishes to change the information structure, both the voice application and the HTML page have to be altered separately. For example, if the information supplier wishes to add another user-selectable command, such as "press the '8' key on your telephone keypad if you wish to repeat the information unit which you have just heard", it is necessary to edit both the voice application and the HTML page to make this change. The structure of the voice application would have to be changed to provide the functionality of allowing the caller to hear the previously supplied information again. The HTML application would have to be changed to include the data relating to the words the caller will hear to inform the caller that the "8" key is the key which the caller must press in order to execute this new command. In addition, there is the added problem that each Web page made accessible to the caller would have to be so modified.
Another difficulty with the prior art is that the same voice application commands must be shared by each Web page accessible through the voice browser. For example, the same three user-commands, e.g., "Press '1' to . . . ," must be used in providing every Web page to a caller. This is again due to the separation of the voice processing system commands (in the voice browser's voice application) and the voice processing system data (in the HTML document).
According to the present invention, the above problems are solved by integrating the voice application and the HTML pages more closely. The voice application commands as well as data are contained within the HTML Web pages. A voice browser ignores all HTML-tag information written for a graphical Web browser, and a graphical Web browser ignores all HTML-tag information written for the voice browser. This way, the same HTML document is accessible both to computer users (via a graphical Web browser) and to telephone callers (via a voice browser).
With the present invention, it becomes very easy to maintain synchronism between the voice application commands and the voice application data, since both are contained in the same place, i.e., in the HTML Web page. To make a change, the programmer need only access the HTML Web page and modify both the voice application commands and data at the same time. There is no need to also access the voice browser's internal voice application and make changes to it as well. Further, different caller-initiated commands can be easily assigned to different Web pages.
According to the invention, a voice processing system, method and computer program product (stored on a computer-readable medium such as hard disk, floppy disk or semiconductor memory) therefor, allows telephone callers without computers to access World Wide Web pages from the Internet. Usual Hyper-Text Mark-Up language (HTML) information is interspersed with special HTML tags including the commands and data for forming a voice application, which, when run on the voice processing system, provides a voice browser for allowing telephone callers to access Web pages. Preferably, the special HTML tags include designations of the telephone keys a caller must press in order to actuate commands while accessing the HTML documents. The voice application tags are provided together with the remainder of the HTML document, thus facilitating editing of the combined data. This helps to keep the graphical browser and voice browser versions of the same data set synchronized.
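The idea of special tags that carry telephone key designations, and that a voice browser reads while skipping graphical markup, might be sketched as follows. The tag name `voicekey` and its `key` and `prompt` attributes are hypothetical, invented for this sketch; the text above does not give the actual tag syntax. The example reuses the "press 8 to repeat" command mentioned earlier.

```python
from html.parser import HTMLParser

# Hypothetical tag name, invented for this sketch only.
VOICE_TAG = "voicekey"


class VoiceTagExtractor(HTMLParser):
    """Collects voice-application tags and skips graphical markup."""

    def __init__(self):
        super().__init__()
        self.key_prompts = {}  # telephone key -> prompt to be spoken

    def handle_starttag(self, tag, attrs):
        if tag != VOICE_TAG:
            return  # tag written for a graphical browser: ignored here
        attributes = dict(attrs)
        key = attributes.get("key")
        if key is not None:
            self.key_prompts[key] = attributes.get("prompt", "")


def extract_key_prompts(html):
    """Return the key-to-prompt menu defined by one HTML page."""
    extractor = VoiceTagExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.key_prompts


# Hypothetical page mixing graphical markup with voice tags.
PAGE = (
    "<html><body><h1>Product data</h1>"
    '<voicekey key="1" prompt="Press 1 for product data">'
    '<voicekey key="8" prompt="Press 8 to repeat">'
    "<p><b>Widget</b> details</p></body></html>"
)
menu = extract_key_prompts(PAGE)
print(menu)
```

Because the key assignments live in the page itself, a different page can define a different menu without any change to the extractor, which is the maintenance benefit the text describes.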
A voice processing system for allowing telephone callers to access Hyper-Text Mark-Up Language (HTML) documents without the use of a computer, said voice processing system comprising:
caller input/output port connected to a telephone network of telephone callers;
processing unit which runs a voice application; and
data communications network input/output port connected to a data communications network accessing HTML documents;
wherein at least one of said HTML documents has voice application HTML tags inserted therein, said tags providing the commands and data required to form said voice application.
A method of allowing a telephone caller to obtain access to World Wide Web (WWW) pages comprising steps of:
obtaining information from an incoming call;
retrieving a WWW home page corresponding to said obtained information; and
running a voice application based on said retrieved WWW home page to interact with said caller to provide the WWW page data to the caller via the telephone line;
wherein said WWW home page has voice application tags inserted therein, said tags providing the commands and data required to form said voice application.
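The three method steps above can be sketched end to end under stated assumptions. The sketch assumes that the information obtained from the call is the dialed number, that each number maps to one home page, and that the voice tags use a hypothetical `voicekey` tag with a `prompt` attribute; none of these details is specified in the text. An in-memory dictionary stands in for retrieval over the network so that the sketch runs without a connection.

```python
from html.parser import HTMLParser

# Assumption: the dialed number selects the home page (hypothetical mapping).
HOME_PAGE_BY_DIALED_NUMBER = {"5550100": "http://www.example.com/index.html"}

# Stand-in for fetching pages from a Web server.
PAGE_STORE = {
    "http://www.example.com/index.html": (
        "<h1>Welcome</h1>"
        '<voicekey key="1" prompt="Press 1 for company news">'
    ),
}


class PromptCollector(HTMLParser):
    """Gathers the spoken prompts carried by hypothetical voicekey tags."""

    def __init__(self):
        super().__init__()
        self.prompts = []

    def handle_starttag(self, tag, attrs):
        if tag == "voicekey":
            self.prompts.append(dict(attrs).get("prompt", ""))


def handle_incoming_call(dialed_number):
    """Return the prompts that would be presented to the caller."""
    # Step 1: obtain information from the incoming call.
    url = HOME_PAGE_BY_DIALED_NUMBER.get(dialed_number)
    if url is None:
        return []  # no home page is associated with this call
    # Step 2: retrieve the home page corresponding to that information.
    page = PAGE_STORE[url]
    # Step 3: form the voice application from the tags in the page.
    collector = PromptCollector()
    collector.feed(page)
    collector.close()
    return collector.prompts


print(handle_incoming_call("5550100"))
```

In a real system the returned prompts would be rendered as audio on the telephone line and the caller's key presses would drive further page retrievals; the sketch only returns the prompt text.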
A computer program product stored on a computer-readable storage medium, said product comprising:
Hyper-Text Mark-Up language (HTML) information for instructing the display of data on a graphical Web browser; and
voice application information interspersed amongst said HTML information as HTML tags containing the commands and data required to form a voice application for use in allowing a telephone caller to access HTML documents.
In a voice processing system, an apparatus for converting a Hyper-Text Mark-Up language (HTML) document into a voice application for allowing telephone callers to access World Wide Web pages, said apparatus comprising:
receiving means for receiving an HTML document, said HTML document having voice application information interspersed amongst other HTML information as HTML tags containing the commands and data required to form said voice application;
converting means for converting said HTML document into a voice application by interpreting said HTML tags.