The Internet has created an entirely new way for people and businesses to interact and share data. However, traditional Internet interaction involves a user at a computer console accessing stored or generated text, image (still and video) and audio data. Typically, data on the Internet are accessed via a so-called Web browser (e.g., AOL's Netscape™ browser). A user connects to the Internet using the browser and provides an address (a URL—uniform resource locator) for the data which the user wishes to access. If the address of the data is valid, the data are provided to the user via the browser. Specifically, images and text represented by the data are displayed by the browser and any sounds represented by the data are reproduced over the user's computer's sound system.
It is desirable, however, to provide a broader range of access mechanisms and approaches to data access on various types of networks. Information services may be viewed from two perspectives. One is the user perspective. For example, it may be desirable for a user to be able to access an ordinary Web site using a telephone (cellular or otherwise) or another device that has audio capabilities but limited or no visual display capabilities. As another example, a user may wish to have requested information delivered simultaneously to both a phone (via voice) and a facsimile machine.
A service provider may also desire a broader range of information delivery mechanisms on various types of networks. For example, an e-business provider may wish to alert its consumers about various upcoming events. Conventionally, such alerts may be sent through a fixed channel (e.g., via electronic mail—e-mail). To be more effective in marketing and sales, the e-business provider may desire to have the capability of automatically reaching its consumers via multiple channels (e.g., via both e-mail and phone).
Some partial and limited solutions exist for the problem from the user's perspective and from the information provider's perspective, but no coherent solution addresses both. In addition, each of the current solutions has drawbacks. Present options include the following:
    1. Traditional Interactive Voice Response (IVR) systems. These systems, from telephony vendors such as Lucent and Nortel, require significant knowledge of telephony. Developers must understand telephony concepts such as inbound and outbound calls, call control capabilities, automatic call distribution (ACD) hardware, public switched telephone network (PSTN) protocols, and the intricacies of the private branch exchange (PBX) used at that site. All these technologies are considered part of the “traditional” telecommunications industry and have a steep learning curve. The learning curve is made steeper by the profusion of vendor-specific hardware and software, which sustains a vocabulary unique to the telecommunications industry. Although IVR solutions provide hooks to mainframe databases and traditional relational databases, the telephony orientation of the development tools makes it very difficult for a typical web developer to create voice solutions. Further, incorporating speech recognition into these solutions poses an additional challenge. Current host-based speech recognition engines require developers to understand many concepts that are unique to speech recognition applications. These include grammars, n-best processing, Hidden Markov Models, end-pointing, and the special languages used to build and compile grammars for these applications.
    2. Embedded special tags to enable parts of a website. Some companies (such as Vocalis) provide this capability. Special tags inserted into a web page enable their speech recognition engines to recognize the tagged key words. These solutions require site owners to change their sites, re-test them, and re-deploy them, resulting in significant costs.
    3. Verizon offers alert services through wireless phone service to its customers. A customer specifies the categories of interest (e.g., stock price) to be alerted about and the conditions under which the alerts are sent. When an alert is sent, it is delivered to the customer's wireless phone as text displayed on the screen of the phone. In this case, an alert is sent via a fixed channel, i.e., the wireless network, in a fixed modality, i.e., text.
All existing approaches require significant amounts of special hardware with very special hosting needs.