The present invention pertains to a distributed voice web architecture. More particularly, the present invention relates to a method and apparatus for providing one or more users with voice access to various voice content sites on a network.
The World Wide Web (“the Web”) is a global, Internet-based, hypermedia resource used by millions of people every day for many purposes, such as entertainment, research, shopping, banking, and travel reservations, to name just a few. The hyperlink functionality of the Web allows people to quickly and easily move between related pieces of information, without regard to the fact that these pieces of information may be located on separate computer systems, which may be physically distant from each other. Rapid advances have been made in Internet technology and Web-related technology in particular, to make the Web an increasingly valuable resource.
Another rapidly advancing technology is speech technology, which includes automatic speech recognition. Automatic speech recognition facilitates interactions between humans and machines. Like the Web, therefore, speech technology can be used for, among other things, facilitating people's access to information and services. A few speech-based services exist today. However, these services are generally implemented separately from each other, typically on a small scale, and using different proprietary technologies, many of which are incompatible with each other.
The present invention includes a method and apparatus in which speech of a user is received and endpointed locally. The endpointed speech of the user is transmitted to a remote site via a wide area network for speech recognition. Remotely generated prompts that have been transmitted over the wide area network are received and played to the user.
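The local endpointing step can be illustrated with a minimal sketch. The text above does not specify how endpointing is performed; the energy-threshold approach, the function names `endpoint` and `build_utterance`, and the parameters `threshold` and `min_speech_frames` are all assumptions chosen for illustration only.

```python
from typing import List, Optional, Sequence, Tuple


def endpoint(frame_energies: Sequence[float],
             threshold: float = 0.1,
             min_speech_frames: int = 2) -> Optional[Tuple[int, int]]:
    """Return (start, end) frame indices of the speech span, end exclusive.

    Assumption: a frame counts as speech when its energy is at or above
    `threshold`. Returns None when too few frames qualify.
    """
    active = [i for i, energy in enumerate(frame_energies) if energy >= threshold]
    if len(active) < min_speech_frames:
        return None
    return active[0], active[-1] + 1


def build_utterance(frames: Sequence, frame_energies: Sequence[float],
                    threshold: float = 0.1) -> List:
    """Keep only the frames inside the endpointed span.

    In this sketch, the returned frames are what a local device would
    transmit over the wide area network for remote recognition.
    """
    span = endpoint(frame_energies, threshold)
    if span is None:
        return []
    start, end = span
    return list(frames[start:end])
```

Under these assumptions, leading and trailing silence is dropped before transmission, while short pauses inside the utterance are retained.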
Another aspect of the present invention is a method and apparatus in which endpointed speech of a user, transmitted over a wide area network by a remote device, is received. The speech is recognized locally, and a prompt is generated in response to the speech. The prompt is then transmitted to the remote device over the wide area network.
Another aspect of the present invention is a speech-enabled distributed processing system. The processing system includes a gateway and a remote voice content site. The gateway is coupled to receive speech from a user via a voice interface and performs endpointing of the speech. The gateway transmits the endpointed speech to the remote voice content site over a network, receives prompts from the remote voice content site via the network, and plays the prompts to the user. The voice content site receives results of the endpointing via the network and performs speech recognition on the results. The voice content site also generates prompts and provides the prompts to the gateway via the network, to be played to the user. The voice content site also provides control messages to the gateway to cause the gateway to access any of multiple remote voice content sites on the network in response to a spoken selection by the user. The voice content site may include a speech application, such as a voice browser, which generates the prompts and the control messages.
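The division of labor between the gateway and the voice content sites can be sketched as follows. This is an illustrative model under stated assumptions, not the claimed implementation: the class names `Gateway`, `VoiceContentSite`, and `Response`, the `redirect_to` field standing in for a control message, and the use of text strings in place of endpointed audio and a real recognizer are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Response:
    prompt: str                        # prompt to be played to the user
    redirect_to: Optional[str] = None  # hypothetical control message: next site


class VoiceContentSite:
    """Recognizes endpointed speech and generates prompts and control messages."""

    def __init__(self, name: str, links: Dict[str, str]):
        self.name = name
        self.links = links  # recognized phrase -> name of another site

    def recognize(self, endpointed_speech: str) -> str:
        # Stand-in for a speech recognizer; the input here is already text.
        return endpointed_speech.strip().lower()

    def handle(self, endpointed_speech: str) -> Response:
        text = self.recognize(endpointed_speech)
        if text in self.links:
            target = self.links[text]
            return Response(f"Connecting you to {target}.", redirect_to=target)
        return Response(f"{self.name} did not understand '{text}'.")


class Gateway:
    """Forwards endpointed speech, plays prompts, and follows control messages."""

    def __init__(self, sites: Dict[str, VoiceContentSite], start: str):
        self.sites = sites
        self.current = start
        self.played: List[str] = []  # prompts "played" to the user

    def on_endpointed_speech(self, speech: str) -> None:
        response = self.sites[self.current].handle(speech)
        self.played.append(response.prompt)
        # Switch sites only when the control message names a known site.
        if response.redirect_to in self.sites:
            self.current = response.redirect_to
```

In this sketch, a site acting as a voice browser would map spoken selections to other sites, so a user saying "weather" causes the gateway to direct subsequent speech to the weather site.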
Other features of the present invention will be apparent from the accompanying drawings and from the detailed description which follows.