The present invention is related in general to enhanced telephony services, and in particular, to a networked voice-activated dialing and call-completion system.
With the availability of speech recognition systems that can provide real-time services for large-vocabulary continuous-speech applications in a telephony bandwidth, several applications have become possible. One such application is a voice-activated dialer for corporate-wide service. In a typical voice-activated dialer, a corporation or a large association may establish for itself a voice-activated call-routing system whereby a calling party connected to the voice-activated dialer can speak the name of a called party. A speech recognizer connected to the voice-activated dialer recognizes the speech and matches it to a called party. Thereafter, a call-completion device coupled to the voice-activated dialer may complete the call, establishing a voice path between the calling party and the called party.
The number of corporate-wide voice dialer systems is expected to increase. However, there is no system that can take advantage of the specific services offered by each such system to provide a more advanced service to a calling or a called party. Accordingly, there is a need for advancement in the art.
As used in this application, the word “call” or the phrase “telephone call” includes any voice, text, or video call, including a mode of information transfer that combines any of these methods. Further, though for simplicity a description of a traditional telephone call is made herein, it should be understood that any multimedia communication method may be substituted for the method of communication described. Additionally, a “party” is a natural person or a computer program; and a “calling” party should be understood as a party that initiates an attempt to communicate with a “called” party.
In an aspect, the present disclosure is directed toward a networked system of voice-activated dialers (VAD). These VADs are, for the sake of convenience, termed “local” and “remote.” The words “local” and “remote” may indicate VADs that are geographically apart from each other, VADs that are owned or controlled by different entities, or VADs that perform different functions.
Each VAD has a speech recognizer specifically tuned to service a directory of names containing a pre-determined group of subscribers. The speech recognizer may be speaker-dependent, speaker-trained (e.g., template-based), or speaker-independent (for example, phoneme-based), and may implement techniques such as Hidden Markov Models or neural networks.
Each VAD is networked with at least one other VAD via a communication network such as a private Ethernet or a public network such as the Internet. Further, each VAD is communicatively coupled to a telecommunication network such as the Public Switched Telephone Network (PSTN). Each VAD may be identified by a unique method of addressing, the method of addressing being designed to make identification easy. In an aspect, each VAD publishes its address to a centralized server to resolve addressing issues.
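By way of illustration only, the address-publication aspect might be sketched as below. The class name VadRegistry, its methods, and the host:port address format are hypothetical and are not taken from the disclosure; any addressing scheme that uniquely identifies a VAD could be substituted.

```python
class VadRegistry:
    """Hypothetical centralized server mapping VAD names to network addresses."""

    def __init__(self):
        self._addresses = {}

    def publish(self, vad_name, address):
        # A VAD announces (or updates) the address at which it can be reached.
        self._addresses[vad_name.strip().lower()] = address

    def resolve(self, vad_name):
        # Returns None when no VAD has published under this name.
        return self._addresses.get(vad_name.strip().lower())


# Example: one VAD publishes its (made-up) address, another resolves it.
registry = VadRegistry()
registry.publish("Acme Corporation", "vad.acme.example:5000")
```

Normalizing the name before storing it is one simple way to keep lookups tolerant of capitalization differences in the recognized text.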
When a calling party wishes to make a telephone call to a called party, the calling party dials a pre-designated telephone number to connect with a first VAD. The calling party then utters the name of the called party, and additionally specifies a second VAD or other information that could be used to find a second VAD by means of which the called party may be reached. The calling party’s speech is parsed and interpreted by a first speech recognizer coupled to the first VAD. Subsequently, the first recognizer or other software or hardware devices coupled to the first recognizer identify the second VAD. Thereafter, at least a portion of the calling party’s spoken utterance is transmitted to the second VAD.
The second VAD receives the calling party’s spoken utterance. Subsequently, a second speech recognizer coupled to the second VAD matches the calling party’s spoken utterance with a name and a telephone number in a database coupled to the second VAD. The second VAD then transmits to the first VAD information suitable for call completion. Thereafter the first VAD completes a telephone call between the calling party and the called party by instructing an appropriate switching element in the PSTN, or by using traditional call completion methods.
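The exchange between the first and second VADs can be sketched roughly as follows. This is a simplified model under stated assumptions: plain text strings stand in for the spoken utterance, a dictionary lookup stands in for each speech recognizer, and the class and method names (FirstVad, SecondVad, lookup, handle_call) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SecondVad:
    directory: dict  # called-party name -> telephone number

    def lookup(self, utterance_portion):
        # Stand-in for the second speech recognizer: match the forwarded
        # portion of the utterance against this VAD's own directory.
        return self.directory.get(utterance_portion.strip().lower())


@dataclass
class FirstVad:
    remote_vads: dict  # VAD name -> SecondVad

    def handle_call(self, called_name, vad_name):
        # Identify the second VAD named by the calling party.
        remote = self.remote_vads.get(vad_name.strip().lower())
        if remote is None:
            return None
        # Forward the name portion and receive call-completion information.
        number = remote.lookup(called_name)
        if number is None:
            return None
        # A real system would now instruct a PSTN switching element.
        return f"connect caller to {number}"


acme = SecondVad(directory={"jane doe": "555-0103"})
first = FirstVad(remote_vads={"acme": acme})
```

The point of the split is that the first VAD only needs to recognize which VAD is meant, while the name itself is resolved by the VAD whose recognizer is tuned to that directory.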
In another aspect, the first VAD transmits additional information identifying the calling party to the second VAD. In an embodiment, the first VAD may transmit the calling party’s location. The second VAD receives this additional information and may use it in determining the appropriate called party. In a further aspect, the second VAD may transmit one or more matches based on the spoken utterance by the calling party, allowing the calling party to determine the appropriate called party.
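One possible way to combine the two aspects above is sketched below: the second VAD returns every directory entry matching the spoken name and, when the calling party's location is supplied, lists entries at the same site first. The ranking rule, the DIRECTORY contents, and the function name find_matches are assumptions made for illustration; the disclosure does not prescribe how the location is used.

```python
# Hypothetical directory held by the second VAD.
DIRECTORY = [
    {"name": "john smith", "number": "555-0101", "site": "boston"},
    {"name": "john smith", "number": "555-0102", "site": "denver"},
    {"name": "jane doe", "number": "555-0103", "site": "boston"},
]


def find_matches(spoken_name, caller_location=None):
    """Return all entries for the name, same-site entries first if a location is given."""
    name = spoken_name.strip().lower()
    matches = [entry for entry in DIRECTORY if entry["name"] == name]
    if caller_location is not None:
        location = caller_location.strip().lower()
        # Stable sort: False (same site) sorts before True (other site).
        matches.sort(key=lambda entry: entry["site"] != location)
    return matches
```

When more than one entry is returned, the first VAD could read the candidates back to the calling party and let the calling party choose.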
In a further aspect, the present invention includes a method of using Signaling System No. 7 (SS7) to complete the calling party’s call. Once the called party is identified by the second VAD, the called party’s information is routed back to the first VAD, whereupon an SS7 packet is transmitted to the called party station to determine if the called party station is available. If the called party station is busy or otherwise not available, the calling party is provided with an option to leave a voice message at a designated station.
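The availability check and voice-message fallback might be modeled as in the sketch below. No actual SS7 signaling is performed: the query_station_status callable is a hypothetical stand-in for the SS7 query, and the status strings and the function name complete_call are likewise assumptions.

```python
def complete_call(called_number, query_station_status, voicemail_station):
    """Decide between connecting the call and offering a voice message.

    query_station_status is a hypothetical callable standing in for the
    SS7 availability query; it is assumed to return "idle" when the
    called party station can accept the call.
    """
    status = query_station_status(called_number)
    if status == "idle":
        return ("connect", called_number)
    # Busy or otherwise unavailable: offer to leave a message instead.
    return ("offer_voicemail", voicemail_station)


# Example with a simulated status table in place of the signaling network.
simulated_status = {"555-0103": "idle", "555-0101": "busy"}
outcome = complete_call("555-0101", simulated_status.get, "555-0199")
```

Passing the status query in as a parameter keeps the decision logic separate from whatever signaling interface a deployment actually provides.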
In case the called party cannot be located at the second VAD, the first VAD may use an alternative method to locate the called party, such as attempting to locate the called party at a third VAD. Alternatively, the first VAD or the calling party may terminate the call.
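The fallback behavior can be sketched as an ordered search over candidate VADs. The function name locate_called_party and the representation of each VAD as a (name, lookup function) pair are hypothetical; how the third VAD is chosen is not specified here and is left to the caller of the function.

```python
def locate_called_party(spoken_name, vads):
    """Try each VAD in order and return (vad_name, number) for the first hit.

    vads is a list of (vad_name, lookup) pairs, where lookup is a
    hypothetical callable returning a telephone number or None.
    """
    for vad_name, lookup in vads:
        number = lookup(spoken_name)
        if number is not None:
            return vad_name, number
    # Not found anywhere: the first VAD or the calling party may terminate the call.
    return None


second_vad = {"jane doe": "555-0103"}
third_vad = {"john smith": "555-0102"}
candidates = [("second", second_vad.get), ("third", third_vad.get)]
found = locate_called_party("john smith", candidates)
```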