The present invention relates generally to telephone switching equipment. More particularly, the invention relates to a voice dialing server that attaches to telephone branch exchange equipment to provide voice dialing services without the need to extensively modify the branch exchange equipment. The preferred system plugs into one or more unused extensions of the branch exchange system to provide voice dialing services for multiple users of the system. Each user may have his or her own dictionary of names and phone numbers. The system integrates with the existing branch exchange network, using the existing voice and control channels to cause the existing branch exchange system to perform the necessary switching operations.
Voice dialing promises to make telephones easier to use, by allowing the user to simply speak a name and then have the voice dialing system look up the telephone number of the named party and automatically place the call. In the cellular telephone market, rudimentary voice dialing systems have been experimented with to provide hands-free operation. The primary technological focus in the cellular telephone market has been on how to overcome the high ambient noise level present in the cellular telephone environment, particularly in car phone applications. There has also been some work in developing voice dialing units for the home. These units typically connect between the telephone and the outside telephone line. A primary technological focus of those units has been on how to overcome the presence of the dial tone when the user lifts the handset to use the voice dialer.
While voice dialing has made some inroads, particularly in the applications discussed above, voice dialing has yet to be incorporated into more complex telephone systems such as private branch exchange switching systems (PBX systems). There are a number of reasons for this. First, voice recognition is a challenging problem and current technology does not provide suitable recognition accuracy in an economical configuration. For example, the complex Hidden Markov Model-based systems employed by state-of-the-art speech recognizers (as in dictation transcription systems) require large amounts of memory and computational power.
Second, in the voice dialing application, the voice recognition problem is compounded where the system must be adapted for use by a large number of users. The need to respond to the spoken commands of a large number of users makes the voice dialing problem far more difficult than it is for simple voice dialing systems designed for home use.
Third, it is not a simple matter to integrate voice dialing into a complex telephone switching network. Modern-day telephone switching networks employ an intricate labyrinth of digital control signals that effect various switching functions (e.g. placing a call on hold, transferring a call, initiating a conference call, reassigning an extension to a different location and so forth). Simple voice dialing systems of the type employed in cellular phone applications or home dialing applications will not work in this more complex environment.
Finally, office PBX equipment is expensive and difficult to replace without disrupting day-to-day office functions. Thus many businesses that would benefit from voice dialing services, were such equipment available, simply cannot afford the cost and down-time required to replace that equipment with newer equipment providing voice dialing capabilities.
Thus, while the desirability of providing voice dialing in office systems is readily appreciated, current technology does not provide the means to accomplish it.
The present invention provides a voice dialing server for coupling to a branch exchange telephone system of the type that provides call switching among a plurality of telephone extension ports. The system is designed for plug-compatible connection to the existing telephone system without the need for modifying the system extensively. The voice dialing server has an interface for connection to at least one of the telephone extension ports of the existing telephone system. The interface supports transmission of voice signals and telephone system control information.
The voice dialing server also includes a speech processing module coupled to the interface for providing the following services. The speech processing module answers calls placed to the voice dialing server by users of the system. It processes speech input from the user, corresponding to a selected party to be called; and it looks up the telephone number of the selected party.
The voice dialing system also includes a branch exchange control module that is coupled to the interface and to the speech processing module. The control module issues control information to the telephone system, causing the telephone system to connect the user's extension to an outside line while dialing the phone number of the selected party. The preferred embodiment causes the extension that has been assigned to the interface to be connected to a second telephone port on the system. The second port can be another extension or an outside line. The call is then placed via the second port, and the user's extension is attached to the second port. In this way the user is placed in communication with the selected party.
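The sequence of control operations described above can be illustrated with a minimal sketch. It is not the control protocol of any actual branch exchange: the `FakeExchange` class, its `connect`, `dial`, and `attach` methods, and the port names are all hypothetical, introduced only to show the order in which the steps occur.

```python
from dataclasses import dataclass, field


@dataclass
class FakeExchange:
    """Hypothetical stand-in for the branch exchange; it only records commands."""
    log: list = field(default_factory=list)

    def connect(self, port_a, port_b):
        self.log.append(("connect", port_a, port_b))

    def dial(self, port, number):
        self.log.append(("dial", port, number))

    def attach(self, extension, port):
        self.log.append(("attach", extension, port))


def place_call(exchange, server_port, second_port, user_extension, number):
    """Issue the three control steps in the order given in the text."""
    # 1. Connect the extension assigned to the server interface to a second port.
    exchange.connect(server_port, second_port)
    # 2. Place the call via the second port.
    exchange.dial(second_port, number)
    # 3. Attach the user's extension to the second port.
    exchange.attach(user_extension, second_port)
    return exchange.log


if __name__ == "__main__":
    for step in place_call(FakeExchange(), "ext-199", "line-1", "ext-104", "5551234"):
        print(step)
```

In this sketch the second port is an outside line, but per the text it could equally be another extension.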
The system integrates fully with the existing branch exchange telephone system. Thus the invention can be readily added to an existing telephone system, simply by plugging it into an unused extension port on the system. To use the system, the user simply dials the extension assigned to the voice dialing server and follows the voice prompts issued by the server. The system is preferably implemented in a multitasking environment that allows multiple threads to run concurrently. Thus multiple users may use the system simultaneously. The system is capable of providing different phone directories for different users, and these may be automatically associated with the users' telephone extensions. The system is able to determine the extension of the user. By determining the user's extension, the voice dialing server automatically uses the phone number dictionary created by the user at that extension. Alternatively, the user can override the determined extension by supplying a different extension, thereby causing a different phone number dictionary to be used.
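The selection of a phone number dictionary by extension, with an optional override, can be sketched as follows. This is an illustration under stated assumptions: the function names, the nested-dictionary layout, and the lowercase name keys are hypothetical, and the recognized name is assumed to be already available as text.

```python
def select_directory(directories, detected_extension, override_extension=None):
    """Return the dictionary for the override extension if given, else the detected one."""
    extension = override_extension if override_extension is not None else detected_extension
    # An unknown extension yields an empty dictionary (an assumption of this sketch).
    return directories.get(extension, {})


def lookup_number(directories, detected_extension, recognized_name, override_extension=None):
    """Look up a recognized name in the dictionary selected for this caller."""
    directory = select_directory(directories, detected_extension, override_extension)
    return directory.get(recognized_name.strip().lower())


if __name__ == "__main__":
    # Hypothetical per-extension dictionaries of names and numbers.
    directories = {
        "104": {"alice": "5551234"},
        "117": {"bob": "5559876"},
    }
    print(lookup_number(directories, "104", "Alice"))
    print(lookup_number(directories, "104", "Bob", override_extension="117"))
```

The override path is what lets a caller at one extension use the dictionary belonging to a different extension.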
Although well integrated into the existing telephone system architecture, the invention can also be used by callers outside the system to reach persons inside the system or to look up numbers from the telephone book. For example, a user calling from home may connect to the voice dialing server by specifying the server's extension. Then, the user may enter his or her office telephone extension number, thereby telling the voice dialing server that the phone number dictionary assigned to the office extension should be used. Thereafter, the user calling from home can use his or her office telephone number directory just as if the user were calling from the office.
The voice dialing server uses very fast and yet remarkably accurate voice recognition technology based on reliably detected phoneme similarity regions. The preferred embodiment uses a multistage word recognizer that compactly represents speech in terms of high phoneme similarity values. This is a departure from conventional techniques that determine similarity based on a frame-by-frame alignment. The preferred embodiment uses a word recognizer that preserves only the interesting regions of high phoneme similarity or features. A word recognizer is used to narrow the search so that the subsequent fine match stage is able to perform its task more quickly. The word recognizer and fine match stages share the initial representation of speech as a sequence of multiple phoneme similarity values. By representing speech as features at a lower data rate in the initial stage of recognition, the complexity of the matching procedure is greatly reduced.
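The idea of keeping only regions of high phoneme similarity, rather than every frame, can be illustrated with a minimal sketch. This is not the recognizer of the preferred embodiment: the threshold value, the `(start, end, peak)` representation, and the sample similarity values are assumptions chosen only to show how a frame-by-frame sequence reduces to a few features.

```python
def high_similarity_regions(similarities, threshold=0.6):
    """Collapse per-frame similarity values for one phoneme into high-similarity regions.

    Each region is returned as (start_frame, end_frame, peak_value).
    The threshold is an assumed tuning parameter, not a value from the source.
    """
    regions = []
    start = None
    for i, value in enumerate(similarities):
        if value >= threshold:
            if start is None:
                start = i  # a new region begins at this frame
        elif start is not None:
            regions.append((start, i - 1, max(similarities[start:i])))
            start = None
    if start is not None:  # close a region that runs to the last frame
        regions.append((start, len(similarities) - 1, max(similarities[start:])))
    return regions


if __name__ == "__main__":
    # Hypothetical similarity of each frame to one phoneme.
    frames = [0.1, 0.7, 0.9, 0.2, 0.1, 0.65, 0.3]
    print(high_similarity_regions(frames))
```

Here seven frame values reduce to two regions, which reflects the lower data rate the text credits with simplifying the matching procedure.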
For a more complete understanding of the invention, its objects and advantages, reference may be had to the following specification and to the accompanying drawings.