The present invention is directed to speech recognition, and more specifically to a speech recognition driven system with selectable speech models.
A number of biometric signatures have been utilized to identify a particular individual. For example, fingerprint, retina, iris, face and voice recognition technologies have utilized pattern recognition techniques to uniquely identify a particular individual. Face and voice recognition systems are particularly attractive as they are normally unobtrusive and are passive (i.e., they do not require electromagnetic illumination of the subject of interest). A number of face recognition systems are currently available (e.g., products are offered by Visionics, Viisage and Miros). Further, some vendors offer products that utilize multiple biometric signatures to uniquely identify a particular individual. For example, Dialog Communication Systems (DCS AG) has developed BioID™, a multimodal identification system that uses face, voice and lip movement to uniquely identify an individual.
As is well known to one of ordinary skill in the art, speech recognition is a field in computer science that deals with designing computer systems that can recognize spoken words. A number of speech recognition systems are currently available (e.g., products are offered by IBM, Dragon Systems, Lernout & Hauspie and Philips). Most of these systems modify a speech model, based on a user's input, to enhance accuracy of the system. Traditionally, speech recognition systems have only been used in a few specialized situations due to their cost and limited functionality. For example, such systems have been implemented when a user is unable to use a keyboard to enter data because the user's hands are disabled. Instead of typing commands, the user speaks into a microphone.
However, as the cost of these systems has continued to decrease and their performance has continued to increase, speech recognition systems are being used in a wider variety of applications (as an alternative to keyboards or other user interfaces). For example, speech actuated control systems have been implemented in motor vehicles to control various accessories within the motor vehicles.
A typical speech recognition system implemented in a motor vehicle includes voice processing circuitry and memory for storing data that represents command words (that are employed to control various vehicle accessories). In a typical system, a microprocessor is utilized to compare the user provided data (i.e., voice input) to stored speech models to determine whether a word match has occurred and, in such an event, to provide a corresponding control output signal. The microprocessor has also normally controlled a plurality of motor vehicle accessories, e.g., a cellular telephone and a radio. Such systems have advantageously allowed a driver of the motor vehicle to maintain vigilance while driving the vehicle.
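The compare-and-signal step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each stored speech model is a fixed-length feature vector and that a match is the nearest model within a distance threshold; the names `match_command`, `control_output`, `CONTROL_SIGNALS` and the threshold value are hypothetical and are not taken from any particular system.

```python
import math
from typing import Dict, Optional, Sequence

# Hypothetical mapping from a recognized command word to a control output signal.
CONTROL_SIGNALS: Dict[str, str] = {
    "radio on": "RADIO_POWER_ON",
    "call home": "PHONE_DIAL_HOME",
}


def match_command(
    features: Sequence[float],
    models: Dict[str, Sequence[float]],
    threshold: float,
) -> Optional[str]:
    """Return the command word whose stored model is closest to the input,
    or None when no model lies within the threshold (no word match)."""
    best_word: Optional[str] = None
    best_dist = float("inf")
    for word, template in models.items():
        # Euclidean distance stands in for a real acoustic score (assumption).
        d = math.dist(features, template)
        if d < best_dist:
            best_word, best_dist = word, d
    return best_word if best_dist <= threshold else None


def control_output(
    features: Sequence[float],
    models: Dict[str, Sequence[float]],
    threshold: float = 0.5,
) -> Optional[str]:
    """Provide a control output signal only when a word match has occurred."""
    word = match_command(features, models, threshold)
    if word is None:
        return None
    return CONTROL_SIGNALS.get(word)
```

A real system would score the voice input against statistical word or subword models rather than single vectors; the sketch only shows the decision structure of match, then signal.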
Acceptance of speech recognition as a primary interface for any multi-user system (e.g., an automobile) is dependent upon the recognition accuracy of the system. As mentioned above, one method for increasing speech recognition accuracy has been to implement systems that adapt to a speaker. This has entailed storing a continuously updated version of a speech model for each word or subword in a given vocabulary. In this manner, the system adjusts to the speaking pattern of a given individual, thus increasing the probability of correct recognition. Unfortunately, such systems generally cannot be utilized by multiple users (unless the multiple users have nearly identical speech patterns).
As such, a system that provides multiple adaptable user specific speech models is desirable.
The present invention is directed to a method and system that provides a speech model based on a biometric signature. Initially, the speech recognition driven system receives a biometric signature from the user of the system. Based upon the received biometric signature, the system selects a speech model. The selected speech model is utilized to determine whether a voice input, provided by the user, corresponds to a speech selectable task that is recognized by the speech recognition driven system. When the voice input corresponds to the speech selectable task, the system causes the speech selectable task to be performed. In one embodiment, the biometric signature is an image of the user's face. When face recognition technology is implemented, the image of the user's face is utilized to select a speech model. In another embodiment, the system uses a default speech model when the system fails to recognize the biometric signature. In yet another embodiment, the system creates a new speech model when the system fails to recognize the biometric signature. In a different embodiment, the selected speech model is updated such that the system adapts to the speech pattern of the user. An advantage of the present invention is that when an individualized speech model is selected, the error rate of the speech recognition driven system is generally reduced.
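The flow summarized above (receive a biometric signature, select a speech model, fall back to a default or create a new model, then perform the matched task) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the biometric signature is reduced to a string identifier standing in for the output of a face recognizer, the speech model is reduced to a phrase-to-task lookup, and the names `SpeechModel`, `SpeechDrivenSystem`, `select_model` and `handle` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class SpeechModel:
    """Stand-in for a user specific speech model (assumption: a simple
    phrase-to-task lookup replaces real acoustic modeling)."""
    owner: str
    vocabulary: Dict[str, str] = field(default_factory=dict)

    def recognize(self, voice_input: str) -> Optional[str]:
        # Return the name of the speech selectable task, or None if no match.
        return self.vocabulary.get(voice_input.strip().lower())


class SpeechDrivenSystem:
    def __init__(
        self,
        default_model: SpeechModel,
        tasks: Dict[str, Callable[[], None]],
        create_on_unknown: bool = False,
    ) -> None:
        self.default_model = default_model
        self.tasks = tasks
        self.create_on_unknown = create_on_unknown
        # Maps a recognized biometric signature to that user's speech model.
        self.models: Dict[str, SpeechModel] = {}

    def select_model(self, signature: str) -> SpeechModel:
        """Select a speech model based on the received biometric signature."""
        model = self.models.get(signature)
        if model is not None:
            return model
        if self.create_on_unknown:
            # Embodiment: create a new model for an unrecognized signature,
            # seeded here from the default vocabulary (an assumption).
            model = SpeechModel(signature, dict(self.default_model.vocabulary))
            self.models[signature] = model
            return model
        # Embodiment: use the default model for an unrecognized signature.
        return self.default_model

    def handle(self, signature: str, voice_input: str) -> Optional[str]:
        """Perform the task matching the voice input, if any; return its name."""
        model = self.select_model(signature)
        task_name = model.recognize(voice_input)
        if task_name is None or task_name not in self.tasks:
            return None
        self.tasks[task_name]()
        return task_name
```

Keeping model selection separate from recognition means the default-model and new-model embodiments differ only in `select_model`; adapting the selected model to the user's speech pattern would be an additional step after recognition and is omitted here.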
These and other features, advantages and objects of the present invention will be further understood and appreciated by those skilled in the art by reference to the following specification, claims and appended drawings.