The present invention relates to voice recognition systems and methods involving training to identify an instruction corresponding to a voice signal.
Conventional voice recognition systems are categorized generally as either speaker independent systems, which are intended to recognize instructions corresponding to voice signals without training of the system to identify such instructions, or speaker dependent systems, which employ such training. In the case of speaker dependent systems, voice samples are supplied to the system in response to a request from the system that a certain word or group of words be spoken. The system processes each received voice signal to produce voice recognition data for future use in identifying an instruction corresponding to the same word or words expressed by the voice signal. In general, the greater the number of such samples provided to the system, the more reliably it subsequently identifies the instruction corresponding to a particular voice signal.
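By way of illustration only, the training of a speaker dependent system described above might be sketched as follows. The class name, the representation of a voice sample as a fixed-length list of numeric features, the averaging of samples into a template, and the nearest-template matching by Euclidean distance are all assumptions made for this sketch; none is taken from any particular conventional system.

```python
import math


class SpeakerDependentRecognizer:
    """Hypothetical toy recognizer: one averaged template per instruction."""

    def __init__(self):
        # Maps an instruction (e.g. "call home") to its stored voice samples.
        self._samples = {}

    def train(self, instruction, features):
        """Store one voice sample, given as a feature vector (assumed format)."""
        self._samples.setdefault(instruction, []).append(list(features))

    def template(self, instruction):
        """Return the element-wise mean of all samples for the instruction."""
        samples = self._samples[instruction]
        count = len(samples)
        return [sum(column) / count for column in zip(*samples)]

    def recognize(self, features):
        """Return the instruction whose template is nearest, or None if untrained."""
        best_instruction, best_distance = None, math.inf
        for instruction in self._samples:
            distance = math.dist(features, self.template(instruction))
            if distance < best_distance:
                best_instruction, best_distance = instruction, distance
        return best_instruction
```

In this sketch each additional call to `train` for an instruction refines that instruction's template, which loosely mirrors the observation that more samples tend to yield more reliable identification.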
The training required to operate such speaker dependent systems is typically quite lengthy and complex. Users often find the training procedures tedious and wasteful.
Training is normally conducted in a single session on a given day. During the session, the user of the system provides a large number of voice samples to the system so that it can "train" by matching the received voice samples to data indicating the corresponding instruction. However, a speaker's voice changes from day to day. For example, illness or stress can cause the voice to change over time. Consequently, the voice samples provided during the usual single training session might not be fairly representative of the speaker's voice under different conditions.