This invention relates to speech recognition apparatus and methods.
In complex equipment having multiple functions, it can be useful to be able to control the equipment by spoken commands. Spoken control is also useful where the user's hands are occupied with other tasks, or where the user is disabled and unable to use his hands to operate conventional mechanical switches and controls.
Programming of speech recognition apparatus is achieved by reading out a list of words or phrases to be entered into a reference vocabulary. The speech sounds are broken down into spectral components and stored as spectral-temporal word models or templates.
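By way of illustration only, the breakdown of speech sounds into spectral components and their storage as templates might be sketched as follows. The function names, the frame length, the number of bands and the synthetic test tones are all assumptions made for the example and are not taken from the apparatus described; a naive discrete Fourier transform stands in for whatever spectral analysis a practical apparatus would use.

```python
import cmath
import math


def spectral_frames(samples, frame_len=64, n_bands=4):
    """Split a signal into fixed-length frames and return, for each frame,
    the energy in a small number of coarse frequency bands."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Naive DFT magnitudes for the lower half of the frequency bins.
        mags = []
        for k in range(frame_len // 2):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n, x in enumerate(frame))
            mags.append(abs(acc))
        # Pool adjacent bins into coarse bands (the spectral components).
        per_band = len(mags) // n_bands
        bands = [sum(m * m for m in mags[b * per_band:(b + 1) * per_band])
                 for b in range(n_bands)]
        frames.append(bands)
    return frames


def make_tone(freq_bin, n_samples, frame_len=64):
    """Hypothetical stand-in for a spoken word: a pure tone centred on one DFT bin."""
    return [math.sin(2 * math.pi * freq_bin * n / frame_len)
            for n in range(n_samples)]


# The reference vocabulary maps each word to its sequence of spectral frames,
# that is, a simple spectral-temporal template.
vocabulary = {
    "low": spectral_frames(make_tone(3, 256)),
    "high": spectral_frames(make_tone(27, 256)),
}
```

Each template here is a list of frames ordered in time, each frame being a list of band energies, so that both the spectral and the temporal structure of the utterance are retained.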
When an unknown word is subsequently spoken, it is likewise broken down into its spectral components, and these are compared with the reference vocabulary by means of a suitable algorithm, such as one based on the Hidden Semi-Markov Model. The reference vocabulary is preferably established by multiple repetitions of the same word in different circumstances and by different people. This introduces a spread, or broadening, of the word models, so that there is a higher probability that the same word, when subsequently spoken, will be identified against that word model. However, the broadening can cause similar word models to overlap, leading to a greater probability of an incorrect identification.
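The effect of multiple repetitions on the word models can be illustrated with a deliberately simplified sketch. It does not implement a Hidden Semi-Markov Model; instead, each word is represented by a single feature vector per utterance and modelled as an independent Gaussian per dimension, which is an assumption made purely to show how repetitions produce a spread and how an unknown word is scored against each model. The words, feature values and function names are hypothetical.

```python
import math


def train_word_model(examples, min_var=1e-3):
    """Fit a per-dimension mean and variance from repeated utterances of one word.
    The variance is the 'spread' introduced by the repetitions."""
    n = len(examples)
    dims = len(examples[0])
    means = [sum(e[d] for e in examples) / n for d in range(dims)]
    variances = [
        max(sum((e[d] - means[d]) ** 2 for e in examples) / n, min_var)
        for d in range(dims)
    ]
    return means, variances


def log_likelihood(features, model):
    """Log-probability of a feature vector under a diagonal Gaussian word model."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(features, means, variances)
    )


def recognise(features, vocabulary):
    """Return the vocabulary word whose model scores the unknown word highest."""
    return max(vocabulary, key=lambda w: log_likelihood(features, vocabulary[w]))


# Hypothetical feature vectors: four repetitions of each of two words.
repetitions = {
    "left": [[1.0, 0.2], [1.2, 0.1], [0.8, 0.3], [1.1, 0.25]],
    "right": [[3.0, 2.0], [3.3, 1.8], [2.7, 2.2], [3.1, 2.1]],
}
vocabulary = {word: train_word_model(ex) for word, ex in repetitions.items()}

unknown = [1.05, 0.2]
print(recognise(unknown, vocabulary))
```

In this sketch, a larger variance makes a model more tolerant of utterances that differ from the training repetitions, but it also raises the score that the model gives to utterances of neighbouring words, which is the overlap referred to above.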
The use of neural nets has also been proposed but these are not suitable for identification of continuous speech.
Accurate identification of spoken words is more difficult to achieve in adverse circumstances, such as with high background noise or when the speaker is subject to stress.