The present invention is directed to speech recognition. It is directed particularly to those parts of speech-recognition systems used in recognizing patterns in data-reduced versions of the received speech.
Most systems for recognizing speech employ some means of reducing the data in the raw speech to representations that include less than all of the data that would be included in a straight digitization of the speech-signal input but that still contain most if not all of the data needed to identify the meaning intended by the speaker. In development, or "training," of the speech-recognition system, the task is to identify the patterns in the reduced-data representations that are characteristic of speech elements, such as words or phrases. Of course, the sounds made by different speakers uttering the same phrases are different, and there are other sources of ambiguity, such as noise and the inaccuracy of the modeling process. Accordingly, routines are used to assign likelihoods to various mathematical combinations of the elements of the reduced-data representation of the speech, and various hypotheses are tested to determine which one of a number of possible speech elements is most likely the one currently being spoken.
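By way of illustration only, the hypothesis-testing step described above can be sketched as follows. The sketch assumes a diagonal-Gaussian likelihood model and a two-word vocabulary; neither is specified in the foregoing description, and the names `log_likelihood`, `most_likely_element`, and `MODELS` are hypothetical. It scores one reduced-data feature vector against each candidate speech element and selects the candidate with the highest score.

```python
import math


def log_likelihood(features, mean, var):
    """Log-likelihood of a feature vector under a diagonal Gaussian.

    The Gaussian form is an assumption made for this sketch only.
    """
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(features, mean, var)
    )


def most_likely_element(features, models):
    """Score every hypothesis and return (best_label, all_scores)."""
    scores = {
        label: log_likelihood(features, mean, var)
        for label, (mean, var) in models.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical models: label -> (per-dimension mean, per-dimension variance).
MODELS = {
    "yes": ([1.0, 0.0], [0.5, 0.5]),
    "no": ([-1.0, 0.5], [0.5, 0.5]),
}

best, scores = most_likely_element([0.8, 0.1], MODELS)
print(best, scores)
```

Every candidate must be scored for every incoming feature vector, so the cost of this step grows with the number of speech elements under consideration.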
The processes for performing these operations tend to be computation-intensive. The likelihoods must be determined for large numbers of speech elements, and the limitation on computation imposed by requirements of, for instance, real-time operation limits the sensitivity of the pattern-recognition algorithm that can be employed.
It is accordingly an object of the present invention to increase the computational time that can be dedicated to recognition of a given pattern but to do so without increasing the time required for the total speech-recognition process. It is a further object to improve the speech-recognition process.