In pattern recognition, incoming signals are digitized and a sequence of feature vectors is formed. These feature vectors are then compared to templates of candidate patterns to be identified in the signal. For example, in speech recognition the candidate patterns may represent names in a phonebook.
However, pattern recognition, such as speech recognition, is computationally demanding. In many cases, for example when the recognizer is implemented in an embedded device, the limited memory and computational power create a need to reduce the complexity of the algorithm.
The computational complexity depends on several factors: the sampling rate of the feature vectors, the number of candidate model templates, and the feature vector dimension. Reducing any of these factors yields faster recognition that can run in reasonable time on a given processor, but at the cost of poorer recognition accuracy.
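As a rough illustration of how these factors interact, the dominant cost can be modeled as proportional to their product. The following is a minimal sketch under that assumption; the function name and the example figures (100 feature vectors per second, 5000 model templates, 39-dimensional vectors) are illustrative, not taken from the text.

```python
def distance_ops_per_second(frame_rate_hz, num_templates, dim):
    """Hypothetical cost model: per-second distance computations scale
    with the product of the three factors named above."""
    return frame_rate_hz * num_templates * dim

# Illustrative figures only.
baseline = distance_ops_per_second(100, 5000, 39)
# Halving any one factor halves the modeled cost:
halved_rate = distance_ops_per_second(50, 5000, 39)
```

Under this simple model, halving the feature vector sampling rate, the template count, or the dimension each cuts the workload in half, which is precisely why the three reduction techniques below target these factors.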
Conventional complexity reduction of pattern recognizers, such as speech recognizers, has been addressed by at least the following prior art techniques:
1. Feature vector downsampling
2. Clustering of the model templates
3. Reduction of the feature vector dimension
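The first technique, feature vector downsampling, amounts to evaluating only every n-th feature vector against the templates. A minimal sketch (the function name and decimation factor are illustrative assumptions, not from the text):

```python
def downsample(feature_vectors, factor=2):
    """Keep every `factor`-th feature vector; the discarded frames are
    simply never scored against the candidate templates."""
    return feature_vectors[::factor]

# With factor=2, half of the frames (and hence roughly half of the
# distance computations) are skipped.
frames = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
kept = downsample(frames, factor=2)
```

The trade-off noted above applies directly: fewer frames mean proportionally fewer distance computations, but also a coarser view of the signal and thus reduced accuracy.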
The second technique first clusters the acoustic space off-line. Then, during decoding, a quick search among the clusters is performed, and only the members of the best matching cluster are evaluated.
An example of such off-line clustering is described in J. Suontausta, J. Hakkinen, and O. Viikki, "Fast decoding techniques for practical real-time speech recognition systems," Proc. IEEE Workshop on Automatic Speech Recognition and Understanding, Keystone, Colorado, December 1999.
According to this method, a codebook with a given number of code vectors is introduced, and a subset of Gaussian densities to be evaluated is assigned to each code vector. For each incoming feature vector, the closest code vector is determined, and its corresponding density subset is then used for the distortion computations.
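The decoding-time portion of this scheme can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: the codebook, the density parameters, and the subset assignment are all assumed inputs produced by the off-line clustering step, and diagonal-covariance Gaussians are assumed for simplicity.

```python
import math

def nearest_code_vector(x, codebook):
    """Full search over the (small) codebook by squared Euclidean distance."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(range(len(codebook)), key=lambda i: dist2(x, codebook[i]))

def log_gaussian(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian (illustrative)."""
    return -0.5 * sum(
        math.log(2 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def evaluate_preselected(x, codebook, subsets, densities):
    """Score only the densities assigned to the nearest code vector,
    instead of all densities in the model set."""
    idx = nearest_code_vector(x, codebook)
    return {d: log_gaussian(x, *densities[d]) for d in subsets[idx]}

# Hypothetical example: 2 code vectors, 4 densities, 2 densities per subset.
codebook = [[0.0, 0.0], [5.0, 5.0]]
densities = [
    ([0.0, 0.0], [1.0, 1.0]),
    ([1.0, 1.0], [1.0, 1.0]),
    ([5.0, 5.0], [1.0, 1.0]),
    ([6.0, 6.0], [1.0, 1.0]),
]
subsets = {0: [0, 1], 1: [2, 3]}
scores = evaluate_preselected([0.1, 0.1], codebook, subsets, densities)
```

Here only two of the four densities are evaluated for the input vector; with realistic model sets the subset is a small fraction of the total, which is the source of the savings. Note that the nearest-code-vector search itself runs over the full feature dimension, which motivates the caveat in the next paragraph.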
This method provides computational savings at similar classification performance, but requires additional parameter data. Moreover, because of the required code vector search step and the typically high dimension of the feature space, the computational savings can be significantly reduced.