1. Field of the Invention
The present invention relates generally to a discriminant-function or identification-function calculator, a discriminant-function or identification-function calculating method, a classification or identification unit, a classification or identification method, and a speech recognition system. More particularly, the invention relates to an identification-function calculator, an identification-function calculating method, an identification unit, an identification method, and a speech recognition system, all of which are suitable for performing pattern recognition, for example, speech recognition and image recognition.
2. Description of the Related Art
Pattern recognition, such as speech recognition and image recognition, is typically performed by extracting feature (characteristic) vectors from input patterns and calculating the values of discriminant or identification functions using the characteristic vectors as input values. The identification functions are used for classifying the input characteristic vectors into a predetermined number of classes. The number of functions is equal to or greater than the number of classes. The class corresponding to the identification function having the greatest value for the input characteristic vector is output as the recognition result (classification result).
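The classification rule described above can be illustrated with a minimal sketch. It assumes linear identification functions purely for illustration; the names `make_linear_function`, `classify`, and `class_of_function` are hypothetical and do not come from the cited literature. Because the number of functions may exceed the number of classes, each function carries its own class label.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
IdentificationFunction = Callable[[Vector], float]


def make_linear_function(weights: Vector, bias: float) -> IdentificationFunction:
    """Build one identification function g(x) = w . x + b.

    The linear form is an assumption made for this sketch only.
    """
    def g(x: Vector) -> float:
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return g


def classify(x: Vector,
             functions: List[IdentificationFunction],
             class_of_function: List[str]) -> str:
    """Return the class whose identification function has the greatest value."""
    values = [g(x) for g in functions]
    best = max(range(len(values)), key=values.__getitem__)
    return class_of_function[best]


# Three functions for two classes: class "a" is covered by two functions.
functions = [
    make_linear_function([1.0, 0.0], 0.0),
    make_linear_function([0.0, 1.0], 0.0),
    make_linear_function([-1.0, 0.0], 0.0),
]
class_of_function = ["a", "i", "a"]

result = classify([0.2, 0.9], functions, class_of_function)
```

In this sketch, several functions may map to the same class, which is one way to satisfy the condition that the number of functions is equal to or greater than the number of classes.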
In pattern recognition, it is desirable to obtain high recognition performance regardless of changes in the state of variation factors. Hence, hitherto, the learning of the identification functions for performing pattern recognition has been carried out by use of learning samples including a large number of variations. More specifically, for performing, for example, speech recognition, learning is conducted by use of speech data including a large number of variations as learning samples so as to obtain identification functions (for example, phoneme-discriminant functions when phoneme recognition is carried out) that are sufficiently robust to changes in the state of speech-variation factors, such as the sound-making environments, the speakers, and the characteristics of the input apparatus systems (for example, the characteristics of the microphones and of the analog-to-digital converters for converting the outputs from the microphones). The aforementioned learning method is described in, for example, "Speech Recognition with Probabilistic Models" by Seiichi Nakagawa, the Institute of Electronics, Information and Communication Engineers, and "Context-Dependent Phonetic Hidden Markov Models for Speaker-Independent Continuous Speech Recognition" by Kai-Fu Lee, IEEE Transactions on ASSP, Vol. 38, No. 4, April 1990.
Referring to FIG. 7 illustrating the configuration of a typical conventional identification-function calculator, a great number of learning samples are input into an identification-function calculator 51 in which an identification function, i.e., a parameter representing (forming) the identification function, is determined based on the learning samples.
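As a rough illustration of determining the parameters of an identification function from learning samples, the following sketch uses a nearest-mean rule. This is an assumption chosen for brevity, not the method of the conventional calculator 51 or of the cited references: the parameter of each class is simply the mean of its learning samples, and the identification function is the negative squared distance to that mean. The names `learn_class_means` and `identification_value` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple

Vector = Sequence[float]


def learn_class_means(samples: Iterable[Tuple[Vector, str]]) -> Dict[str, List[float]]:
    """Determine one parameter vector (the sample mean) per class.

    `samples` is an iterable of (characteristic_vector, class_label) pairs.
    """
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for x, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(x)
        sums[label] = [s + xi for s, xi in zip(sums[label], x)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def identification_value(x: Vector, mean: Vector) -> float:
    """Assumed identification function: negative squared distance to the mean."""
    return -sum((xi - mi) ** 2 for xi, mi in zip(x, mean))


# Toy learning samples; real learning samples would cover many variations.
learning_samples = [
    ([0.0, 0.0], "a"),
    ([2.0, 0.0], "a"),
    ([0.0, 4.0], "i"),
]
means = learn_class_means(learning_samples)
```

The learned means play the role of the parameters that represent the identification functions; an input vector would then be assigned to the class whose `identification_value` is greatest.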
However, satisfactory recognition performance cannot always be ensured by use of the identification functions obtained through the above-described learning method. For better recognition performance, a method is available, for example, for adapting the identification functions to the states of the speech-variation factors during recognition. A method for adapting, for example, the phoneme-discriminant functions to the speaker is disclosed in various technical literature, such as "A Study on Speaker Adaptation of the Parameters of Continuous Density Hidden Markov Models" by Chin-Hui Lee, et al., IEEE Transactions on Signal Processing, Vol. 39, No. 4, 1991, and "Fast Speaker Adaptation for Speech Recognition Systems" by F. Class, et al., Proceedings of IEEE ICASSP, pp. 133-136, 1990.
Hitherto, learning has been carried out in such a manner that the identification functions are determined without regard to the aforementioned adaptation method employed during recognition. Namely, the identification functions are determined based on a criterion under which the highest performance is obtained when adaptation is not performed. Consequently, optimum adaptation cannot always be achieved during recognition by use of the identification functions determined by the above-described technique. This makes it difficult to significantly improve the identification accuracy (recognition accuracy) even when adaptation is performed.