1. Field of the Invention
The present invention relates to voice recognition, and more particularly, to a method and an apparatus for discriminative estimation of parameters in a maximum a posteriori (MAP) speaker adaptation condition, and a voice recognition apparatus including the apparatus and a voice recognition method using the method.
2. Description of the Related Art
In MAP speaker adaptation, in order to convert a model so that it is appropriate to the voice of a new speaker, a prior density parameter, which characterizes the central point of a model parameter and the change characteristic of the parameter, should be accurately estimated. Particularly in unsupervised/incremental MAP speaker adaptation, in an initial stage when few adaptation sentences are available, the performance of voice recognition can drop even lower than the performance without a speaker adaptation function if the initial prior density parameters are wrongly estimated.
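The sensitivity to the prior in the initial stage can be illustrated with a minimal sketch. It assumes the textbook case of a single scalar Gaussian mean with a conjugate normal prior, where the MAP estimate is a weighted average of the prior mean and the adaptation data. The function name `map_mean`, the weight `tau`, and all numeric values are hypothetical and chosen only for illustration.

```python
def map_mean(prior_mean, tau, frames):
    """MAP estimate of a Gaussian mean under a conjugate normal prior.

    prior_mean: central point of the prior density (assumed scalar here)
    tau: prior weight; a larger tau means the prior is trusted more
    frames: adaptation observations assigned to this Gaussian
    """
    n = len(frames)
    return (tau * prior_mean + sum(frames)) / (tau + n)


# Hypothetical adaptation data from a new speaker, centered near 2.0.
frames = [2.0, 2.2, 1.8]

# A well-estimated prior keeps the adapted mean close to the data.
good = map_mean(prior_mean=1.9, tau=5.0, frames=frames)

# A wrongly estimated prior dominates while only a few frames are available.
bad = map_mean(prior_mean=-3.0, tau=5.0, frames=frames)

print(good, bad)
```

With only three frames, the wrongly placed prior pulls the adapted mean far from the data, which is the situation in which adaptation can perform worse than no adaptation.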
In conventional speaker adaptation, the method of moments or empirical Bayes techniques are used to estimate a prior density parameter. These methods statistically characterize the variations of respective model parameters across different speakers. However, in order to estimate reliable prior density parameters using these methods, training sets covering many speakers are required, and sufficient data for the models of the different speakers are required. In addition, since a model is converted by using the recognized result of voice recognition in unsupervised/incremental speaker adaptation, the model is adapted in a wrong direction by incorrectly recognized results if there is no verification process.
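As a rough sketch of why many speakers are needed, the following shows one simple moment-matching reading of prior estimation, not necessarily the estimator used in the conventional methods referred to above. It assumes a scalar Gaussian mean with a prior of the form N(m, within_var / tau), so the prior weight follows from the ratio of within-speaker variance to between-speaker variance. The function name `moment_prior` and the input values are hypothetical.

```python
from statistics import mean, pvariance


def moment_prior(speaker_means, within_var):
    """Estimate a prior mean and weight from speaker-dependent means.

    speaker_means: the same model mean trained separately for each speaker
    within_var: assumed known variance of the Gaussian within one speaker
    """
    m = mean(speaker_means)                    # central point of the prior
    between_var = pvariance(speaker_means, m)  # spread across speakers
    tau = within_var / between_var             # prior weight implied by the moments
    return m, tau


# Hypothetical speaker-dependent means from three training speakers.
m, tau = moment_prior([1.0, 2.0, 3.0], within_var=1.0)
print(m, tau)
```

With few speakers the between-speaker variance is poorly determined, and it is zero for a single speaker, so the resulting prior weight is unreliable or undefined.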
MAP speaker adaptation is confronted with three key problems: how to define the characteristics of the prior distribution, how to estimate the parameters of unobserved models, and how to estimate the parameters of the prior density. Many articles have been presented on what prior density functions to use and how to estimate the parameters of the density functions. A plurality of articles have presented solutions for the estimation of the parameters of unobserved models, and an invention which adapts model parameters of the speaker-independent Hidden Markov Model (HMM) has been granted a patent (U.S. Pat. No. 5,046,099).
Discriminative training methods were first applied to model training in the field of voice recognition (U.S. Pat. No. 5,606,644, U.S. Pat. No. 5,806,029), and later applied to the field of utterance verification (U.S. Pat. No. 5,675,506, U.S. Pat. No. 5,737,489).