1. Field of the Invention
The present invention relates to a speech recognition method capable of processing input speech that includes an unknown word, and to an apparatus and a computer-controlled apparatus therefor.
2. Related Background Art
For processing an unanticipated input (hereinafter called an unknown word), the following two methods have mainly been proposed:
(1) a method of detecting the unknown word by describing sequences of phonemes or syllables as a grammar, forming an HMM network according to that grammar, incorporating the network into the grammar used for recognition, and multiplying the output probability by a penalty at the time of recognition; and
(2) a method of training in advance, with various data, on the words to be processed as unknown words, thereby preparing a garbage model.
Both methods have been commonly used and are considered effective.
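The penalty step of method (1) can be sketched as a simple comparison. The following is only an illustration under stated assumptions, not the implementation of any particular recognizer: the function name `pick_hypothesis`, the label `<unk>`, and all scores are hypothetical, and the scores are assumed to be log probabilities, so that multiplying the output probability by a penalty becomes adding the logarithm of the penalty.

```python
import math

def pick_hypothesis(word_log_probs, filler_log_prob, penalty):
    """Choose between the best in-vocabulary word and the unknown-word path.

    word_log_probs:  dict mapping each vocabulary word to its log probability
    filler_log_prob: log probability of the phoneme/syllable network path
    penalty:         factor in (0, 1] multiplied into the filler probability
    (All names and values here are hypothetical, for illustration only.)
    """
    # Multiplying a probability by a penalty is adding log(penalty) in the log domain.
    penalized_filler = filler_log_prob + math.log(penalty)
    best_word = max(word_log_probs, key=word_log_probs.get)
    if penalized_filler > word_log_probs[best_word]:
        return "<unk>"
    return best_word

scores = {"tokyo": -50.0, "kyoto": -55.0}
print(pick_hypothesis(scores, -45.0, 1e-3))  # strong penalty: vocabulary word wins
print(pick_hypothesis(scores, -45.0, 0.5))   # weak penalty: unknown-word path wins
```

The penalty is needed because an unconstrained phoneme network can match almost any sound sequence and would otherwise tend to outscore the vocabulary words.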
However, method (1), although capable of accepting any sequence of sounds as the unknown word, requires a considerable amount of Viterbi calculation and a considerable memory capacity for processing the unknown word. Moreover, in N-best speech recognition, which provides plural recognition candidates, method (1) describes the word as a chain of models and may therefore output many unknown-word candidates that differ only in their phoneme sequences. Since the only useful information for a given unknown word section is that the word is unknown, such candidates may render the N-best output meaningless.
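The N-best problem described above can be illustrated with a small sketch. It assumes a hypothetical output format in which an unknown word is emitted as a token of the form `<unk:...>` carrying its phoneme sequence; the function name `collapse_unknowns` and the example list are likewise assumptions made only for illustration.

```python
def collapse_unknowns(nbest):
    """Merge hypotheses that differ only in the phonemes of an unknown word.

    nbest: list of (tokens, score) pairs, assumed sorted best first.
    Tokens of the hypothetical form "<unk:...>" are reduced to "<unk>".
    """
    seen = set()
    collapsed = []
    for tokens, score in nbest:
        key = tuple("<unk>" if tok.startswith("<unk:") else tok for tok in tokens)
        if key not in seen:  # keep only the best-scoring variant
            seen.add(key)
            collapsed.append((list(key), score))
    return collapsed

nbest = [
    (["go", "to", "<unk:k a t o>"], -10.0),
    (["go", "to", "<unk:k a d o>"], -10.2),
    (["go", "to", "<unk:g a t o>"], -10.5),
    (["go", "to", "tokyo"], -11.0),
]
for tokens, score in collapse_unknowns(nbest):
    print(score, " ".join(tokens))
```

In this made-up list, three of the four candidates carry the same information, namely that the last word is unknown, so only two distinct candidates remain after collapsing.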
On the other hand, method (2) requires only the increase in the amount of calculation and in the memory capacity corresponding to the ergodic model and, because it provides only one unknown-word candidate for the unknown word section, is well suited to N-best speech recognition generating plural recognition candidates. It is, however, necessary to train the model in advance, with various data, on the words to be processed as unknown words, and speech that does not appear in the training data cannot be accepted.
Furthermore, both methods are disadvantageous in terms of the amount of calculation and the memory capacity, since each requires a search process (trellis or Viterbi search), a search space (trellis space) for that search, and a special calculation of the output probability, such as that of an ergodic model.
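To make the cost of the search process concrete, the following is a minimal textbook Viterbi search, given as a sketch rather than as the search used by either method. The trellis holds one score and one back-pointer per (frame, state) cell, so memory grows with the number of frames times the number of states, and the inner loops grow with the square of the number of states. The function names and the toy two-state model are assumptions made for illustration.

```python
import math

def viterbi(log_init, log_trans, log_emit):
    """Return (best log score, best state path).

    log_init[s]     : log probability of starting in state s
    log_trans[p][s] : log probability of moving from state p to state s
    log_emit[t][s]  : log output probability of frame t in state s
    """
    n_frames, n_states = len(log_emit), len(log_init)
    # The trellis space: one score and one back-pointer per (frame, state) cell.
    score = [[float("-inf")] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]
    for s in range(n_states):
        score[0][s] = log_init[s] + log_emit[0][s]
    for t in range(1, n_frames):
        for s in range(n_states):
            prev = max(range(n_states),
                       key=lambda p: score[t - 1][p] + log_trans[p][s])
            score[t][s] = score[t - 1][prev] + log_trans[prev][s] + log_emit[t][s]
            back[t][s] = prev
    last = max(range(n_states), key=lambda s: score[-1][s])
    path = [last]
    for t in range(n_frames - 1, 0, -1):  # trace the back-pointers
        path.append(back[t][path[-1]])
    path.reverse()
    return score[-1][last], path

def trellis_cells(n_frames, n_states):
    """Number of (frame, state) cells the search above must store."""
    return n_frames * n_states

lg = math.log
init = [lg(0.5), lg(0.5)]
trans = [[lg(0.7), lg(0.3)], [lg(0.3), lg(0.7)]]
emit = [[lg(0.8), lg(0.2)], [lg(0.8), lg(0.2)], [lg(0.2), lg(0.8)]]
best_score, best_path = viterbi(init, trans, emit)
print(best_path, trellis_cells(len(emit), len(init)))
```

Every state added to the network for unknown words enlarges each frame's column of the trellis, which is why the search space for unknown words bears directly on memory and computation.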
Even when the input speech contains language information other than that anticipated for input (an unknown word or an unnecessary word), the present invention allows such a word to be detected while reducing the search space (for example, the trellis space) for the unknown words and the memory required therefor. As a result, high-performance speech recognition with a function for processing unknown words can be realized in a compact manner.