This invention relates to a speech recognition device for use primarily in recognizing an input speech signal and additionally in producing a reject signal indicative of a portion of the input speech signal which cannot be recognized.
The input speech signal typically represents a sequence of connected words as an input pattern. It is known in the art that the speech recognition device comprises a similarity measure calculating unit for calculating similarity measures between the input pattern and a plurality of prepared reference patterns and for selecting a maximum value of the similarity measures as a sole similarity measure, which is herein called a provisional similarity measure. The prepared reference patterns may be either preliminarily stored in the similarity measure calculating unit or given by concatenations of selected ones of recognition units. The recognition units are, for example, phonemes, syllables, and/or isolated words, are memorized in a recognition unit memory, and are concatenated into the concatenations by the similarity measure calculating unit. The above-mentioned portion of the input speech signal may therefore be a part of the recognition units.
When produced by a conventional speech recognition device, the provisional similarity measure is strongly dependent on the circumstances under which the input pattern is produced. The circumstances may be, for example, a difference between speakers who utter the connected words. When the input pattern is recognized as a whole, the provisional similarity measure should be greater than a predetermined threshold value. Otherwise, at least a part of the input pattern represents a recognition unit which is unknown to the speech recognition device, such as an unknown word. In this instance, the speech recognition device should produce the reject signal.
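The conventional decision described above can be illustrated by a minimal sketch. It is not the device of any particular prior-art reference. It assumes, purely for illustration, that the input pattern and each reference pattern are fixed-length feature vectors and that the similarity measure is a cosine similarity; an actual device would compare time sequences of feature vectors. The names `cosine_similarity` and `recognize` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Illustrative similarity measure (an assumption, not from the source)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recognize(input_pattern, reference_patterns, threshold):
    """Return (label, provisional_similarity, reject).

    reference_patterns maps a label to a reference feature vector.
    The maximum similarity over all references is the provisional
    similarity measure. The reject flag is set unless that measure
    is greater than the predetermined threshold.
    """
    best_label = None
    provisional = -math.inf
    for label, reference in reference_patterns.items():
        similarity = cosine_similarity(input_pattern, reference)
        if similarity > provisional:
            best_label, provisional = label, similarity
    reject = not (provisional > threshold)
    return (None if reject else best_label), provisional, reject


# Hypothetical two-word vocabulary with toy feature vectors.
references = {"one": [1.0, 0.0], "two": [0.0, 1.0]}
print(recognize([0.9, 0.1], references, threshold=0.8))
print(recognize([1.0, 1.0], references, threshold=0.8))
```

In this sketch the threshold is a single fixed number, so any shift in the similarity values caused by the circumstances, such as a change of speaker, directly changes which inputs are rejected.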
As is usual, the similarity measure may instead represent a dissimilarity, such as a distance, between the input pattern and each of the prepared reference patterns. In this event, a minimum value must be used instead of the maximum value. At any rate, it has been mandatory to select the threshold value in consideration of the circumstances. Otherwise, the speech recognition device has an objectionably low reliability.
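The dissimilarity variant can be sketched in the same hedged manner. The sketch assumes fixed-length feature vectors and a Euclidean distance, neither of which is specified in the text. It further assumes that the comparison with the threshold is reversed, so that the reject signal is produced unless the minimum distance is smaller than the threshold; the text states only that a minimum value is used instead of the maximum value. The names `euclidean_distance` and `recognize_by_distance` are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Illustrative dissimilarity measure (an assumption, not from the source)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize_by_distance(input_pattern, reference_patterns, threshold):
    """Return (label, provisional_distance, reject).

    The minimum distance over all references plays the role of the
    provisional measure. Assumption: the input is rejected unless
    that minimum distance is smaller than the threshold.
    Requires at least one reference pattern.
    """
    best_label, provisional = min(
        (
            (label, euclidean_distance(input_pattern, reference))
            for label, reference in reference_patterns.items()
        ),
        key=lambda pair: pair[1],
    )
    reject = not (provisional < threshold)
    return (None if reject else best_label), provisional, reject


# Hypothetical two-word vocabulary with toy feature vectors.
references = {"one": [1.0, 0.0], "two": [0.0, 1.0]}
print(recognize_by_distance([0.9, 0.1], references, threshold=0.5))
print(recognize_by_distance([0.5, 0.5], references, threshold=0.5))
```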