1. Field of the Invention
The present invention relates to a recognition confidence measurement method, medium, and system which can determine whether an input speech signal is in-vocabulary based on an estimate of a lexical distance between candidates.
2. Description of the Related Art
Generally, in a confidence measurement method, rejection due to a recognition error, which is associated with rejection of out-of-vocabulary input, is handled with high priority so as to improve the convenience of a speech recognizer. To determine such a rejection due to a recognition error, a process of extracting a predetermined number of candidates which are determined to be similar to an input speech signal is required.
FIG. 1 is a diagram illustrating an example of extracting a candidate in a speech recognition system according to the conventional art.
As shown in FIG. 1, the conventional speech recognition method detects feature information from an input speech signal and extracts candidates using the detected feature information and acoustic knowledge. Namely, the conventional speech recognition method replaces a feature vector string, which is extracted from the input speech signal, with a lexical tree. Also, the conventional speech recognition method extracts a large number of candidates which are determined to be adjacent to the input speech signal, through a phoneme comparison with every vocabulary in a lexical search network.
In the conventional confidence measurement method, since the lexical area to be searched is extremely wide, hardware resources may be needlessly consumed. Also, in the conventional speech recognition method, a value which is extracted for each unit time domain of a feature vector is utilized to determine a candidate. Accordingly, speech recognition is slow, which makes the method unsuitable for embedded, large-capacity, high-speed speech recognition.
When a candidate can be extracted using only a pronunciation string, without constructing a vocabulary search network in the structure of a conventional lexical tree, consumption of hardware resources may be reduced. Also, when a candidate is detected based on a pronunciation string of a speech signal, rapid speech recognition may be possible.
Accordingly, there is a need for a new confidence measurement model which can improve user convenience and rapidly recognize speech by calculating a similarity between a pronunciation string of the speech and a pronunciation string of each vocabulary for recognition, and by extracting a minimum number of candidates.
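As an illustration only, the idea of comparing pronunciation strings and keeping a small number of candidates may be sketched as follows. The sketch assumes a plain edit (Levenshtein) distance over phoneme symbols as the similarity measure, which this section does not specify. The names `edit_distance`, `extract_candidates`, and `LEXICON`, as well as the phoneme spellings, are hypothetical and serve only the example.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences.

    Assumption: unit cost for insertion, deletion, and substitution.
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, pa in enumerate(a, start=1):
        curr = [i]  # deleting the first i phonemes of a
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def extract_candidates(recognized: Sequence[str],
                       lexicon: Dict[str, List[str]],
                       max_candidates: int = 3) -> List[Tuple[str, int]]:
    """Return the closest vocabulary entries as (word, distance) pairs.

    Only pronunciation strings are compared; no lexical tree or
    search network is built.
    """
    scored = sorted((edit_distance(recognized, pron), word)
                    for word, pron in lexicon.items())
    return [(word, dist) for dist, word in scored[:max_candidates]]


# Hypothetical vocabulary with illustrative phoneme spellings.
LEXICON: Dict[str, List[str]] = {
    "seoul": ["s", "eo", "u", "l"],
    "soul": ["s", "o", "u", "l"],
    "sole": ["s", "o", "l"],
    "busan": ["b", "u", "s", "a", "n"],
}

if __name__ == "__main__":
    print(extract_candidates(["s", "o", "u", "l"], LEXICON, max_candidates=2))
```

In this sketch each comparison depends only on the lengths of the two pronunciation strings, not on per-frame feature vectors, which is the property the passage above points to as the source of the potential speed and resource savings.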