1. Field of the Invention
The present invention relates to a speech recognition LSI system and, more particularly, to a speech recognition LSI system which can inform an operator of a detection error occurring in a given speech segment.
2. Description of the Related Art
A speech recognition system includes an A-D converter, a speech analyzer, a speech segment detector, a matching circuit, and a speech recognition circuit. The A-D converter converts a speech signal to a digital signal in accordance with the frequency band of the speech signal. The digital signal is input to the speech analyzer, which outputs time-sequential data of the respective frequency band. The data is input to the speech segment detector, which detects a speech segment from that time-sequential data. The detected speech segment is supplied to the matching circuit. The matching circuit compares the speech segment with a large number of registered reference patterns, determines the similarities between the speech segment and the reference pattern data items, and outputs signals representing the similarities. These signals are supplied to the speech recognition circuit, which processes them and outputs a data item representing the reference pattern most similar to the speech segment, as "recognized" data.
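The matching and recognition stages described above can be illustrated with a minimal sketch. It is not taken from the system itself: the function names (`similarity`, `match`, `recognize`), the use of cosine similarity as the similarity measure, the fixed-length feature vectors, and the reference pattern "CHICKEN" are all assumptions made for illustration only.

```python
import math


def similarity(segment, pattern):
    """Cosine similarity between two equal-length feature vectors.

    The similarity measure is an assumption; the text does not name one.
    """
    dot = sum(a * b for a, b in zip(segment, pattern))
    norm = math.sqrt(sum(a * a for a in segment)) * math.sqrt(
        sum(b * b for b in pattern)
    )
    return dot / norm if norm else 0.0


def match(segment, reference_patterns):
    """Matching stage: one similarity value per registered reference pattern."""
    return {
        label: similarity(segment, pattern)
        for label, pattern in reference_patterns.items()
    }


def recognize(similarities):
    """Recognition stage: pick the label of the most similar reference pattern."""
    return max(similarities, key=similarities.get)


# Hypothetical registered reference patterns and a hypothetical detected segment.
reference_patterns = {
    "KITCHEN": [1.0, 0.0, 1.0],
    "CHICKEN": [0.0, 1.0, 1.0],
}
detected_segment = [0.9, 0.1, 1.0]

recognized = recognize(match(detected_segment, reference_patterns))
```

In this sketch the recognition result depends entirely on the segment handed to `match`, which is why a wrongly detected segment leads directly to a wrong result.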
However, in the speech recognition system described above, whether or not speech recognition processing is correctly performed depends on whether or not the speech segment is correctly detected. A conventional speech segment detector detects, as a speech segment, any segment of time-sequential data that remains at a level equal to or higher than a reference level for a period longer than a predetermined period. Hence, the detector cannot detect, as a speech segment, a segment of the time-sequential data which is either at too low a level or which lasts for too short a time period. Assume an operator utters the word "KITCHEN" such that the first syllable "KI" is too feeble and the second syllable "TCHEN" is strong enough. In this case, those segments of the data which correspond to "KI" and "TCHEN" are respectively at a level below, and a level above, the reference level. The detector cannot detect "KI" as a speech segment, and thus detects only "TCHEN" as a speech segment. The matching circuit, therefore, compares only the speech segment corresponding to "TCHEN" with the reference patterns. The speech recognition circuit will inevitably take the reference pattern data item that is more similar to "TCHEN" than any other pattern data item as the one representing the uttered word "KITCHEN."
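The conventional detection rule described above (a level at or above a reference level, held for longer than a predetermined period) can be sketched as follows. This is only an illustration under stated assumptions: the function name `detect_segments`, the representation of the time-sequential data as one level value per frame, and the sample numbers are hypothetical and do not come from the system described.

```python
def detect_segments(levels, reference_level, min_frames):
    """Return (start, end) frame index pairs, end exclusive.

    A run of frames counts as a speech segment only if every frame is at
    or above reference_level and the run is longer than min_frames.
    """
    segments = []
    start = None  # index where the current above-threshold run began
    for i, level in enumerate(levels):
        if level >= reference_level:
            if start is None:
                start = i
        else:
            if start is not None and i - start > min_frames:
                segments.append((start, i))
            start = None
    # Close a run that lasts until the end of the data.
    if start is not None and len(levels) - start > min_frames:
        segments.append((start, len(levels)))
    return segments


# Hypothetical frame levels for "KITCHEN": a feeble "KI" (frames 1-4)
# followed by a strong "TCHEN" (frames 6-10).
levels = [0, 1, 2, 2, 1, 0, 6, 7, 8, 7, 6, 0]
segments = detect_segments(levels, reference_level=5, min_frames=3)
```

With these assumed values, the frames for "KI" never reach the reference level, so only the run corresponding to "TCHEN" is returned, and the detector gives no indication that part of the utterance was discarded.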
When the operator notices this recognition error, he or she needs to utter the same word "KITCHEN" again. However, he or she cannot know why the first utterance of "KITCHEN" was not recognized, and utters the word again in the same way as before. Consequently, the speech segment detector again detects the speech segment corresponding to "TCHEN" but not the speech segment corresponding to "KI". The operator therefore has no choice but to repeat the word "KITCHEN" until the system recognizes it. Thus, the conventional speech recognition system is inefficient.