1. Field of the Invention
The present invention relates to a speech recognition method and apparatus for recognizing input speech, and a computer-readable memory.
2. Description of the Related Art
In a speech recognition technique, standard patterns (words or sentences) of standard input information are registered in some form in advance, and speech recognition is performed by comparing the registered standard patterns with an input utterance. Registration forms include, for example, forms using a phonemic expression and generative grammar. In speech recognition, scores representing the similarity between input speech and the standard patterns are determined, and a standard pattern exhibiting the highest score is determined as a speech recognition result.
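The selection of the highest-scoring standard pattern can be sketched as follows. This is only an illustration under loud assumptions: the inputs are phonemic strings rather than audio, the `similarity` function is a toy string measure standing in for an acoustic score, and the patterns other than "kanagawa" are hypothetical entries added for the example.

```python
from difflib import SequenceMatcher


def similarity(observed: str, pattern: str) -> float:
    """Toy similarity score in [0, 1]; a stand-in for a real acoustic score."""
    return SequenceMatcher(None, observed, pattern).ratio()


def recognize(observed: str, patterns: list[str]) -> tuple[str, float]:
    """Return the registered pattern with the highest score, plus that score."""
    scored = [(pattern, similarity(observed, pattern)) for pattern in patterns]
    return max(scored, key=lambda item: item[1])


# Hypothetical registered standard patterns (phonemic expressions).
patterns = ["kanagawa", "kagawa", "kanazawa"]

best, score = recognize("kanagawa", patterns)
noisy_best, noisy_score = recognize("kanagwa", patterns)
```

Here `recognize` scores every registered pattern against the input and keeps the maximum, which is the decision rule described above; a real recognizer would replace `similarity` with a score computed from acoustic models.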
One method of inputting speech for speech recognition is to separate an utterance into its syllables. When, for example, “kanagawa” is to be input, the user utters the syllables separately, as “ka, na, ga, wa”. This input method is called single syllable articulation.
In speech recognition for speech input by single syllable articulation, the following two methods have been used.
1. Speech recognition is performed for speech information obtained by removing periods regarded as silence periods from input speech.
2. Patterns input by single syllable articulation are also registered as patterns to be subjected to speech recognition, and speech recognition is performed, including speech recognition for each pattern.
According to method 1, periods regarded as silence periods are removed from input speech, and speech recognition is performed for the speech information obtained by connecting the remaining periods of speech (FIG. 7).
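A minimal sketch of method 1 is given below, assuming that the input has already been cut into non-empty frames of samples and that a frame counts as silence when its mean energy falls below a fixed threshold. The energy criterion, the threshold value, and the function name `remove_silence` are assumptions made for illustration; the source does not specify how silence periods are determined.

```python
def remove_silence(frames, threshold):
    """Drop frames whose mean energy is below `threshold`, then join the rest.

    `frames` is a list of non-empty lists of samples (an assumed representation).
    """
    def energy(frame):
        return sum(sample * sample for sample in frame) / len(frame)

    kept = [frame for frame in frames if energy(frame) >= threshold]
    # Connect the remaining speech periods into one sample sequence.
    return [sample for frame in kept for sample in frame]


# Three speech frames separated by two near-silent frames.
frames = [[0.5, -0.4], [0.0, 0.01], [0.6, 0.3], [0.0, 0.0], [-0.7, 0.2]]
speech = remove_silence(frames, threshold=0.01)

# A quiet but genuine speech frame is discarded by the same rule.
quiet = remove_silence([[0.05, 0.05]], threshold=0.01)
```

The second call shows the weakness of a simple criterion: a frame that is quiet but still speech is treated as silence and lost before recognition begins.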
According to method 2, when the input speech is “kanagawa”, not only the pattern “kanagawa” but also the pattern “ka (silence period) na (silence period) ga (silence period) wa” are registered as standard patterns. When the highest score is obtained between the input speech and the standard pattern registered as “ka (silence period) na (silence period) ga (silence period) wa”, “kanagawa” is used as the speech recognition result.
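The double registration in method 2 can be sketched as a lookup table that maps each standard pattern to the word reported as the result. The dictionary representation, the `SILENCE` marker string, and the function name `registered_patterns` are assumptions for illustration only.

```python
SILENCE = "(silence period)"


def registered_patterns(syllables):
    """Register a word twice: as a continuous pattern and as a pattern with
    a silence period between syllables. Both map to the same result word."""
    continuous = "".join(syllables)
    articulated = f" {SILENCE} ".join(syllables)
    return {continuous: continuous, articulated: continuous}


lexicon = registered_patterns(["ka", "na", "ga", "wa"])
articulated_key = "ka (silence period) na (silence period) ga (silence period) wa"
result = lexicon[articulated_key]
```

Every word contributes two entries, so the number of patterns to be scored doubles with the vocabulary.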
The following problems are posed in the above speech recognition.
First, in method 1, an erroneous determination of voiced/silence periods adversely affects the recognition result. To determine accurately whether a given period is silence, processing similar to speech recognition is required, in which case problems similar to those posed in method 2 arise.
In method 2, two types of standard patterns, i.e., a pattern input by single syllable articulation and a pattern input by the other method, must be registered for each input speech. This increases the amount of processing. In addition, the recognition rate is generally low in the environment at the beginning of a word (immediately after a silence period). In single syllable articulation, every syllable lies in the environment at the beginning of a word, so the reliability of the recognition result is low. There is another problem. In many cases, speech recognition is executed together with speech segmentation processing, which automatically detects the start and end points of an utterance. In single syllable articulation, the presence of a silence period between syllables tends to cause a speech segmentation error, i.e., a silence period inserted in a word is erroneously recognized as the end of the utterance. When such a speech segmentation error occurs, speech recognition performed on the resulting speech segment is unlikely to yield an accurate result.
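The segmentation error described above can be illustrated with a simple end-point detector. This is a sketch under assumptions: the input is a per-frame speech/silence flag, and the utterance is declared finished once the pause after speech exceeds `max_pause_frames`. The rule, the parameter, and the frame sequences are hypothetical and are not taken from the source.

```python
def detect_end(is_speech, max_pause_frames):
    """Return the index one past the last speech frame of the detected utterance.

    The utterance ends when a pause longer than `max_pause_frames` follows speech.
    """
    started = False
    pause = 0
    last_speech = -1
    for index, voiced in enumerate(is_speech):
        if voiced:
            started = True
            pause = 0
            last_speech = index
        elif started:
            pause += 1
            if pause > max_pause_frames:
                break  # pause too long: treat it as the end of the utterance
    return last_speech + 1


# Continuous utterance: 8 speech frames, then trailing silence.
continuous = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
# Single syllable articulation: 4 syllables of 2 frames, 3-frame pauses.
articulated = [1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0]

end_continuous = detect_end(continuous, max_pause_frames=2)
end_articulated = detect_end(articulated, max_pause_frames=2)
```

With the same pause limit, the continuous utterance is captured in full (8 frames), while the separately uttered input is cut off after the first syllable (2 frames), so only a fragment would reach the recognizer.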
The present invention has been made in consideration of the above problem, and has as its object to provide a speech recognition method and apparatus which can recognize speech with high efficiency and accuracy, and a computer-readable memory.
In order to achieve the above object, a speech recognition apparatus according to the present invention determines whether speech separately uttered as a single syllable is included in the input speech by comparing a first score, calculated by comparing an arbitrary syllable sequence with the input speech, with a second score, calculated from a speech recognition result of the input speech, and prompts the user, on the basis of the determination result, to input the speech again by speaking continuously without any pause.
A speech recognition method according to the present invention is also presented, in which it is determined whether speech separately uttered as a single syllable is included in the input speech by comparing a first score, calculated by comparing an arbitrary syllable sequence with the input speech, with a second score, calculated from a speech recognition result of the input speech, and the user is prompted, on the basis of the determination result, to input the speech again by speaking continuously without any pause.
In an alternative method of recognizing input speech, a speech input is received from a speaker and a first score of a single syllable sequence using a speech recognition algorithm and a second score of a speech recognition result for the input speech are calculated. The first and second scores are compared and the speech recognition result is output when the second score is larger than the first score.
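The comparison of the two scores can be sketched as follows. The scores are assumed to be already computed (for instance as log-likelihoods, where larger is better); how they are obtained is not shown. Treating the re-input prompt as the branch taken when the second score is not larger is an assumption that combines this method with the prompting behavior described above. The names `Decision`, `decide`, and `REPROMPT` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical prompt wording; the source only says the user is prompted.
REPROMPT = "Please speak again continuously, without pausing."


@dataclass
class Decision:
    accepted: bool  # True when the recognition result is output
    text: str       # the recognition result, or the prompt to the user


def decide(first_score: float, second_score: float, result: str) -> Decision:
    """Compare the syllable-sequence score with the recognition-result score.

    first_score:  score of an arbitrary (single) syllable sequence
    second_score: score of the speech recognition result
    """
    if second_score > first_score:
        return Decision(accepted=True, text=result)
    # Assumed fallback: the input probably contains separately uttered syllables.
    return Decision(accepted=False, text=REPROMPT)


accepted = decide(first_score=-120.0, second_score=-95.0, result="kanagawa")
rejected = decide(first_score=-80.0, second_score=-95.0, result="kanagawa")
```

In the first call the recognition result scores higher and is output; in the second the syllable sequence fits the input better, which this sketch takes as a sign of single syllable articulation.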
In order to achieve the above object, a computer-readable memory according to the present invention includes program code for recognizing input speech, the program code configuring a system to determine whether speech separately uttered as a single syllable is included in the input speech by comparing a first score, calculated by comparing an arbitrary syllable sequence with the input speech, with a second score, calculated from a speech recognition result of the input speech, and to prompt the user, on the basis of the determination result, to input the speech again by speaking continuously without any pause.
Other features and advantages of the present invention will be apparent from the following description taken in conjunction with the accompanying drawings, in which like reference characters designate the same or similar parts throughout the figures thereof.