1. Field of the Invention
This invention relates to a speech detecting method for detecting the interval of an input speech in a speech recognition system.
2. Description of the Prior Art
Heretofore, the interval of an input speech has been detected principally from the power information of the input speech, with the zero-crossing information of the input speech also being employed empirically. The method employing the zero-crossing information utilizes the fact that the number of zero-crossings, that is, the number of times the waveform crosses the zero axis, is larger in unvoiced consonants, which have substantial high-frequency components, than in voiced phones and noise, which have substantial low-frequency components. However, when the distributions of the number of zero-crossings for unvoiced consonants, voiced phones, and noise are investigated, it is found that they overlap over much of their range, and so it is difficult to achieve a high-precision classification by resorting to the number of zero-crossings.
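By way of illustration only, the two prior-art measures mentioned above, short-time power and the number of zero-crossings per frame, may be sketched as follows. The 8 kHz sampling rate, the 160-sample frame, the test frequencies, and the function names are assumptions chosen for the sketch and do not appear in the prior art described.

```python
import math

def frame_power(frame):
    """Short-time power: mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossings(frame):
    """Number of sign changes between consecutive samples of one frame."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))

# Assumed parameters (hypothetical): 8 kHz sampling, 20 ms frame.
RATE = 8000
N = 160

def tone(freq_hz, amplitude=1.0, phase=0.3):
    """Synthetic sinusoid standing in for a speech or noise frame."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / RATE + phase)
            for n in range(N)]

low = tone(200)    # stands in for a voiced phone or low-frequency noise
high = tone(3000)  # stands in for an unvoiced consonant such as "s"

print("low :", zero_crossings(low), round(frame_power(low), 3))
print("high:", zero_crossings(high), round(frame_power(high), 3))
```

In this idealized case the high-frequency frame yields far more zero-crossings than the low-frequency frame, although the two frames have nearly equal power. Real unvoiced consonants, voiced phones, and noise are not pure tones, which is why their zero-crossing distributions overlap as noted above.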
According to the prior-art method described above, it has been difficult to detect unvoiced consonants (e.g., "s" and "h") at the starting point and end point of input speech. Therefore, the threshold value has been lowered in order to raise the detection sensitivity. As a result, a problem arises in that room noise, for example, is deemed to be input speech and is erroneously detected. Especially in the case where the speech is received through a conventional telephone, ambient noise (including room noise and the like) is liable to be mixed in because the telephone has no directivity. Distinguishing between input speech and ambient noise is therefore an important subject.
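The trade-off described above may be illustrated by a minimal sketch of a power-threshold detector. All signal levels, both threshold values, and the names used are hypothetical; they are chosen only so that a weak unvoiced consonant and room noise have comparable power, and they are not taken from any measured data.

```python
import math
import random

N = 160                 # assumed frame length in samples
rng = random.Random(0)  # fixed seed so the sketch is repeatable

def frame_power(frame):
    """Short-time power: mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def gaussian_noise(sigma):
    """Noise-like frame with the given standard deviation."""
    return [rng.gauss(0.0, sigma) for _ in range(N)]

# Hypothetical frames: a strong voiced phone, a weak unvoiced consonant,
# and room noise of nearly the same level as the consonant.
frames = {
    "vowel": [0.7 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(N)],
    "weak_fricative": gaussian_noise(0.05),
    "room_noise": gaussian_noise(0.04),
}

def detect(threshold):
    """Return the labels of the frames whose power exceeds the threshold."""
    return sorted(name for name, f in frames.items()
                  if frame_power(f) > threshold)

HIGH_THRESHOLD = 0.01   # assumed: misses the weak consonant
LOW_THRESHOLD = 0.001   # assumed: lowered to raise sensitivity

print(detect(HIGH_THRESHOLD))
print(detect(LOW_THRESHOLD))
```

Under these assumed levels, the higher threshold accepts only the vowel frame, whereas the lowered threshold accepts the weak consonant and the room noise alike, which is the erroneous detection described above.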