1. Field of the Invention
The present invention relates to a technology that determines a speech segment included in an input signal.
2. Description of Related Art
In the related art, whether a speech signal is included in an input signal is determined mainly from the power of the signal, that is, the time average of the square of the amplitude of the signal. However, when the level of the signal itself (the overall scale of the signal) varies, the power varies with it, and it is difficult to accurately determine the speech segment based on the power of the signal alone.
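The level dependence described above can be illustrated with a minimal sketch. The function names, the frame length, the 200 Hz test tone, and the fixed threshold of 0.01 are all assumptions chosen for illustration and do not come from the related art itself. The sketch computes the power of a frame as the time average of the squared amplitude and shows that the same waveform at one tenth of the level yields one hundredth of the power, so a fixed threshold gives a different decision.

```python
import math


def frame_power(frame):
    """Time average of the squared amplitude over one frame."""
    return sum(x * x for x in frame) / len(frame)


def is_speech_by_power(frame, threshold):
    """Hypothetical detector: flag the frame when its power exceeds a fixed threshold."""
    return frame_power(frame) > threshold


# Illustrative input: a 200 Hz tone sampled at 8 kHz (160 samples = 4 full periods).
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(160)]

# The same waveform at one tenth of the level.
quiet_tone = [0.1 * x for x in tone]

THRESHOLD = 0.01  # assumed value, for illustration only

loud_decision = is_speech_by_power(tone, THRESHOLD)
quiet_decision = is_speech_by_power(quiet_tone, THRESHOLD)
```

Here `loud_decision` and `quiet_decision` differ even though the two frames have the same shape, which is the difficulty the paragraph describes.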
To address this, a method for determining a speech segment using spectral entropy obtained from the input signal is disclosed in the following document: J. Shen, J. Hung, and L. Lee, “Robust entropy-based endpoint detection for speech recognition in noisy environments”, ICSLP-98, 1998.
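As a rough illustration of why spectral entropy is less sensitive to the signal level, the following sketch uses the common textbook definition: the power spectrum of a frame is normalized so that it sums to one, and the entropy of the resulting distribution is computed. This is an assumption about the general technique and is not necessarily the exact procedure of the cited document. The function names, the naive DFT, the 64-sample frame, and the test signals are hypothetical choices made for the example.

```python
import cmath
import math


def power_spectrum(frame):
    """Power spectrum of a real frame via a naive DFT (bins 0 .. n/2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(frame):
    """Entropy (in nats) of the power spectrum normalized to sum to one."""
    spectrum = power_spectrum(frame)
    total = sum(spectrum)
    if total == 0.0:
        return 0.0  # assumption: an all-zero frame is given zero entropy
    probs = [s / total for s in spectrum]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Illustrative inputs: a tone exactly on DFT bin 4, the same tone at one tenth
# of the level, and a unit impulse, whose spectrum is flat across all bins.
n = 64
tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
quiet_tone = [0.1 * x for x in tone]
impulse = [1.0] + [0.0] * (n - 1)

h_tone = spectral_entropy(tone)
h_quiet = spectral_entropy(quiet_tone)
h_flat = spectral_entropy(impulse)
```

Because the normalization cancels any overall gain, `h_tone` and `h_quiet` agree up to rounding error, while the flat spectrum of the impulse gives a much larger entropy than the single-peak spectrum of the tone.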
However, when non-stationary noise, in which a power spectrum of a noise component varies with time, is included in the input signal, it is difficult to accurately determine the speech segment in real time.