In voice recognition techniques in which a vocal section to be a target of voice recognition is detected from a vocal sound of a person to be recognized, and a word uttered in the vocal section is recognized, several techniques for suppressing the influence of noise have been known.
For example, as a first technique, a technique has been known in which a threshold value of voice power used for determining a vocal section is adaptively changed, so that noise is not mistakenly detected as a vocal section and only a vocal sound of a person to be recognized is detected as a vocal section.
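The adaptive threshold of the first technique can be illustrated by a minimal sketch. The source does not specify how the threshold is adapted, so the running noise estimate, the fixed margin, and the names `detect_vocal_frames`, `margin_db`, `alpha`, and `initial_noise_db` are all assumptions made for illustration only.

```python
def detect_vocal_frames(frame_powers_db, margin_db=6.0, alpha=0.9,
                        initial_noise_db=None):
    """Return one boolean per frame: True if the frame is judged vocal.

    Assumed scheme (not from the source): the threshold is a running
    noise estimate plus a fixed margin, and the noise estimate is
    updated only on frames judged non-vocal.
    """
    if not frame_powers_db:
        return []
    # Hypothetical choice: seed the noise estimate with the first frame.
    noise_db = frame_powers_db[0] if initial_noise_db is None else initial_noise_db
    flags = []
    for power_db in frame_powers_db:
        threshold_db = noise_db + margin_db  # threshold follows the noise level
        is_vocal = power_db > threshold_db
        flags.append(is_vocal)
        if not is_vocal:
            # Exponential smoothing of the noise estimate on non-vocal frames.
            noise_db = alpha * noise_db + (1.0 - alpha) * power_db
    return flags
```

With this sketch, a frame of the same absolute power may be judged vocal when the estimated noise level is low and non-vocal when it is high, which is the behavior the adaptive threshold is meant to provide.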
Also, for example, as a second technique, a technique has been known in which word matching is performed using normalized power of a vocal sound of a person to be recognized so that misrecognition caused by noise is suppressed.
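As a rough illustration of matching on normalized power, the following sketch normalizes a power contour by its peak value and selects the nearest word template. The source does not state which normalization or distance is used; peak normalization, the squared-error distance, the equal-length contours, and the names `normalize_power` and `match_word` are assumptions.

```python
def normalize_power(powers):
    """Scale a linear power contour so that its peak equals 1.0."""
    peak = max(powers)
    if peak <= 0:
        return [0.0 for _ in powers]
    return [p / peak for p in powers]


def match_word(powers, templates):
    """Return the template word whose normalized contour is closest.

    templates: dict mapping a word to a power contour of the same
    length as `powers` (a simplifying assumption of this sketch).
    """
    normalized = normalize_power(powers)
    best_word, best_distance = None, float("inf")
    for word, template in templates.items():
        template_normalized = normalize_power(template)
        # Squared-error distance between the two normalized contours.
        distance = sum((a - b) ** 2
                       for a, b in zip(normalized, template_normalized))
        if distance < best_distance:
            best_word, best_distance = word, distance
    return best_word
```

Because both contours are scaled to the same peak, an overall change in level between the input and the template does not affect the match in this sketch.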
Also, for example, as a third technique, a technique has been known in which word matching is performed using a ratio of vowels to consonants in a vocal section so that misrecognition caused by noise is suppressed.
Also, a technique has been known that makes it possible to exclude the influence of non-stationary noise on noise level estimation, in which a noise level in an audio signal is estimated based on power information related to a partial distribution that is taken out from the power distribution of the signal in accordance with the power of maximum frequency of occurrence in that power distribution.
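One possible reading of this estimation is sketched below: a histogram of frame powers is built, the most frequently occurring power bin is located, and the noise level is taken as the mean of the frames in a narrow window around that bin. The histogram bin width, the window size, the use of the mean, and the name `estimate_noise_level` are assumptions; the source does not give these details.

```python
from collections import Counter


def estimate_noise_level(frame_powers_db, bin_width=1.0, half_width_bins=2):
    """Estimate the noise level from the part of the power distribution
    around its most frequent power bin (assumed interpretation)."""
    if not frame_powers_db:
        raise ValueError("at least one frame power is required")
    # Assign each frame power to a histogram bin.
    bins = [int(p // bin_width) for p in frame_powers_db]
    histogram = Counter(bins)
    # Most frequent bin; ties are resolved toward the lower power.
    mode_bin = min(histogram, key=lambda b: (-histogram[b], b))
    # Partial distribution: frames within a few bins of the mode.
    selected = [p for p, b in zip(frame_powers_db, bins)
                if abs(b - mode_bin) <= half_width_bins]
    return sum(selected) / len(selected)
```

In this sketch, short high-power bursts fall outside the selected window and therefore do not raise the estimate, whereas a plain average over all frames would be pulled upward by them.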
Also, a technique has been known that makes it possible to detect an optimum voice section, in which a plurality of sets of threshold values, which are applied when a parameter used for detecting a vocal section is obtained from an input signal, are provided, and an optimum set of threshold values is selected in accordance with a signal-to-noise ratio of the input signal.
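The selection of a threshold set in accordance with the signal-to-noise ratio can be sketched as a simple table lookup. The table `THRESHOLD_SETS`, its SNR boundaries, the threshold names `start_db` and `end_db`, and all numeric values are hypothetical and are not taken from the source.

```python
import math

# Hypothetical table: (minimum SNR in dB, set of threshold values),
# ordered from the cleanest condition to the noisiest.
THRESHOLD_SETS = [
    (20.0, {"start_db": 6.0, "end_db": 3.0}),
    (10.0, {"start_db": 9.0, "end_db": 5.0}),
    (float("-inf"), {"start_db": 12.0, "end_db": 8.0}),
]


def estimate_snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in dB from linear signal and noise powers."""
    return 10.0 * math.log10(signal_power / noise_power)


def select_threshold_set(snr_db, threshold_sets=THRESHOLD_SETS):
    """Return the first threshold set whose minimum SNR is satisfied."""
    for min_snr_db, thresholds in threshold_sets:
        if snr_db >= min_snr_db:
            return thresholds
    raise ValueError("no threshold set covers the given SNR")
```

For example, `select_threshold_set(estimate_snr_db(100.0, 1.0))` picks the set assigned to the cleanest condition in this hypothetical table, while a low SNR falls through to the most conservative set.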