With recent advances in digital signal processing techniques, outdoor voice communication with mobile phones, hands-free voice communication in cars, and hands-free operation by voice recognition have become widely available. Because these apparatuses are often used in high-noise environments, background noise is input to the microphone together with the voice, which degrades both the quality of voice communication and the voice recognition rate. To achieve highly accurate voice recognition and comfortable voice communication, a noise suppression device that suppresses the background noise mixed into the input signal is required.
A conventional noise suppression method is disclosed in, for example, Non-Patent Literature 1. The conventional method includes converting a time-domain input signal into power spectra, which are a frequency-domain signal; calculating a suppression amount for noise suppression using the power spectra of the input signal and estimated noise spectra that are estimated separately from the input signal; performing amplitude suppression of the power spectra of the input signal using the suppression amount; and converting the amplitude-suppressed power spectra and the phase spectra of the input signal back into the time domain to obtain a noise-suppressed signal.
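The steps of the conventional method can be illustrated with a minimal single-frame sketch. It is not the method of Non-Patent Literature 1: the gain rule (a floored power-subtraction gain), the function names `dft`, `idft`, and `suppress_frame`, and the `floor` parameter are all assumptions chosen for illustration, and a naive DFT stands in for the FFT a practical device would use.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (time domain -> frequency domain)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse DFT; returns the real part of each time-domain sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def suppress_frame(frame, noise_power, floor=0.1):
    """Suppress noise in one frame.

    frame       -- time-domain samples of the input signal
    noise_power -- separately estimated noise power, one value per bin
    floor       -- hypothetical minimum gain (assumption, not from the source)
    """
    spectrum = dft(frame)
    suppressed = []
    for bin_value, n_pow in zip(spectrum, noise_power):
        power = abs(bin_value) ** 2        # power spectrum of the input
        phase = cmath.phase(bin_value)     # phase spectrum, kept unchanged
        # Assumed gain rule: 1 - noise/input power, limited from below.
        gain = max(1.0 - n_pow / power, floor) if power > 0.0 else floor
        # Apply the gain to the amplitude and recombine with the phase.
        suppressed.append(cmath.rect(gain * abs(bin_value), phase))
    return idft(suppressed)
```

With a zero noise estimate the frame passes through unchanged, and with a very large noise estimate every bin is attenuated to the floor gain. A practical device would additionally process overlapping windowed frames and update the noise estimate over time.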
According to the conventional noise suppression method, the suppression amount is calculated based on the ratio of the voice power spectra to the estimated noise power spectra (SN ratio). However, when the SN ratio takes a negative value (in decibels), a correct suppression amount cannot be obtained. For example, in a voice signal overlaid with car cruising noise that has high power in the low frequency region, the low frequency region of the voice is buried in the noise. In this case, the SN ratio becomes negative, and as a result the low frequency region of the voice signal is excessively suppressed, which degrades voice quality.
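A short numerical sketch may make the negative SN ratio concrete. The function name `snr_db` and the power values below are hypothetical, chosen only to show that the ratio in decibels turns negative as soon as the noise power in a band exceeds the voice power.

```python
import math

def snr_db(voice_power, noise_power):
    """SN ratio in decibels: 10 * log10(voice power / noise power)."""
    return 10.0 * math.log10(voice_power / noise_power)

# Hypothetical per-band powers (illustrative values, not measured data).
low_band = snr_db(voice_power=1.0, noise_power=10.0)   # voice buried in noise
high_band = snr_db(voice_power=10.0, noise_power=1.0)  # voice dominates

print(f"low band:  {low_band:.1f} dB")
print(f"high band: {high_band:.1f} dB")
```

In the low band the noise is ten times stronger than the voice, so the SN ratio is about -10 dB, which is the situation in which the conventional method suppresses the voice component too strongly.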
To solve the foregoing problem, a conventional method for generating and restoring a lost low frequency region signal is disclosed in, for example, Patent Literature 1. This conventional art discloses a voice signal processing apparatus that extracts some of the harmonic components of the fundamental frequency (pitch) signal of voice from an input signal, generates subharmonic components by multiplying the extracted harmonic components by two, and overlays the obtained subharmonic components on the input signal, thereby obtaining a voice signal with improved voice quality. By placing the voice signal processing apparatus in a stage subsequent to a noise suppression device, a noise suppression device with superior low frequency region components can be achieved.