A noise suppressor (noise suppressing system) is a system for suppressing noise superposed on a desired voice signal, and generally operates so as to suppress noise mixed in the desired voice signal by estimating the power spectrum of a noise component with an input signal converted to a frequency domain, and subtracting this estimated power spectrum from the input signal. The noise suppressor can be also applied to suppress irregular noise by continuously estimating the power spectrum of a noise component. The noise suppressor is, for example, a method which is adopted as a standard for a North American portable phone, and is disclosed in Non-Patent Document 1 (Technical Requirements (TR45). ENHANCED VARIABLE RATE CODEC, SPEECH SERVICE OPTION 3 FOR WIDEBAND SPREAD SPECTRUM DIGITAL SYSTEMS, TIA/EIA/IS-127-1, SEP, 1996), and Patent Document 1 (Japanese Patent Laid-Open No. 2002-204175).
A digital signal obtained by analog-digital (AD) converting of an output signal of a microphone for collecting a sound wave is normally delivered as an input signal to the noise suppressor. A high-pass filter is generally placed between an AD converter and the noise suppressor to mainly suppress a low frequency range component added when collecting a sound in the microphone and when AD-converting the sound. Such a configuration example is, for example, disclosed in Patent Document 2 (U.S. Pat. No. 5,659,622).
FIG. 1 illustrates such a structure in which the noise suppressor of Patent Document 1 is combined with the high-pass filter of Patent Document 2.
A noisy speech signal (a signal in which a desired voice signal and noise are mixed) is delivered to input terminal 11 as a sample value series. A noisy speech signal sample is delivered to high-pass filter 17, and is delivered to frame divider 1 after a low frequency range component thereof is suppressed. It is absolutely necessary to suppress the low frequency range component for maintaining a linearity of the input noisy speech, and realizing sufficient signal processing performance. Frame divider 1 divides the noisy speech signal sample into frames whose unit is a specific number, and transfers the frames to window processor 2. Window processor 2 multiplies the noisy speech signal sample divided into frames by a window function, and transfers the result to Fourier transformer 3.
Fourier transformer 3 Fourier-transforms the window-processed noisy speech signal sample to divide the signal sample into a plurality of frequency components, and multiplex an amplitude value to deliver the plurality of frequency components to estimated noise calculator 52, noise suppression coefficient generator 82, and multiplexed multiplier 16. A phase is transferred to inverse Fourier transformer 9. Estimated noise calculator 52 estimates the noise for each of the plurality of delivered frequency components, and transfers the noise to noise suppression coefficient generator 82. An example of a method for estimating noise is such a method in which a noisy speech is weighted with a past signal-to-noise ratio to be designated as a noise component, and the details are described in Patent Document 1.
Noise suppression coefficient generator 82 generates a noise suppression coefficient for obtaining enhanced voice in which noise is suppressed for each of the plurality of frequency components by multiplying the noisy speech by the estimated noise. As an example for generating the noise suppression coefficient, a least mean square short time spectrum amplitude method for minimizing an average square power of the enhanced voice is widely used, and the details are described in Patent Document 1.
The noise suppression coefficient generated for each frequency is delivered to multiplexed multiplier 16. Multiplexed multiplier 16 multiplies, for each frequency, the noisy speech delivered from Fourier transformer 3 by the noise suppression coefficient delivered from noise suppression coefficient generator 82, and transfers the product to inverse Fourier transformer 9 as an amplitude of the enhanced voice. Inverse Fourier transformer 9 performs inverse-Fourier-transformation by combining the enhanced voice amplitude delivered from multiplexed multiplier 16 and the phase of the noisy speech, the phase being delivered from Fourier transformer 3, and delivers the inverse-Fourier-transformed signal to frame synthesizer 10 as an enhanced voice signal sample. Frame synthesizer 10 synthesizes an output voice sample of the corresponding frame by using the enhanced voice sample of an adjacent frame to deliver the synthesized sample to output terminal 12.